Linux, MySQL

How to restore debian-sys-maint MySQL Maria user

I had an issue where I wiped this user out accidentally in a galera cluster on Ubuntu 14.04. This is how to add ‘debian-sys-maint’@’localhost’ user back into your MySQL server and then enjoy all the ‘benefits’ once more..

You can then verify the user with SELECT * from mysql.user\G

Make sure that the password in /etc/mysql/debian.conf matches the password below as THE_PASSWORD.

use mysql;
INSERT INTO `user` (
	`Host`,
	`User`,
	`Password`,
	`Select_priv`,
	`Insert_priv`,
	`Update_priv`,
	`Delete_priv`,
	`Create_priv`,
	`Drop_priv`,
	`Reload_priv`,
	`Shutdown_priv`,
	`Process_priv`,
	`File_priv`,
	`Grant_priv`,
	`References_priv`,
	`Index_priv`,
	`Alter_priv`,
	`Show_db_priv`,
	`Super_priv`,
	`Create_tmp_table_priv`,
	`Lock_tables_priv`,
	`Execute_priv`,
	`Repl_slave_priv`,
	`Repl_client_priv`,
	`Create_view_priv`,
	`Show_view_priv`,
	`Create_routine_priv`,
	`Alter_routine_priv`,
	`Create_user_priv`,
	`ssl_type`,
	`ssl_cipher`,
	`x509_issuer`,
	`x509_subject`,
	`max_questions`,
	`max_updates`,
	`max_connections`,
	`max_user_connections`
)
VALUES (
	'localhost',
	'debian-sys-maint',
	password('THE_PASSWORD'),
	'Y',
	'Y',
	'Y',
	'Y',
	'Y',
	'Y',
	'Y',
	'Y',
	'Y',
	'Y',
	'Y',
	'Y',
	'Y',
	'Y',
	'Y',
	'Y',
	'Y',
	'Y',
	'Y',
	'Y',
	'Y',
	'N',
	'N',
	'N',
	'N',
	'N',
	'',
	'',
	'',
	'',
	0,
	0,
	0,
	0
);
FLUSH PRIVILEGES;
Databases, MySQL

MySQL

Work in Progress

The basic MySQL component architecture:



Locking

Read locks on a resource are shared, mutually nonblocking. Many clients can read a resource simultaneously and don’t interfere with each other.
Write locks are exclusive – they block read and write locks.

Each lock operation has an overhead. Lock management is important in storage engine design.

Table Locks

Basic and with the lowest overhead. This locks the whole table. If a client wishes to write to a table (insert, delete, update), it obtains a write lock. This keeps all other read and write operations in a queue. When there are no write locks, clients can set read locks and, as aforementioned, don’t conflict with other client read locks. READ LOCAL table locks allow concurrent write operations of some sorts.

Write locks have a higher priority than read locks; a request for a write lock will jump to the front of the lock queue if read locks are already queued. ALTER TABLE will always use a table lock, regardless of which storage engine is in place.

Row Locks

This type of lock offers the best concurrency but carries the highest overhead. Row-level locks are available within InnoDB and Falcon storage engines, but not exclusive to. Row locks are part of the storage engine, not the server. The server is unaware of storage engine locks and each storage engine implements locks in different ways.

Deadlocks

When multiple transactions are holding or requesting locks on the same resources creating a cycle of dependencies. Deadlocks generally occur if transactions try to lock resources in a different order. InnoDB can detect and error return on a circular dependency thus preventing what appears to be slow query times.

Transaction Logging

Instead of constantly updating the disk tables on each change request, the storage engines change the table data in RAM. It will then write the change records to the transaction log, this makes for more sequential disk IO rather than completely random and thus decreases seek and access time to the disk hardware. Disk writes are usually twice as expensive but if there’s a crash, the transaction log is usually complete and will provide a recovery path.

Transactions

A transaction is a group of SQL queries that are treated as a single unit of work. If one command fails, none of them are run.

The Storage Engines

MyISAM
This is MySQL’s default storage engine. It provides a good compromise between performance and offers full-text indexing, compression, GIS etc. It does NOT support transactions or row-level locks.
MyISAM typically stores each table in two files: data and index. The format is platform independent and is therefore portable between system architectures.
MyISAM has entire table locking. Clients obtain read locks on all tables required for read actions. Writes obtain exclusive write locks. It is possible to insert new rows into the table while select queries are running, though.

CHECK TABLE and REPAIR TABLE commands can manually check/repair table errors. Myisamchk command-line tools can do the same offline.
Indexing is possible on the first 500 characters of TEXT and BLOB columns in these tables. Full text indexing is supported which will allow complicated search options straight in the database.
Delayed Key Writes – tables marked with DELAY_KEY_WRITE create option don’t write changing index data to disk after each query. As mentioned before, this will store changes in the in-memory key buffer and flush blocks to disk once the buffer is full or the table is closed.
This can increase performance on heavily used tables that have a lot of write operations. Indexes will usually be corrupted after a server crash and need repaired. Scripting with myisamchk or the other commands can satisfy this requirement.

MyISAM Merge
The combination of multiple MyISAM tables into one virtual table.

InnoDB
The most popular engine for transactional storage. Performance and automatic crash recovery make it a popular choice for most needs.

InnoDB stores data in aseries of datafiles known as a tablespace. InnoDB manages this tablespace itself and InnoDB can store each table data and indexes into separate files, it can also use raw disk partitions for building tablespace.

InnoDB achieves a high level of concurrency and implements all four SQL isolation levels.
InnoDB provides fast primary key lookups but non primary indexes contain the primary column, so it’s a good idea to keep the primary key lengths short. InnoDB does not compress indexes.

** Scalability issues with concurrency tuning ***

InnoDB includes predictive read-ahead data fetching from disk, an adaptive has that builds hash indexes in memory for fast lookups and an insert buffer.

Memory Engine

HEAP tables can be used for fast data access to static data. The can be 10-100x faster than MyISAM and all data is stored in RAM thus eliminating disk IO. The data is lost between server restarts even though the table structure is kept.

According to some research, good uses for memory tables:

  • Lookup or mapping tables such as a table that maps postal codes to state names.
  • Caching the results of periodically aggregated data
  • Intermediate results when analyzing data

Memory tables are supportive of hash indexing which can be really quick.
Memory tables use table locking therefore have a low write concurrency. They are unsupportive of TEXT or BLOB column types and only support fixed-size rows so they store VARCHARS as CHARS which is wasteful of memory in some scenarios.

MySQL uses memory engine for intermediate results that aren’t too large for a memory table. Otherwise MyISAM is used.

Archive Engine
This only supports INSERT and SELECT queries and doesn’t support indexing. Great for logging and data mining or where fast insert queries are required – on a replication master. Replication slaves can use differing storage engines for the same data tables, which is nice.

Archive also supports row locking and has a buffer for high-concurrency inserts. It stops a SELECT statement after it collected the number of raws that existed in the table at the start of the query. It’s basically a storage engine that’s optimized for high speed inserting and compressed storage; funnily, archive use.

CSV
Treats CSV files as tables without indexing. Allows for in/out server file transactions while the server is hot. Can be used for logging and for data import/exporting.

Federation
Data is not stored locally. Each federated table refers to a table on a remote MySQL server. Network connectivity is used for all options. Can be useful for single-row lookups by primary key or for INSERT queries affecting a remote server. It generally does not perform well.

Blackhole Engine
No storage. Every insert is discarded  but all queries are written to logs as usual. Audit/replication purposes.

NDB Cluster
An old Sony Ericsson engine. Originally designed for HA and performance. It’s quite similar to RAID for disk block tech. It stores parity copies of data fragments across nodes. The equivalent of one physical server is dedicated to each node for redundancy.
Management nodes retrieve central configuration and monitor/control the cluster. All servers intercommunicate and  geographically separate networks are not recommended for this (low latency is good).
NDB performs joins at the MySQL server level and not at the storage engine. All NDB data must be obtained over the network and can result in slow complex joins. Single table lookups are very fast because each node provides fractional responses.

Generally not used for traditional applications.

Falcon
MVCC. Keeps transactions wholly in memory. Rollbacks and recoveries can be very fast.
*** I need to read more about the Falcon engine 2010 ***

solidDB

PBXT (Primebase XT)

Maria

Selecting a Storage Engine

Granular locking?
Transactions?
Chosen on a table by table basis!

Transactions
InnoDB one of the most stable and industry-proven. MyISAM good if no transactions and logging works well here.

Concurrency
INSERT/READ = MyISAM
Mixture = Row level locking capable engine for the win.

Backups
Using multiple storage engines can increase the complexity of performance tuning and server backups (hot backups).


Administration

mysqladmin – Create/delete databases/password management.
mysqldump – Create a backup of a database. It will dump to an .sql command file.
MySQL Administrator (deprecated)
MySQL Query Browser (deprecated)
phpmyadmin
MySQL Workbench (replaces the deprecated apps too)

MySQL HA

Fault tolerant architecture, avoiding single points of failure and fail-over/fail-back.

Data sharding/partitioning – Horizontally partition data over nodes; only a certain # of the user-base affected by failures.

MySQLProxy?

MySQL Replication
GOOD: Easy to implement. Low performance impact.
BAD: Asynchronous. Failback can be difficult. Needs additional components.

MySQL with DRBD/ZFS/AVS
GOOD: Easy, low cost. Synchronous.
BAD: Certain performance impact. Limited datasize (by licensing for DRBD) and transaction rates.

MySQL with shared storage
GOOD: Performance. Hardware management.
BAD: Cost of implementing storage networks.

MySQL Cluster
GOOD: Very good performance. Self contained. Short failover time. Software only.
BAD: Needs multiple physical servers. Not optimized for all applications.

Master/Master / Multi-Master
Master/Master with two active nodes would give a seamless switch with a good load balancer.
Gives ability to perform schema changes online.
Higher performance AND availability with both nodes up.
Can limit the use of production slaves.
One application entry point with load balancing.

MySQL Cluster
HA if correctly implemented.
Complicated.

Linux HA
ARP grab of virtual IP will cause a momentary drop in service.
Shared storage possible with SAN.
One instance, two systems. One active, one standby.

MySQL Business Continuity

Depends on the HA solution!

Classic way is to use mysqldump. It will return an SQL file full of commands for regenerating the tables and data. Considered rather slow method of backing up but is extremely portable. Can be used for migrations.

Operating system level backups. MySQL needs to be stopped to ensure data consistency. mysqlhotcopy can do it while online, but it will cause a read lock and can only operate when there are no current write locks; and has to be run on the database system.

Replication: DRBD, slaves and so on. Any backup operations could run on a slave or a ‘copy’.

MySQL can’t do hot backups without table locking (NEED TO MAKE SURE THIS IS STILL TRUE). InnoDB tables with InnoDB Hot Backup *can*, however.


Caching.
Detach components and build a modular and scalable design.
Plan for replication and/or sharding.
Append EXPLAIN before SELECTS where possible.
EXPLAIN returns the execution plan chosen by the optimizer.

Generic System Bottlenecks

  • Disk seek
  • Disk read/write throughput
  • CPU cycles
  • Memory bandwidth

Benchmarking
Before tweaking or doing anything to the system, it’s important to have a good benchmarking and testing plan in place. You need to get a baseline performance meter and iterate your tests between any changes to show any differences. This is the obvious part.

Write good documentation that allows for change request (:P) tracking and performance delta. It’s good to have a nice development environment that you can also develop your tests and processes without being attributed to random Internet factors. Of course, live performance monitoring is important but this is benchmarking against a change request and should reasonably be done within a QA environment.

Whilst this contains information about generic web and database testing, it’s also important to consider and develop methods of testing your proprietary application too. Everything that can, should be automated.

When benchmarking your database systems, take a record of everything.
Dump schema
MySQL config files
Operating system configurations.
Iterate your tuning step-by-step and have a complete lifecycle for your analysis.

Reasons to benchmark

  • Measure how the application performs.
  • Validate scalability by simulating high load (100-1000x user increase).
  • Plan for growth – hardware, network capacity and project future load/cost.
  • Test application tolerance of environmental changes – Concurrency spike, configuration management (soft/hard), storage engine changes.

Two primary benchmark strategies are known as full-stack and single-component benchmarking. I.e. application suite as a whole or just MySQL.

Full-Stack
Testing the entire environment, inclusive of web server, proprietary application and the database. This is generally the most complete picture of what affects the customer experience.
MySQL is not always the problem.
The full stack can help identify each component’s cache behavior.

Single-Component
Compare schema/query changes
Benchmark specific problems
Change management in a shorter time

Metrics

Transactions per time unit
Classic database benchmark. TPC-C is a standardized benchmark. Sometimes referred to as throughput but online transaction processing (OLTP) performance is considered very suitable for interactive multiuser applications. The unit is transactions per second.

Latency
Measurement of total time a task requires. The time unit selection is dependent on the application. Extrapolation of minimum, average and maximum response times/latency is the common use.
95% percentile of 5ms, task finishes in less than 5ms 95% of the time. Percentile response times are the most common way of measuring maximum response time.
Graph these benchmark results as line or scatter plot and use a time period. It’s a good idea to monitor these regularly and alert on them.
One example I read about was a situation where a checkpoint is run hourly, the 95th percentile response does not show the spikes and the results hide the problem of stalling system/no completed transactions during the checkpoint. Graphing the results will show this pattern based spike.

Scalability
Performance is generally measured as a throughput/response time combination and workload varies with changing of user counts, concurrency, size and configuration (hard/soft). It’s pretty much an arbitrary metric but can give educated guessing and analysis for capacity planning against a company success projection. It can also show design weaknesses.

Benchmarking for concurrency increases against a consistent response time would show a design flaw in scalability for the purposes of concurrency.
Batch jobs can cause interactive queries to suffer, etc.

Concurrency
When measuring a web server, concurrency is a used metric but HTTP is a stateless connection and it isn’t really considered concurrency for the most part. A database concurrent hit != http server hit count. For the web server, measurement of requests per second the users generate at peak time is a good measurement of web server concurrency.

For example: A high number of fairly static web page hits will cause a lower number of concurrent database connections than say PHP/Ruby with persistent connections.

What each concurrent connection is doing is also very important. Having an established connection to the database server but running no or little quantities of queries is a possible scenario. The important metric is called working concurrency. This is the number of threads that are doing work simultaneously. Measurement of performance drops at higher working-concurrency rates is key.

MySQL can limit concurrency to the server base with techniques such as application queuing. Concurrency is completely different from latency and scalability: it’s a property of how you set up the benchmark. Measure the application’s performance at various levels of concurrency versus simply the concurrency rate that the application can achieve.

Performance is important based upon what the proprietary application does. That may be an interactive website with forms, a game, a forum.. the list goes on.

Tools

SysBench – http://sysbench.sourceforge.net/

SysBench is a tool to benchmark the system performance. It allows you to emulate/’evaluate’ OS scenarios that can give a realistic heavy database load. It kinda gives an easy way to benchmark the capabilities of a system without having to create complicated real-case procedures within your database software.

This will allow you to test:

  • File I/O performance
  • Scheduler performance
  • Memory allocation and transfer speed
  • POSIX threads implementation performance
  • OLTP benchmark – database server performance

The OLTP test will create tables within the MySQL server and fill it with records. It can then create client threads and benchmark requests to an N quantity.

mysqlslap – http://dev.mysql.com/doc/refman/5.1/en/mysqlslap.html

MySQLslap will emulate client load for the MySQL server and report timings for each stage. It works as if multiple clients are accessing the server and is available as of MySQL 5.1.4. This tool is kinda like the Apache benchmarks.

The user can pass parameters that will contain SQL statements (or a file containing).
mysqlslap runs in three stages:

Create schema, table, and optionally any stored programs or data you want to use for the test. This uses a single client connection.
Run the load test. This stage can use many client connections.
Clean up (disconnect, drop table if specified). This stage uses a single client connection.

ab – Apache Bench – http://httpd.apache.org/docs/2.0/programs/ab.html

When using a database server such as MySQL, the primary use would be for web applications/sites. It’s a good idea to consider benchmarking your web server within the routine of your database. One running great and another in a poorly state will be no use, so I’ve included ab.

This will allow you to show mainly how many reqs/sec Apache will be capable of serving.

http_load

This tool is similar to ab. You can create an input file with many URLs and it will load them at random. It can also execute based upon a time interval you specify.

JMeter

Java app that will allow the user to load another application and measure plus analyze the performance. It includes graphing and a GUI with the ability to record/replay results offline.

Database Test Suite

Opensource test suite used for implementing TPC-C OLTP testing.

SuperSmack – http://www.vegan.net/tony/supersmack/

Super Smack is a benchmarking, stress testing and load generation tool for MySQL (it handles PostgreSQL too).
It is possible to run Super Smack against a database on the same system (either through local socket or TCP/IP, or remotely over TCP/IP.
For network capable applications, it’s normally benificial to run the benchmarks remotely so that the load imposed on the database system is only from the database workload itself, and not the load created by running the actual benchmark.
According to Tony Bourke (www.vegan.net), there are some considerations:
TCP/IP versus Socket
With MySQL, there’s about a 25% performance gain in terms of queries per second when running through the local sockets compared to TCP/IP. You can test the comparison by utilizing the local IP versus the loopback IP.
Network Overload
Super Smack creates simple queries and each average packet size is approximately 100 bytes.

mybench – http://jeremy.zawodny.com/mysql/mybench/

Mybench is a benchmarking framework written in Perl. It comes with an example script that needs to be customized to your use. It seems rather simplistic but it’s nice to not have to reinvent the wheel.
It’s also referenced to within the  O’Reilly Media High Performance MySQL book.

Profiling

Profiling is generally done within a productive operating environment. Real-time and past data analysis are two different but equally important steps.
Out-with the SQL environment:
Memory
CPU
Disk IO
Network and system health

.. are simple but key contributing factors to performance degradation.

The Tools

mysqltuner – http://blog.mysqltuner.com/

MySQLtuner is a Perl script that provides a picture of MySQL’s health. It will also recommend how to increase efficiency and performance. It’s pretty automated, a kinda robot DBA.

>> MySQLTuner 0.9.0 - Major Hayden
>> Bug reports, feature requests, and downloads at http://mysqltuner.com/
>> Run with '--help' for additional options and output filtering
Please enter your MySQL administrative login: secret
Please enter your MySQL administrative password: more secret

-------- General Statistics --------------------------------------------------
[OK] You have the latest version of MySQLTuner
[OK] Currently running supported MySQL version 5.0.32-Debian_7etch4-log
[OK] Operating on 32-bit architecture with less than 2GB RAM

-------- Storage Engine Statistics -------------------------------------------
[--] Status: +Archive -BDB -Federated -InnoDB -ISAM -NDBCluster
[--] Data in MyISAM tables: 71M (Tables: 548)

-------- Performance Metrics -------------------------------------------------
[--] Up for: 109d 22h 8m 1s (69M q [7.283 qps], 1M conn, TX: 3B, RX: 3B)
[--] Reads / Writes: 98% / 2%
[--] Total buffers: 2.6M per thread and 58.0M global
[OK] Maximum possible memory usage: 320.5M (62% of installed RAM)
[OK] Slow queries: 0% (39/69M)
[OK] Highest usage of available connections: 63% (63/100)
[OK] Key buffer size / total MyISAM indexes: 16.0M/65.0K
[OK] Key buffer hit rate: 100.0%
[OK] Query cache efficiency: 80.6%
[!!] Query cache prunes per day: 30287
[OK] Sorts requiring temporary tables: 0%
[!!] Temporary tables created on disk: 59%
[OK] Thread cache hit rate: 99%
[!!] Table cache hit rate: 0%
[OK] Open file limit used: 12%
[OK] Table locks acquired immediately: 99%

-------- Recommendations -----------------------------------------------------
General recommendations:

 

  • Enable the slow query log to troubleshoot bad queries
  • When making adjustments, make tmp_table_size/max_heap_table_size equal
  • Reduce your SELECT DISTINCT queries without LIMIT clauses
  • Increase table_cache gradually to avoid file descriptor limits

Variables to adjust:

  • query_cache_size (> 16M)
  • tmp_table_size (> 32M)
  • max_heap_table_size (> 16M)
  • table_cache (> 64)

 

SHOW

SHOW PROCESSLIST | STATUS | INNODB STATUS

Show processlist will display which threads are running. PROCESS privilege is required to see all threads. Utilizing the FULL keyword will show more than the first 100 characters of each statement.
Generally, this statement is useful in cases of the ‘too many connections’ error message. SUPER privilege users should always be able to connect and check the system. Threads can be killed with a KILL statement.

An example of SHOW PROCESSLIST would be:

mysql> SHOW FULL PROCESSLIST\G
*************************** 1. row ***************************
Id: 1
User: system user
Host:
db: NULL
Command: Connect
Time: 1030455
State: Waiting for master to send event
Info: NULL


mytop - http://jeremy.zawodny.com/mysql/mytop/
Mytop is a console app for monitoring threads and overall performance of MySQL. It's like top for *nix but gives a real-time view of the database system. Lets you monitor queries.

  MySQL on localhost (3.22.32)              up 3+23:14:20 [23:54:52]
   Queries Total: 617            Avg/Sec: 0.00  Now/Sec: 0.05  Slow: 0
   Threads Total: 1         Active: 1     Cached: 0
   Key Efficiency: 88.38%  Bytes in: 0  Bytes out: 0

Slow Query Log
 log_slow_queries=/var/lib/mysql/slow-queries.log
 long_query_time=2 Use mysqldumpslow.
 Can log directly to a table and doesn't require a server restart.
 SET GLOBAL SLOW_QUERY_LOG = { ON | OFF }
 http://dev.mysql.com/doc/refman/5.1/en/logtables.html
 I need to get more familiar with EXPLAIN.
 MyTop will also help catch locking and long-running queries in real-time. This will allow you to create simple scripts to manage these... kinda like when Perforce experiences heavy database locking. You can script automatic kills, although it's not highly recommended with Perforce (I imagine MySQL too). Common Problems Poor or non-existent indexing
 Inefficient or bloated schema design
 Bad coding practices
 Server variables not tuned correctly
 Bottlenecks as mentioned previously Guidelines (according to MySQL.com)
 Poor or missing index is the fastest way to kill a system
 Use the smallest data types necessary
 Normalize first

MySQL Monitoring

Spotlight on MySQL – http://www.quest.com/spotlight-on-mysql/

Spotlight® on MySQL diagnoses MySQL problems in real time by graphically displaying all database activity in an intuitive user interface, enabling you to respond quickly to issues that need attention.

With Spotlight on MySQL, DBAs or developers running Windows can drill down to locate in-depth information on the source of a performance issue. And because the run-time nature of your MySQL databases is unique, Spotlight calibrates itself to a baseline of normal database activity.

This allows Spotlight to automatically establish the speed of the internal flows and other activity indicators. When a threshold is met, Spotlight notifies you using a visual or audible alert.

Spotlight on MySQL supports MySQL versions 5.0.3 or later that employ the InnoDB storage engine.

Features and Benefits

Spotlight® on MySQL

Complete visual representation of the MySQL database allowing you to observe actual activity, including process flows and configuration settings in real time from a single display

  • Quickly identifies performance bottlenecks and problem areas in complex data sharing environments using flows, graphs and visual icons
  • Alerts you, with an audio or visual alarm, when key metric thresholds are triggered and if any MySQL component has exceeded a critical threshold before the issues impacts end users
  • Reveals essential details of problem areas for quick diagnosis
  • Drills down on operating system metrics like CPU, I/O, memory and processes Alerts DBAs with an audio or visual alarm if any MySQL component has exceeded a critical threshold

Best-ish case would be on a 15k SAS 6gb/s at approx 3.4ms seek. Would allow an average of 290.4/second