- Any index that doesn't span all `AND' levels in the `WHERE' clause is
- not used to optimize the query. In other words: To be able to use an
- index, a prefix of the index must be used in every `AND' group.
- The following `WHERE' clauses use indexes:
- ... WHERE index_part1=1 AND index_part2=2 AND other_column=3
- ... WHERE index=1 OR A=10 AND index=2 /* index = 1 OR index = 2 */
- ... WHERE index_part1='hello' AND index_part3=5
- /* optimized like "index_part1='hello'" */
- ... WHERE index1=1 and index2=2 or index1=3 and index3=3;
- /* Can use index on index1 but not on index2 or index3 */
- These `WHERE' clauses do *NOT* use indexes:
- ... WHERE index_part2=1 AND index_part3=2 /* index_part1 is not used */
- ... WHERE index=1 OR A=10 /* Index is not used in both AND parts */
- ... WHERE index_part1=1 OR index_part2=10 /* No index spans all rows */
- Note that in some cases *MySQL* will not use an index, even if one
- would be available. Some of the cases where this happens are:
- * If using the index would require *MySQL* to access more than
- 30% of the rows in the table. (In this case a table scan is
- probably much faster, as it requires far fewer seeks.) Note that
- if such a query uses `LIMIT' to retrieve only part of the rows,
- *MySQL* will use an index anyway, as it can much more quickly
- find the few rows to return in the result.
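As a hedged illustration (the table and column names here are hypothetical, not from the manual), a query that would normally trigger a table scan can still use an index once `LIMIT' is added:

```sql
-- Hypothetical table `orders` with a non-unique index on `status`.
-- If most rows have status='open', this likely becomes a table scan:
SELECT * FROM orders WHERE status='open';

-- With LIMIT, the index on `status` can be used anyway, since only
-- a few matching rows need to be located:
SELECT * FROM orders WHERE status='open' LIMIT 10;
```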
- Speed of Queries that Access or Update Data
- ===========================================
- First, one thing that affects all queries: the more complex your
- permission system setup, the more overhead you get.
- If you have not issued any `GRANT' statements, *MySQL* will optimize
- the permission checking somewhat. So at very high volume it may be
- worth the time to avoid grants; otherwise, more permission checks
- result in larger overhead.
- If your problem is with some explicit *MySQL* function, you can always
- time this in the *MySQL* client:
- mysql> select benchmark(1000000,1+1);
- +------------------------+
- | benchmark(1000000,1+1) |
- +------------------------+
- | 0 |
- +------------------------+
- 1 row in set (0.32 sec)
- The above shows that *MySQL* can execute 1,000,000 `+' expressions in
- 0.32 seconds on a `PentiumII 400MHz'.
- All *MySQL* functions should be highly optimized, but there may be
- some exceptions, and `benchmark(loop_count,expression)' is a great
- tool to find out if this is a problem with your query.
- Estimating Query Performance
- ----------------------------
- In most cases you can estimate the performance by counting disk seeks.
- For small tables, you can usually find a row in one disk seek (as the
- index is probably cached). For bigger tables, you can estimate that
- (using B+ tree indexes) you will need: `log(row_count) /
- log(index_block_length / 3 * 2 / (index_length + data_pointer_length)) +
- 1' seeks to find a row.
- In *MySQL* an index block is usually 1024 bytes and the data pointer is
- usually 4 bytes. A 500,000 row table with an index length of 3 (medium
- integer) gives you: `log(500,000)/log(1024/3*2/(3+4)) + 1' = 4 seeks.
- As the above index would require about 500,000 * 7 * 3/2 = 5.2M,
- (assuming that the index buffers are filled to 2/3, which is typical)
- you will probably have much of the index in memory and you will probably
- only need 1-2 calls to read data from the OS to find the row.
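As a quick sanity check on the arithmetic above, the seek estimate and the index size can be computed directly (a minimal sketch; the constants are exactly the ones quoted in the text):

```python
import math

def estimated_seeks(row_count, index_block_length, index_length,
                    data_pointer_length):
    """Seek estimate from the formula above:
    log(row_count) / log(block_length * 2/3 / (key + pointer)) + 1."""
    fanout = index_block_length / 3 * 2 / (index_length + data_pointer_length)
    return math.log(row_count) / math.log(fanout) + 1

# 500,000 rows, 1024-byte index blocks, 3-byte medium-integer key,
# 4-byte data pointer -- the example from the text.
seeks = estimated_seeks(500_000, 1024, 3, 4)
print(round(seeks))  # 4 seeks, matching the text

# Index size: row_count * (key + pointer) * 3/2, assuming the index
# buffers are filled to 2/3 as stated above.
size_bytes = 500_000 * (3 + 4) * 3 / 2
print(size_bytes)  # 5250000.0 bytes, i.e. roughly the 5.2M quoted
```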
- For writes, however, you will need 4 seek requests (as above) to find
- where to place the new index and normally 2 seeks to update the index
- and write the row.
- Note that the above doesn't mean that your application will slowly
- degenerate by N log N! As long as everything is cached by the OS or
- the SQL server, things will only get marginally slower as the table
- grows. After the data gets too big to be cached, things will start to
- go much slower until your application is bound only by disk seeks
- (which increase by N log N). To avoid this, increase the index cache
- as the data grows. *Note Server parameters::.
- Speed of `SELECT' Queries
- -------------------------
- In general, when you want to make a slow `SELECT ... WHERE' faster, the
- first thing to check is whether or not you can add an index. *Note
- *MySQL* indexes: MySQL indexes. All references between different tables
- should usually be done with indexes. You can use the `EXPLAIN' command
- to determine which indexes are used for a `SELECT'. *Note `EXPLAIN':
- EXPLAIN.
- Some general tips:
- * To help *MySQL* optimize queries better, run `myisamchk --analyze'
- on a table after it has been loaded with relevant data. This
- updates a value for each index part that indicates the average
- number of rows that have the same value. (For unique indexes,
- this is always 1, of course.) *MySQL* will use this to decide
- which index to choose when you join two tables with 'a
- non-constant expression'. You can check the result from the
- `analyze' run by doing `SHOW INDEX FROM table_name' and examining
- the `Cardinality' column.
- * To sort an index and data according to an index, use `myisamchk
- --sort-index --sort-records=1' (if you want to sort on index 1).
- If you have a unique index from which you want to read all records
- in order according to that index, this is a good way to make that
- faster. Note, however, that this sorting isn't written optimally
- and will take a long time for a large table!
- How MySQL Optimizes `WHERE' Clauses
- -----------------------------------
- The `WHERE' optimizations are put in the `SELECT' part here because
- they are mostly used with `SELECT', but the same optimizations apply for
- `WHERE' in `DELETE' and `UPDATE' statements.
- Also note that this section is incomplete. *MySQL* does many
- optimizations, and we have not had time to document them all.
- Some of the optimizations performed by *MySQL* are listed below:
- * Removal of unnecessary parentheses:
- ((a AND b) AND c OR (((a AND b) AND (c AND d))))
- -> (a AND b AND c) OR (a AND b AND c AND d)
- * Constant folding:
- (a<b AND b=c) AND a=5
- -> b>5 AND b=c AND a=5
- * Constant condition removal (needed because of constant folding):
- (B>=5 AND B=5) OR (B=6 AND 5=5) OR (B=7 AND 5=6)
- -> B=5 OR B=6
- * Constant expressions used by indexes are evaluated only once.
- * `COUNT(*)' on a single table without a `WHERE' is retrieved
- directly from the table information. This is also done for any
- `NOT NULL' expression when used with only one table.
- * Early detection of invalid constant expressions. *MySQL* quickly
- detects that some `SELECT' statements are impossible and returns
- no rows.
- * `HAVING' is merged with `WHERE' if you don't use `GROUP BY' or
- group functions (`COUNT()', `MIN()'...).
- * For each sub-join, a simpler `WHERE' is constructed to get a fast
- `WHERE' evaluation for each sub-join and also to skip records as
- soon as possible.
- * All constant tables are read first, before any other tables in the
- query. A constant table is:
- - An empty table or a table with 1 row.
- - A table that is used with a `WHERE' clause on a `UNIQUE'
- index, or a `PRIMARY KEY', where all index parts are used
- with constant expressions and the index parts are defined as
- `NOT NULL'.
- All the following tables are used as constant tables:
- mysql> SELECT * FROM t WHERE primary_key=1;
- mysql> SELECT * FROM t1,t2
- WHERE t1.primary_key=1 AND t2.primary_key=t1.id;
- * The best join combination to join the tables is found by trying all
- possibilities. If all columns in `ORDER BY' and in `GROUP BY' come
- from the same table, then this table is preferred first when
- joining.
- * If there is an `ORDER BY' clause and a different `GROUP BY'
- clause, or if the `ORDER BY' or `GROUP BY' contains columns from
- tables other than the first table in the join queue, a temporary
- table is created.
- * If you use `SQL_SMALL_RESULT', *MySQL* will use an in-memory
- temporary table.
- * Each table index is queried, and the best index that spans fewer
- than 30% of the rows is used. If no such index can be found, a
- quick table scan is used.
- * In some cases, *MySQL* can read rows from the index without even
- consulting the data file. If all columns used from the index are
- numeric, then only the index tree is used to resolve the query.
- * Before each record is output, those that do not match the `HAVING'
- clause are skipped.
- Some examples of queries that are very fast:
- mysql> SELECT COUNT(*) FROM tbl_name;
- mysql> SELECT MIN(key_part1),MAX(key_part1) FROM tbl_name;
- mysql> SELECT MAX(key_part2) FROM tbl_name
- WHERE key_part1=constant;
- mysql> SELECT ... FROM tbl_name
- ORDER BY key_part1,key_part2,... LIMIT 10;
- mysql> SELECT ... FROM tbl_name
- ORDER BY key_part1 DESC,key_part2 DESC,... LIMIT 10;
- The following queries are resolved using only the index tree (assuming
- the indexed columns are numeric):
- mysql> SELECT key_part1,key_part2 FROM tbl_name WHERE key_part1=val;
- mysql> SELECT COUNT(*) FROM tbl_name
- WHERE key_part1=val1 AND key_part2=val2;
- mysql> SELECT key_part2 FROM tbl_name GROUP BY key_part1;
- The following queries use indexing to retrieve the rows in sorted order
- without a separate sorting pass:
- mysql> SELECT ... FROM tbl_name ORDER BY key_part1,key_part2,...
- mysql> SELECT ... FROM tbl_name ORDER BY key_part1 DESC,key_part2 DESC,...
- How MySQL Optimizes `DISTINCT'
- ------------------------------
- `DISTINCT' is converted to a `GROUP BY' on all columns. `DISTINCT'
- combined with `ORDER BY' will in many cases also need a temporary table.
- When combining `LIMIT #' with `DISTINCT', *MySQL* will stop as soon as
- it finds `#' unique rows.
- If you don't use columns from all used tables, *MySQL* will stop
- scanning the unused tables as soon as it has found the first match.
- SELECT DISTINCT t1.a FROM t1,t2 WHERE t1.a=t2.a;
- In this case, assuming t1 is used before t2 (check with `EXPLAIN'),
- *MySQL* will stop reading from t2 (for that particular row in t1) when
- the first row in t2 is found.
- How MySQL Optimizes `LEFT JOIN' and `RIGHT JOIN'
- ------------------------------------------------
- `A LEFT JOIN B' in *MySQL* is implemented as follows:
- * The table `B' is set to be dependent on table `A' and all tables
- that `A' is dependent on.
- * The table `A' is set to be dependent on all tables (except `B')
- that are used in the `LEFT JOIN' condition.
- * All `LEFT JOIN' conditions are moved to the `WHERE' clause.
- * All standard join optimizations are done, with the exception that
- a table is always read after all tables it is dependent on. If
- there is a circular dependency, *MySQL* will issue an error.
- * All standard `WHERE' optimizations are done.
- * If there is a row in `A' that matches the `WHERE' clause, but there
- wasn't any row in `B' that matched the `LEFT JOIN' condition, then
- an extra `B' row is generated with all columns set to `NULL'.
- * If you use `LEFT JOIN' to find rows that don't exist in some table
- and you have the test `column_name IS NULL' in the `WHERE' part,
- where column_name is a column that is declared as `NOT NULL', then
- *MySQL* will stop searching for more rows (for a particular key
- combination) after it has found one row that matches the `LEFT
- JOIN' condition.
- `RIGHT JOIN' is implemented analogously to `LEFT JOIN'.
- The table read order forced by `LEFT JOIN' and `STRAIGHT JOIN' will
- help the join optimizer (which calculates in which order tables should
- be joined) to do its work much more quickly, as there are fewer table
- permutations to check.
- Note that the above means that if you do a query of this type:
- SELECT * FROM a,b LEFT JOIN c ON (c.key=a.key) LEFT JOIN d ON (d.key=a.key) WHERE b.key=d.key
- *MySQL* will do a full scan on `b' as the `LEFT JOIN' will force it to
- be read before `d'.
- The fix in this case is to change the query to:
- SELECT * FROM b,a LEFT JOIN c ON (c.key=a.key) LEFT JOIN d ON (d.key=a.key) WHERE b.key=d.key
- How MySQL Optimizes `LIMIT'
- ---------------------------
- In some cases *MySQL* will handle the query differently when you are
- using `LIMIT #' and not using `HAVING':
- * If you are selecting only a few rows with `LIMIT', *MySQL* will
- use indexes in some cases when it normally would prefer to do a
- full table scan.
- * If you use `LIMIT #' with `ORDER BY', *MySQL* will end the sorting
- as soon as it has found the first `#' lines instead of sorting the
- whole table.
- * When combining `LIMIT #' with `DISTINCT', *MySQL* will stop as
- soon as it finds `#' unique rows.
- * In some cases a `GROUP BY' can be resolved by reading the key in
- order (or by doing a sort on the key) and then calculating
- summaries until the key value changes. In this case `LIMIT #'
- will not calculate any unnecessary `GROUP BY' summaries.
- * As soon as *MySQL* has sent the first `#' rows to the client, it
- will abort the query.
- * `LIMIT 0' will always quickly return an empty set. This is useful
- to check the query and to get the column types of the result
- columns.
- * The size of temporary tables is calculated using `LIMIT #' to
- determine how much space is needed to resolve the query.
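A couple of the points above can be illustrated (the table and column names are hypothetical):

```sql
-- LIMIT 0 returns an empty result quickly; useful for checking the
-- query and getting the types of the result columns:
SELECT id, name FROM customer LIMIT 0;

-- With ORDER BY plus LIMIT, sorting can stop once the first 10 rows
-- in order have been found, instead of sorting the whole table:
SELECT id, name FROM customer ORDER BY name LIMIT 10;
```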
- Speed of `INSERT' Queries
- -------------------------
- The time to insert a record consists approximately of:
- * Connect: (3)
- * Sending query to server: (2)
- * Parsing query: (2)
- * Inserting record: (1 x size of record)
- * Inserting indexes: (1 x number of indexes)
- * Close: (1)
- where the numbers are somewhat proportional to the overall time. This
- does not take into consideration the initial overhead to open tables
- (which is done once for each concurrently running query).
- The size of the table slows down the insertion of indexes by N log N
- (B-trees).
- Some ways to speed up inserts:
- * If you are inserting many rows from the same client at the same
- time, use `INSERT' statements with multiple value lists. This is
- much faster (many times faster in some cases) than using separate
- `INSERT' statements.
- * If you are inserting a lot of rows from different clients, you can
- get higher speed by using the `INSERT DELAYED' statement. *Note
- `INSERT': INSERT.
- * Note that with `MyISAM' you can insert rows at the same time
- `SELECT's are running if there are no deleted rows in the tables.
- * When loading a table from a text file, use `LOAD DATA INFILE'. This
- is usually 20 times faster than using a lot of `INSERT' statements.
- *Note `LOAD DATA': LOAD DATA.
- * It is possible with some extra work to make `LOAD DATA INFILE' run
- even faster when the table has many indexes. Use the following
- procedure:
- 1. Optionally create the table with `CREATE TABLE'. For example,
- using `mysql' or Perl-DBI.
- 2. Execute a `FLUSH TABLES' statement or the shell command
- `mysqladmin flush-tables'.
- 3. Use `myisamchk --keys-used=0 -rq /path/to/db/tbl_name'. This
- will remove all usage of all indexes from the table.
- 4. Insert data into the table with `LOAD DATA INFILE'. This will
- not update any indexes and will therefore be very fast.
- 5. If you are going to only read the table in the future, run
- `myisampack' on it to make it smaller. *Note Compressed
- format::.
- 6. Re-create the indexes with `myisamchk -r -q
- /path/to/db/tbl_name'. This will create the index tree in
- memory before writing it to disk, which is much faster
- because it avoids lots of disk seeks. The resulting index
- tree is also perfectly balanced.
- 7. Execute a `FLUSH TABLES' statement or the shell command
- `mysqladmin flush-tables'.
- This procedure will be built into `LOAD DATA INFILE' in some future
- version of MySQL.
- * You can speed up insertions by locking your tables:
- mysql> LOCK TABLES a WRITE;
- mysql> INSERT INTO a VALUES (1,23),(2,34),(4,33);
- mysql> INSERT INTO a VALUES (8,26),(6,29);
- mysql> UNLOCK TABLES;
- The main speed difference is that the index buffer is flushed to
- disk only once, after all `INSERT' statements have completed.
- Normally there would be as many index buffer flushes as there are
- different `INSERT' statements. Locking is not needed if you can
- insert all rows with a single statement.
- Locking will also lower the total time of multi-connection tests,
- but the maximum wait time for some threads will go up (because
- they wait for locks). For example:
- thread 1 does 1000 inserts
- threads 2, 3, and 4 do 1 insert each
- thread 5 does 1000 inserts
- If you don't use locking, 2, 3, and 4 will finish before 1 and 5.
- If you use locking, 2, 3, and 4 probably will not finish before 1
- or 5, but the total time should be about 40% faster.
- As `INSERT', `UPDATE', and `DELETE' operations are very fast in
- *MySQL*, you will obtain better overall performance by adding
- locks around everything that does more than about 5 inserts or
- updates in a row. If you do very many inserts in a row, you could
- do a `LOCK TABLES' followed by an `UNLOCK TABLES' once in a while
- (about every 1000 rows) to allow other threads access to the table.
- This would still result in a nice performance gain.
- Of course, `LOAD DATA INFILE' is much faster for loading data.
- To get some more speed for both `LOAD DATA INFILE' and `INSERT',
- enlarge the key buffer. *Note Server parameters::.
- Speed of `UPDATE' Queries
- -------------------------
- Update queries are optimized as a `SELECT' query with the additional
- overhead of a write. The speed of the write is dependent on the size of
- the data that is being updated and the number of indexes that are
- updated. Indexes that are not changed will not be updated.
- Another way to get fast updates is to delay updates and then do
- many updates in a row later. Doing many updates in a row is much
- quicker than doing one at a time if you lock the table.
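A minimal sketch of this batching idea (the table and column names are hypothetical):

```sql
-- Collect updates and apply them in one locked batch, so the index
-- buffer is flushed to disk only once when the lock is released:
LOCK TABLES hits WRITE;
UPDATE hits SET count=count+1 WHERE page='/index.html';
UPDATE hits SET count=count+5 WHERE page='/news.html';
UPDATE hits SET count=count+2 WHERE page='/about.html';
UNLOCK TABLES;
```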
- Note that, with dynamic record format, updating a record to a longer
- total length may split the record. So if you do this often, it is very
- important to `OPTIMIZE TABLE' sometimes. *Note `OPTIMIZE TABLE':
- OPTIMIZE TABLE.
- Speed of `DELETE' Queries
- -------------------------
- If you want to delete all rows in the table, you should use `TRUNCATE
- TABLE table_name'. *Note TRUNCATE::.
- The time to delete a record is exactly proportional to the number of
- indexes. To delete records more quickly, you can increase the size of
- the index cache. *Note Server parameters::.
- Other Optimization Tips
- =======================
- Unsorted tips for faster systems:
- * Use persistent connections to the database to avoid the connection
- overhead. If you can't use persistent connections and you are
- doing a lot of new connections to the database, you may want to
- change the value of the `thread_cache_size' variable. *Note Server
- parameters::.
- * Always check that all your queries really use the indexes you have
- created in the tables. In *MySQL* you can do this with the
- `EXPLAIN' command. *Note Explain: (manual)EXPLAIN.
- * Try to avoid complex `SELECT' queries on tables that are updated a
- lot. This is to avoid problems with table locking.
- * With the new `MyISAM' tables you can insert rows into a table
- without deleted rows at the same time another thread is reading
- from it. If this is important for you, you should consider methods
- where you don't have to delete rows, or run `OPTIMIZE TABLE' after
- you have deleted a lot of rows.
- * Use `ALTER TABLE ... ORDER BY expr1,expr2...' if you mostly
- retrieve rows in expr1,expr2.. order. By using this option after
- big changes to the table, you may be able to get higher
- performance.
- * In some cases it may make sense to introduce a column that is
- 'hashed' based on information from other columns. If this column
- is short and reasonably unique it may be much faster than a big
- index on many columns. In *MySQL* it's very easy to use this extra
- column: `SELECT * FROM table_name WHERE hash=MD5(CONCAT(col1,col2))
- AND col1='constant' AND col2='constant''
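A sketch of this technique, with hypothetical table and column names:

```sql
-- A short hash column stands in for a wide multi-column index.
CREATE TABLE lookup (
    col1 VARCHAR(100),
    col2 VARCHAR(100),
    hash CHAR(32),       -- MD5 of the concatenated columns
    INDEX (hash)
);

-- Keep the hash in sync when inserting:
INSERT INTO lookup (col1, col2, hash)
VALUES ('abc', 'def', MD5(CONCAT('abc','def')));

-- Search via the short hash index; re-check the real columns to
-- guard against hash collisions:
SELECT * FROM lookup
WHERE hash=MD5(CONCAT('abc','def'))
  AND col1='abc' AND col2='def';
```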
- * For tables that change a lot you should try to avoid all `VARCHAR'
- or `BLOB' columns. You will get dynamic row length as soon as you
- are using a single `VARCHAR' or `BLOB' column. *Note Table types::.
- * It's not normally useful to split a table into different tables
- just because the rows get 'big'. To access a row, the biggest
- performance hit is the disk seek to find the first byte of the
- row. After finding the data most new disks can read the whole row
- fast enough for most applications. The only cases where it really
- matters to split up a table are if it's a dynamic-row-size table
- (see above) that you can change to a fixed row size, or if you
- very often need to scan the table but do not need most of the
- columns. *Note Table types::.
- * If you very often need to calculate things based on information
- from a lot of rows (like counts of things), it's probably much
- better to introduce a new table and update the counter in real
- time. An update of the type `UPDATE tbl_name SET count=count+1
- WHERE index_column=constant' is very fast!
- This is really important when you use databases like *MySQL* that
- only have table locking (multiple readers / single writers). This
- will also give better performance with most databases, as the row
- locking manager in this case will have less to do.
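A sketch of the counter-table approach (names are hypothetical):

```sql
-- A small side table holds live counters, so the big table is never
-- scanned just to produce a count:
CREATE TABLE page_counter (
    page VARCHAR(100) NOT NULL,
    count INT NOT NULL DEFAULT 0,
    PRIMARY KEY (page)
);

-- Updating the counter in real time is a very cheap indexed update:
UPDATE page_counter SET count=count+1 WHERE page='/index.html';
```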
- * If you need to collect statistics from big log tables, use summary
- tables instead of scanning the whole table. Maintaining the
- summaries should be much faster than trying to do statistics
- 'live'. It's much faster to regenerate new summary tables from the
- logs when things change (depending on business decisions) than to
- have to change the running application!
- * If possible, one should classify reports as 'live' or
- 'statistical', where data needed for statistical reports are only
- generated based on summary tables that are generated from the
- actual data.
- * Take advantage of the fact that columns have default values. Insert
- values explicitly only when the value to be inserted differs from
- the default. This reduces the parsing that *MySQL* needs to do and
- improves the insert speed.
- * In some cases it's convenient to pack and store data into a blob.
- In this case you have to add some extra code in your application to
- pack/unpack things in the blob, but this may save a lot of
- accesses at some stage. This is practical when you have data that
- doesn't conform to a static table structure.
- * Normally you should try to keep all data non-redundant (what is
- called 3rd normal form in database theory), but you should not be
- afraid of duplicating things or creating summary tables if you
- need these to gain more speed.
- * Stored procedures or UDF (user-defined functions) may be a good
- way to get more performance. In this case you should, however,
- always have a way to do this some other (slower) way if you use
- some database that doesn't support this.
- * You can always gain something by caching queries/answers in your
- application and trying to do many inserts/updates at the same
- time. If your database supports lock tables (like *MySQL* and
- Oracle), this should help to ensure that the index cache is only
- flushed once after all updates.
- * Use `INSERT /*! DELAYED */' when you do not need to know when your
- data is written. This speeds things up because many records can be
- written with a single disk write.
- * Use `INSERT /*! LOW_PRIORITY */' when you want your selects to be
- more important.
- * Use `SELECT /*! HIGH_PRIORITY */' to get selects that jump the
- queue. That is, the select is done even if there is somebody
- waiting to do a write.
- * Use the multi-line `INSERT' statement to store many rows with one
- SQL command (many SQL servers support this).
- * Use `LOAD DATA INFILE' to load bigger amounts of data. This is
- faster than normal inserts and will be even faster when `myisamchk'
- is integrated in `mysqld'.
- * Use `AUTO_INCREMENT' columns to make unique values.
- * Use `OPTIMIZE TABLE' once in a while to avoid fragmentation when
- using dynamic table format. *Note `OPTIMIZE TABLE': OPTIMIZE TABLE.
- * Use `HEAP' tables to get more speed when possible. *Note Table
- types::.
- * When using a normal Web server setup, images should be stored as
- files. That is, store only a file reference in the database. The
- main reason for this is that a normal Web server is much better at
- caching files than database contents, so it's much easier to
- get a fast system if you are using files.
- * Use in-memory tables for non-critical data that is accessed often
- (like information about the last shown banner for users that don't
- have cookies).
- * Columns with identical information in different tables should be
- declared identical and have identical names. Before Version 3.23
- you got slow joins otherwise.
- Try to keep the names simple (use `name' instead of
- `customer_name' in the customer table). To make your names portable
- to other SQL servers you should keep them shorter than 18
- characters.
- * If you need REALLY high speed, you should take a look at the
- low-level interfaces for data storage that the different SQL
- servers support! For example, by accessing *MySQL* `MyISAM' tables
- directly, you could get a speed increase of 2-5 times compared to
- using the SQL interface. To be able to do this the data must be
- on the same server as the application, and usually it should only
- be accessed by one process (because external file locking is
- really slow). One could eliminate the above problems by
- introducing low-level `MyISAM' commands in the *MySQL* server
- (this could be one easy way to get more performance if needed).
- By carefully designing the database interface, it should be quite
- easy to support this type of optimization.
- * In many cases it's faster to access data from a database (using a
- live connection) than accessing a text file, just because the
- database is likely to be more compact than the text file (if you
- are using numerical data), and this will involve fewer disk
- accesses. You will also save code because you don't have to parse
- your text files to find line and column boundaries.
- * You can also use replication to speed things up. *Note
- Replication::.
- * Declaring a table with `DELAY_KEY_WRITE=1' will make the updating
- of indexes faster, as these are not logged to disk until the file
- is closed. The downside is that you should run `myisamchk' on
- these tables before you start `mysqld' to ensure that they are
- okay if something killed `mysqld' in the middle. As the key
- information can always be generated from the data, you should not
- lose anything by using `DELAY_KEY_WRITE'.
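A sketch of declaring such a table (the table and column names are hypothetical; `DELAY_KEY_WRITE' is a `MyISAM' table option):

```sql
-- Index changes are kept in memory and not written to disk until the
-- table is closed; run myisamchk on the table after a crash:
CREATE TABLE fast_log (
    id INT NOT NULL AUTO_INCREMENT,
    msg VARCHAR(200),
    PRIMARY KEY (id)
) DELAY_KEY_WRITE=1;
```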
- Using Your Own Benchmarks
- =========================
- You should definitely benchmark your application and database to find
- out where the bottlenecks are. By fixing one bottleneck (or by
- replacing it with a 'dummy module') you can then easily identify the
- next bottleneck (and so on). Even if the overall performance of your
- application is sufficient, you should at least make a plan for each
- bottleneck, and decide how to solve it if someday you really need the
- extra performance.
- For an example of portable benchmark programs, look at the *MySQL*
- benchmark suite. *Note *MySQL* Benchmarks: MySQL Benchmarks. You can
- take any program from this suite and modify it for your needs. By doing
- this, you can try different solutions to your problem and test which is
- really the fastest solution for you.
- It is very common that some problems occur only when the system is
- very heavily loaded. We have had many customers who contact us when
- they have a (tested) system in production and have encountered load
- problems. In every one of these cases so far, the problems have been
- with basic design (table scans are NOT good at high load) or
- OS/library issues. Most of these would be a *LOT* easier to fix if
- the systems were not already in production.
- To avoid problems like this, you should put some effort into
- benchmarking your whole application under the worst possible load! You
- can use Sasha's recent hack for this - super-smack
- (http://www.mysql.com/Downloads/super-smack/super-smack-1.0.tar.gz).
- As the name suggests, it can bring your system to its knees if you
- ask it to, so make sure to use it only on your development systems.
- Design Choices
- ==============
- *MySQL* keeps row data and index data in separate files. Many (almost
- all) other databases mix row and index data in the same file. We
- believe that the *MySQL* choice is better for a very wide range of
- modern systems.
- Another way to store the row data is to keep the information for each
- column in a separate area (examples are SDBM and Focus). This will
- cause a performance hit for every query that accesses more than one
- column. Because this degenerates so quickly when more than one column
- is accessed, we believe that this model is not good for general purpose
- databases.
- The more common case is that the index and data are stored together
- (like in Oracle/Sybase et al). In this case you will find the row
- information at the leaf page of the index. The good thing about this
- layout is that, in many cases (depending on how well the index is
- cached), it saves a disk read. The bad things with this layout are:
- * Table scanning is much slower because you have to read through the
- indexes to get at the data.
- * You can't use only the index table to retrieve data for a query.
- * You lose a lot of space, as you must duplicate indexes from the
- nodes (as you can't store the row in the nodes).
- * Deletes will degenerate the table over time (as indexes in nodes
- are usually not updated on delete).
- * It's harder to cache ONLY the index data.
- MySQL Design Limitations/Tradeoffs
- ==================================
- Because *MySQL* uses extremely fast table locking (multiple readers /
- single writers) the biggest remaining problem is a mix of a steady
- stream of inserts and slow selects on the same table.
- We believe that for a huge number of systems the extremely fast
- performance in other cases makes this choice a win. This case can
- usually also be solved by having multiple copies of the table, but
- it takes more effort and hardware.
- We are also working on some extensions to solve this problem for some
- common application niches.
- Portability
- ===========
- Because all SQL servers implement different parts of SQL, it takes work
- to write portable SQL applications. For very simple selects/inserts it
- is very easy, but the more you need the harder it gets. If you want an
- application that is fast with many databases it becomes even harder!
- To make a complex application portable you need to choose a number of
- SQL servers that it should work with.
- You can use the *MySQL* crash-me program/web-page
- `http://www.mysql.com/information/crash-me.php' to find functions,
- types, and limits you can use with a selection of database servers.
- Crash-me now tests far from everything possible, but it is still
- comprehensive with about 450 things tested.
- For example, you shouldn't have column names longer than 18 characters
- if you want to be able to use Informix or DB2.
- Both the *MySQL* benchmarks and crash-me programs are very
- database-independent. By taking a look at how we have handled this, you
- can get a feeling for what you have to do to write your application
- database-independent. The benchmarks themselves can be found in the
- `sql-bench' directory in the *MySQL* source distribution. They are
- written in Perl with DBI database interface (which solves the access
- part of the problem).
- See `http://www.mysql.com/information/benchmarks.html' for the results
- from this benchmark.
- As you can see in these results, all databases have some weak points.
- That is, they have different design compromises that lead to different
- behavior.
- If you strive for database independence, you need to get a good feeling
- for each SQL server's bottlenecks. *MySQL* is VERY fast in retrieving
- and updating things, but will have a problem mixing slow readers
- and writers on the same table. Oracle, on the other hand, has a big
- problem when you try to access rows that you have recently updated
- (until they are flushed to disk). Transaction databases in general are
- not very good at generating summary tables from log tables, as in this
- case row locking is almost useless.
- To get your application _really_ database-independent, you need to
- define an easily extensible interface through which you manipulate your
- data. As C++ is available on most systems, it makes sense to use an
- interface of C++ classes to the databases.
- If you use some specific feature for some database (like the `REPLACE'
- command in *MySQL*), you should code a method for the other SQL servers
- to implement the same feature (but slower). With *MySQL* you can use
- the `/*! */' syntax to add *MySQL*-specific keywords to a query. The
- code inside `/**/' will be treated as a comment (ignored) by most other
- SQL servers.
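- The effect of the `/*! */' syntax on a portability layer can be
- sketched as follows (a Python sketch; the `adapt_query' helper and its
- regular expressions are illustrative assumptions, not part of any
- *MySQL* API):

```python
import re

# MySQL executes the contents of /*! ... */ version comments; most other
# SQL servers treat the whole span as an ordinary comment and ignore it.
VERSION_COMMENT = re.compile(r"/\*!\s?(.*?)\s?\*/", re.DOTALL)

def adapt_query(query, server_is_mysql):
    """Keep the contents of /*! ... */ for MySQL, drop them otherwise.
    (Hypothetical helper; assumes non-nested version comments.)"""
    if server_is_mysql:
        # Unwrap the comment so the MySQL-specific keywords take effect.
        return VERSION_COMMENT.sub(lambda m: m.group(1), query)
    # Remove the whole span for servers that would only ignore it anyway.
    return re.sub(r"\s*/\*!.*?\*/", "", query, flags=re.DOTALL)
```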
- If really high performance is more important than exactness, as in some
- Web applications, a possibility is to create an application layer that
- caches all results to give you even higher performance. By letting old
- results 'expire' after a while, you can keep the cache reasonably
- fresh. This is quite nice in case of extremely high load, in which case
- you can dynamically increase the cache and set the expire timeout higher
- until things get back to normal.
- In this case, the table creation information should contain information
- about the initial size of the cache and how often the table should
- normally be refreshed.
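- The expiring-cache idea above can be sketched like this (a minimal
- Python sketch under the assumption of a cache keyed by the query
- string; the `ExpiringCache' class is hypothetical):

```python
import time

class ExpiringCache:
    """Cache query results and let them 'expire' after expire_timeout
    seconds, as described above. (Illustrative sketch only.)"""
    def __init__(self, expire_timeout, clock=time.time):
        self.expire_timeout = expire_timeout  # can be raised under high load
        self.clock = clock                    # injectable for testing
        self.store = {}                       # query -> (timestamp, result)

    def get(self, query, run_query):
        now = self.clock()
        entry = self.store.get(query)
        if entry is not None and now - entry[0] < self.expire_timeout:
            return entry[1]                   # still fresh: serve cached result
        result = run_query(query)             # stale or missing: hit the database
        self.store[query] = (now, result)
        return result
```

- Under extremely high load you would dynamically raise
- `expire_timeout' until things get back to normal, as described above.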
- What Have We Used MySQL For?
- ============================
- During *MySQL*'s initial development, the features of *MySQL* were
- made to fit our largest customer, which handles data warehousing for
- a couple of the biggest retailers in Sweden.
- From all stores, we get weekly summaries of all bonus card transactions,
- and we are expected to provide useful information for the store owners
- to help them find how their advertisement campaigns are affecting their
- customers.
- The amount of data is quite huge (about 7 million summary transactions
- per month), and we have data for 4-10 years that we need to present to
- the users. We get weekly requests from the customers, who want
- 'instant' access to new reports from this data.
- We solved this by storing all information per month in compressed
- 'transaction' tables. We have a set of simple macros (script) that
- generates summary tables grouped by different criteria (product group,
- customer id, store ...) from the transaction tables. The reports are
- Web pages that are dynamically generated by a small Perl script that
- parses a Web page, executes the SQL statements in it, and inserts the
- results. We would have used PHP or mod_perl instead but they were not
- available at that time.
- For graphical data we wrote a simple tool in `C' that can produce GIFs
- based on the result of a SQL query (with some processing of the
- result). This is also dynamically executed from the Perl script that
- parses the `HTML' files.
- In most cases a new report can simply be done by copying an existing
- script and modifying the SQL query in it. In some cases, we will need
- to add more fields to an existing summary table or generate a new one,
- but this is also quite simple, as we keep all transaction tables on
- disk. (Currently we have at least 50G of transaction tables and 200G
- of other customer data.)
- We also let our customers access the summary tables directly with ODBC
- so that the advanced users can themselves experiment with the data.
- We haven't had any problems handling this with a quite modest Sun Ultra
- SPARCstation (2x200 MHz). We recently upgraded one of our servers to a
- 2-CPU 400 MHz UltraSPARC, and we are now planning to start handling
- transactions on the product level, which would mean a ten-fold increase
- of data. We think we can keep up with this by just adding more disk to
- our systems.
- We are also experimenting with Intel-Linux to be able to get more CPU
- power more cheaply. Now that we have the binary portable database
- format (new in Version 3.23), we will start to use this for some parts
- of the application.
- Our initial feelings are that Linux will perform much better on
- low-to-medium load and Solaris will perform better when you start to
- get a high load because of extreme disk IO, but we don't yet have
- anything conclusive about this. After some discussion with a Linux
- kernel developer, this might be a side effect of Linux giving so many
- resources to the batch job that the interactive performance gets very
- low. This makes the machine feel very slow and unresponsive while big
- batch jobs are running. Hopefully this will be handled better in future
- Linux kernels.
- The MySQL Benchmark Suite
- *************************
- This should contain a technical description of the *MySQL* benchmark
- suite (and `crash-me'), but that description is not written yet.
- Currently, you can get a good idea of the benchmark by looking at the
- code and results in the `sql-bench' directory in any *MySQL* source
- distribution.
- This benchmark suite is meant to be a benchmark that will tell any user
- what things a given SQL implementation performs well or poorly at.
- Note that this benchmark is single threaded, so it measures the minimum
- time for the operations. In the future, we plan to add many
- multi-threaded tests to the benchmark suite.
- For example (run on the same NT 4.0 machine):
- *Reading 2000000 rows by index* *Seconds* *Seconds*
- mysql 367 249
- mysql_odbc 464
- db2_odbc 1206
- informix_odbc 121126
- ms-sql_odbc 1634
- oracle_odbc 20800
- solid_odbc 877
- sybase_odbc 17614
- *Inserting (350768) rows* *Seconds* *Seconds*
- mysql 381 206
- mysql_odbc 619
- db2_odbc 3460
- informix_odbc 2692
- ms-sql_odbc 4012
- oracle_odbc 11291
- solid_odbc 1801
- sybase_odbc 4802
- In the above test *MySQL* was run with an 8M index cache.
- We have gathered some more benchmark results at
- `http://www.mysql.com/information/benchmarks.html'.
- Note that Oracle is not included because they asked to be removed. All
- Oracle benchmarks have to be approved by Oracle! We believe that makes
- Oracle benchmarks *VERY* biased because the above benchmarks are
- supposed to show what a standard installation can do for a single
- client.
- To run the benchmark suite, you have to download a *MySQL* source
- distribution, install the Perl DBI module and the Perl DBD driver for
- the database you want to test, and then do:
- cd sql-bench
- perl run-all-tests --server=#
- where # is one of the supported servers. You can get a list of all
- options and supported servers by doing `run-all-tests --help'.
- `crash-me' tries to determine what features a database supports and
- what its capabilities and limitations are by actually running queries.
- For example, it determines:
- * What column types are supported
- * How many indexes are supported
- * What functions are supported
- * How big a query can be
- * How big a `VARCHAR' column can be
- You can find the results from crash-me on a lot of different databases
- at `http://www.mysql.com/information/crash-me.php'.
- MySQL Utilities
- ***************
- Overview of the Different MySQL Programs
- ========================================
- All *MySQL* clients that communicate with the server using the
- `mysqlclient' library use the following environment variables:
- *Name* *Description*
- `MYSQL_UNIX_PORT' The default socket; used for connections to
- `localhost'
- `MYSQL_TCP_PORT' The default TCP/IP port
- `MYSQL_PWD' The default password
- `MYSQL_DEBUG' Debug-trace options when debugging
- `TMPDIR' The directory where temporary tables/files are created
- Use of `MYSQL_PWD' is insecure. *Note Connecting::.
- The `mysql' client uses the file named in the `MYSQL_HISTFILE'
- environment variable to save the command-line history. The default
- value for the history file is `$HOME/.mysql_history', where `$HOME' is
- the value of the `HOME' environment variable. *Note Environment
- variables::.
- All *MySQL* programs take many different options. However, every
- *MySQL* program provides a `--help' option that you can use to get a
- full description of the program's different options. For example, try
- `mysql --help'.
- You can override default options for all standard client programs with
- an option file. *Note Option files::.
- The list below briefly describes the *MySQL* programs:
- `myisamchk'
- Utility to describe, check, optimize, and repair *MySQL* tables.
- Because `myisamchk' has many functions, it is described in its own
- chapter. *Note Maintenance::.
- `make_binary_distribution'
- Makes a binary release of a compiled *MySQL*. This could be sent
- by FTP to `/pub/mysql/Incoming' on `support.mysql.com' for the
- convenience of other *MySQL* users.
- `msql2mysql'
- A shell script that converts `mSQL' programs to *MySQL*. It doesn't
- handle all cases, but it gives a good start when converting.
- `mysqlaccess'
- A script that checks the access privileges for a host, user, and
- database combination.
- `mysqladmin'
- Utility for performing administrative operations, such as creating
- or dropping databases, reloading the grant tables, flushing tables
- to disk, and reopening log files. `mysqladmin' can also be used
- to retrieve version, process, and status information from the
- server. *Note `mysqladmin': mysqladmin.
- `mysqlbug'
- The *MySQL* bug report script. This script should always be used
- when filing a bug report to the *MySQL* list.
- `mysqld'
- The SQL daemon. This should always be running.
- `mysqldump'
- Dumps a *MySQL* database into a file as SQL statements or as
- tab-separated text files. Enhanced freeware originally by Igor
- Romanenko. *Note `mysqldump': mysqldump.
- `mysqlimport'
- Imports text files into their respective tables using `LOAD DATA
- INFILE'. *Note `mysqlimport': mysqlimport.
- `mysqlshow'
- Displays information about databases, tables, columns, and indexes.
- `mysql_install_db'
- Creates the *MySQL* grant tables with default privileges. This is
- usually executed only once, when first installing *MySQL* on a
- system.
- `replace'
- A utility program that is used by `msql2mysql', but that has more
- general applicability as well. `replace' changes strings in place
- in files or on the standard input. Uses a finite state machine to
- match longer strings first. Can be used to swap strings. For
- example, this command swaps `a' and `b' in the given files:
- shell> replace a b b a -- file1 file2 ...
- safe_mysqld, the wrapper around mysqld
- ======================================
- `safe_mysqld' is the recommended way to start a `mysqld' daemon on
- Unix. `safe_mysqld' adds some safety features such as restarting the
- server when an error occurs and logging run-time information to a log
- file.
- Normally one should never edit the `safe_mysqld' script, but instead
- put the options to `safe_mysqld' in the `[safe_mysqld]' section in the
- `my.cnf' file. `safe_mysqld' will read all options from the `[mysqld]',
- `[server]' and `[safe_mysqld]' sections from the option files. *Note
- Option files::.
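- The way the option groups are read and merged can be sketched roughly
- as follows (a much-simplified Python sketch; real option files also
- allow options without values, which this illustrative parser skips):

```python
def read_groups(text, wanted_groups):
    """Collect 'key = value' options from the wanted groups, applying
    the groups in the order asked for so that later groups override
    earlier ones (here: [mysqld], then [server], then [safe_mysqld])."""
    groups = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue                      # skip blanks and comments
        if line.startswith('[') and line.endswith(']'):
            current = line[1:-1]          # start of a new option group
            groups.setdefault(current, {})
        elif '=' in line and current is not None:
            key, _, value = line.partition('=')
            groups[current][key.strip()] = value.strip()
    merged = {}
    for name in wanted_groups:
        merged.update(groups.get(name, {}))
    return merged
```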
- Note that all options on the command line to `safe_mysqld' are passed
- to `mysqld'. If you want to use any options in `safe_mysqld' that
- `mysqld' doesn't support, you must specify these in the option file.
- Most of the options to `safe_mysqld' are the same as the options to
- `mysqld'. *Note Command-line options::.
- `safe_mysqld' supports the following options:
- `--basedir=path'
- `--core-file-size=#'
- Size of the core file `mysqld' should be able to create. Passed to
- `ulimit -c'.
- `--datadir=path'
- `--defaults-extra-file=path'
- `--defaults-file=path'
- `--err-log=path'
- `--ledir=path'
- Path to `mysqld'
- `--log=path'
- `--mysqld=mysqld-version'
- Name of the mysqld version in the `ledir' directory you want to
- start.
- `--no-defaults'
- `--open-files-limit=#'
- Number of files `mysqld' should be able to open. Passed to `ulimit
- -n'. Note that you need to start `safe_mysqld' as root for this to
- work properly!
- `--pid-file=path'
- `--port=#'
- `--socket=path'
- `--timezone=#'
- Set the timezone (the `TZ' variable) to the value of this
- parameter.
- `--user=#'
- The `safe_mysqld' script is written so that it normally is able to start
- a server that was installed from either a source or a binary version of
- *MySQL*, even if these install the server in slightly different
- locations. `safe_mysqld' expects one of these conditions to be true:
- * The server and databases can be found relative to the directory
- from which `safe_mysqld' is invoked. `safe_mysqld' looks under
- its working directory for `bin' and `data' directories (for binary
- distributions) or for `libexec' and `var' directories (for source
- distributions). This condition should be met if you execute
- `safe_mysqld' from your *MySQL* installation directory (for
- example, `/usr/local/mysql' for a binary distribution).
- * If the server and databases cannot be found relative to the
- working directory, `safe_mysqld' attempts to locate them by
- absolute pathnames. Typical locations are `/usr/local/libexec'
- and `/usr/local/var'. The actual locations were determined when
- the distribution from which `safe_mysqld' comes was built. They
- should be correct if *MySQL* was installed in a standard location.
- Because `safe_mysqld' will try to find the server and databases relative
- to its own working directory, you can install a binary distribution of
- *MySQL* anywhere, as long as you start `safe_mysqld' from the *MySQL*
- installation directory:
- shell> cd mysql_installation_directory
- shell> bin/safe_mysqld &
- If `safe_mysqld' fails, even when invoked from the *MySQL* installation
- directory, you can modify it to use the path to `mysqld' and the
- pathname options that are correct for your system. Note that if you
- upgrade *MySQL* in the future, your modified version of `safe_mysqld'
- will be overwritten, so you should make a copy of your edited version
- that you can reinstall.
- mysqld_multi, program for managing multiple *MySQL* servers
- ===========================================================
- `mysqld_multi' is meant for managing several `mysqld' processes running
- on different Unix sockets and TCP/IP ports.
- The program searches for group(s) named `[mysqld#]' in `my.cnf' (or in
- the file given with `--config-file=...'), where # can be any positive
- number starting from 1. These groups take the same options as the usual
- `[mysqld]' group (e.g. options to `mysqld'; see the *MySQL* manual for
- detailed information about this group), but with the port, socket, and
- other options that are wanted for each separate `mysqld' process. The
- number in the group name has another function; it can be used for
- starting, stopping, or reporting specific `mysqld' servers with this
- program. See the usage and options below for more information.
- Usage: mysqld_multi [OPTIONS] {start|stop|report} [GNR,GNR,GNR...]
- or mysqld_multi [OPTIONS] {start|stop|report} [GNR-GNR,GNR,GNR-GNR,...]
- The GNR above means the group number. You can start, stop, or report
- any GNR, or several of them at the same time. (See `--example'.) The
- GNR list can be comma separated or combined with a dash, where the
- latter means that all GNRs between GNR1 and GNR2 will be affected.
- Without a GNR argument, all groups found will be either started,
- stopped, or reported. Note that the GNR list must not contain any
- whitespace; anything after a whitespace is ignored.
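- The GNR list syntax described above can be sketched as follows (a
- Python sketch; the parsing details are an illustrative reconstruction
- of the description, not the `mysqld_multi' source code):

```python
def parse_gnr_list(arg):
    """Expand a GNR list such as '1-3,5' into [1, 2, 3, 5].
    Anything after the first whitespace is ignored, as described above."""
    parts = arg.split()
    arg = parts[0] if parts else ''       # drop everything after whitespace
    numbers = []
    for part in arg.split(','):
        if not part:
            continue
        if '-' in part:                   # dash form: expand GNR1-GNR2
            lo, hi = part.split('-', 1)
            numbers.extend(range(int(lo), int(hi) + 1))
        else:                             # plain group number
            numbers.append(int(part))
    return numbers
```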
- `mysqld_multi' supports the following options:
- `--config-file=...'
- Alternative config file. NOTE: This will not affect this program's
- own options (group `[mysqld_multi]'), but only the `[mysqld#]' groups.
- Without this option, everything is read from the ordinary
- `my.cnf' file.
- `--example'
- Give an example of a config file.
- `--help'
- Print this help and exit.
- `--log=...'
- Log file. Full path to and the name for the log file. NOTE: If the
- file exists, everything will be appended.
- `--mysqladmin=...'
- `mysqladmin' binary to be used for a server shutdown.
- `--mysqld=...'
- `mysqld' binary to be used. Note that you can give `safe_mysqld'
- to this option also. The options are passed to `mysqld'. Just make
- sure you have `mysqld' in your environment variable `PATH' or fix
- `safe_mysqld'.
- `--no-log'
- Print to stdout instead of the log file. By default the log file is
- turned on.
- `--password=...'
- Password for user for `mysqladmin'.
- `--tcp-ip'
- Connect to the *MySQL* server(s) via the TCP/IP port instead of
- the UNIX socket. This affects stopping and reporting. If a socket
- file is missing, the server may still be running, but can be
- accessed only via the TCP/IP port. By default connecting is done
- via the UNIX socket.
- `--user=...'
- *MySQL* user for `mysqladmin'.
- `--version'
- Print the version number and exit.
- Some notes about `mysqld_multi':
- * Make sure that the *MySQL* user who is stopping the `mysqld'
- servers (e.g. using `mysqladmin') has the same username and
- password in all the data directories accessed (in the 'mysql'
- database), and make sure that the user has the 'Shutdown_priv'
- privilege! If you have many data directories and many different
- 'mysql' databases with different passwords for the *MySQL* 'root'
- user, you may want to create a common 'multi_admin' user for each,
- using the same password (see below). Example of how to do it:
- shell> mysql -u root -S /tmp/mysql.sock -proot_password -e
- "GRANT SHUTDOWN ON *.* TO multi_admin@localhost IDENTIFIED BY 'multipass'"
- *Note Privileges::.
- You will have to do the above for each `mysqld' running in each
- data directory that you have (just change the socket, `-S=...').
- * `pid-file' is very important if you are using `safe_mysqld' to
- start `mysqld' (e.g. `--mysqld=safe_mysqld'). Every `mysqld' should
- have its own `pid-file'. The advantage of using `safe_mysqld'
- instead of `mysqld' directly here is that `safe_mysqld' 'guards'
- every `mysqld' process and will restart it if a `mysqld' process
- fails due to a signal such as kill -9, or something similar (like a
- segmentation fault, which *MySQL* should never do, of course ;).
- Please note that the `safe_mysqld' script may require that you
- start it from a certain place. This means that you may have to cd
- to a certain directory before you start `mysqld_multi'. If you
- have problems starting, please see the `safe_mysqld' script. Check
- especially the lines:
- --------------------------------------------------------------------------
- MY_PWD=`pwd`
- # Check if we are starting this relative (for the binary release)
- if test -d /data/mysql -a -f ./share/mysql/english/errmsg.sys \
-    -a -x ./bin/mysqld
- --------------------------------------------------------------------------
- *Note safe_mysqld::.
- The above test should be successful, or you may encounter problems.
- * Beware of the dangers starting multiple `mysqlds' in the same data
- directory. Use separate data directories, unless you *KNOW* what
- you are doing!
- * The socket file and the TCP/IP port must be different for every
- `mysqld'.
- * The first and fifth `mysqld' groups were intentionally left out of
- the example. You may have 'gaps' in the config file. This gives
- you more flexibility. The order in which the `mysqlds' are
- started or stopped depends on the order in which they appear in
- the config file.
- * When you want to refer to a certain group using GNR with this
- program, just use the number at the end of the group name
- (`[mysqld#]').
- * You may want to use the option `--user' for `mysqld', but in order
- to do this you need to be root when you start the `mysqld_multi'
- script. Having the option in the config file doesn't matter; you
- will just get a warning if you are not the superuser and the
- `mysqlds' are started under *YOUR* UNIX account. *IMPORTANT*: Make
- sure that the `pid-file' and the data directory are readable and
- writable (and, for the latter, executable) for *THAT* UNIX user
- that the specific `mysqld' process is started as. *DON'T* use
- the UNIX root account for this, unless you *KNOW* what you are
- doing!
- * *MOST IMPORTANT*: Make sure that you understand the meanings of
- the options that are passed to the `mysqlds' and *WHY* you would
- want to have separate `mysqld' processes. Starting multiple
- `mysqlds' in one data directory *WILL NOT* give you extra
- performance in a threaded system!
- *Note Multiple servers::.
- This is an example of a config file for use with `mysqld_multi'.
- # This file should probably be in your home dir (~/.my.cnf) or /etc/my.cnf
- # Version 2.1 by Jani Tolonen
-
- [mysqld_multi]
- mysqld = /usr/local/bin/safe_mysqld
- mysqladmin = /usr/local/bin/mysqladmin
- user = multi_admin
- password = multipass
-
- [mysqld2]
- socket = /tmp/mysql.sock2
- port = 3307
- pid-file = /usr/local/mysql/var2/hostname.pid2
- datadir = /usr/local/mysql/var2
- language = /usr/local/share/mysql/english
- user = john
-
- [mysqld3]
- socket = /tmp/mysql.sock3
- port = 3308
- pid-file = /usr/local/mysql/var3/hostname.pid3
- datadir = /usr/local/mysql/var3
- language = /usr/local/share/mysql/swedish
- user = monty
-
- [mysqld4]
- socket = /tmp/mysql.sock4
- port = 3309
- pid-file = /usr/local/mysql/var4/hostname.pid4
- datadir = /usr/local/mysql/var4
- language = /usr/local/share/mysql/estonia
- user = tonu
-
- [mysqld6]
- socket = /tmp/mysql.sock6
- port = 3311
- pid-file = /usr/local/mysql/var6/hostname.pid6
- datadir = /usr/local/mysql/var6
- language = /usr/local/share/mysql/japanese
- user = jani
- *Note Option files::.
- The Command-line Tool
- =====================
- `mysql' is a simple SQL shell (with GNU `readline' capabilities). It
- supports interactive and non-interactive use. When used interactively,
- query results are presented in an ASCII-table format. When used
- non-interactively (for example, as a filter), the result is presented in
- tab-separated format. (The output format can be changed using
- command-line options.) You can run scripts simply like this:
- shell> mysql database < script.sql > output.tab
- If you have problems due to insufficient memory in the client, use the
- `--quick' option! This forces `mysql' to use `mysql_use_result()'
- rather than `mysql_store_result()' to retrieve the result set.
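- The difference between the two fetch modes can be sketched in Python
- (a sketch only: the generator versus the list merely mimics
- `mysql_use_result()' and `mysql_store_result()'; this is not the
- client library API):

```python
def store_result(row_source):
    """Like mysql_store_result(): read the whole result set into client
    memory at once, so memory use grows with the number of rows."""
    return list(row_source)

def use_result(row_source):
    """Like mysql_use_result() (the --quick behavior): hand rows to the
    caller one at a time, keeping only one row in client memory."""
    for row in row_source:
        yield row
```

- The trade-off is the one described above: streaming saves client
- memory but holds the result open on the server for longer.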
- Using `mysql' is very easy. Just start it as follows: `mysql database'
- or `mysql --user=user_name --password=your_password database'. Type a
- SQL statement, end it with `;', `\g', or `\G' and press RETURN/ENTER.
- `mysql' supports the following options:
- `-?, --help'
- Display this help and exit.
- `-A, --no-auto-rehash'
- No automatic rehashing. One has to use 'rehash' to get table and
- field completion. This gives a quicker start of mysql.
- `-B, --batch'
- Print results with a tab as separator, each row on a new line.
- Doesn't use history file.
- `--character-sets-dir=...'
- Directory where character sets are located.
- `-C, --compress'
- Use compression in server/client protocol.
- `-#, --debug[=...]'
- Debug log. Default is 'd:t:o,/tmp/mysql.trace'.
- `-D, --database=...'
- Database to use. This is mainly useful in the `my.cnf' file.
- `--default-character-set=...'
- Set the default character set.
- `-e, --execute=...'
- Execute command and quit. (Output like with `--batch'.)
- `-E, --vertical'
- Print the output of a query (rows) vertically. Without this option
- you can also force this output by ending your statements with `\G'.
- `-f, --force'
- Continue even if we get a SQL error.
- `-g, --no-named-commands'
- Named commands are disabled. Use the \* form only, or use named
- commands only at the beginning of a line ending with a semicolon
- (;). Since Version 10.9, the client starts with this option
- ENABLED by default! With the `-g' option, long format commands will
- still work from the first line, however.
- `-G, --enable-named-commands'
- Named commands are *enabled*. Long format commands are allowed as
- well as shortened * commands.
- `-i, --ignore-space'
- Ignore space after function names.
- `-h, --host=...'
- Connect to the given host.
- `-H, --html'
- Produce HTML output.
- `-L, --skip-line-numbers'
- Don't write line numbers for errors. Useful when one wants to
- compare result files that include error messages.
- `--no-pager'
- Disable pager and print to stdout. See interactive help (\h) also.
- `--no-tee'
- Disable outfile. See interactive help (\h) also.
- `-n, --unbuffered'
- Flush buffer after each query.
- `-N, --skip-column-names'
- Don't write column names in results.
- `-O, --set-variable var=option'
- Give a variable a value. `--help' lists variables.
- `-o, --one-database'
- Only update the default database. This is useful for skipping
- updates to other databases in the update log.
- `--pager[=...]'
- Output type. Default is your `ENV' variable `PAGER'. Valid pagers
- are less, more, cat [> filename], etc. See interactive help (\h)
- also. This option does not work in batch mode. The pager works
- only in UNIX.
- `-p[password], --password[=...]'
- Password to use when connecting to server. If a password is not
- given on the command line, you will be prompted for it. Note that
- if you use the short form `-p' you can't have a space between the
- option and the password.
- `-P --port=...'
- TCP/IP port number to use for connection.
- `-q, --quick'
- Don't cache result, print it row-by-row. This may slow down the
- server if the output is suspended. Doesn't use history file.
- `-r, --raw'
- Write column values without escape conversion. Used with `--batch'.
- `-s, --silent'
- Be more silent.
- `-S --socket=...'
- Socket file to use for connection.
- `-t --table'
- Output in table format. This is default in non-batch mode.
- `-T, --debug-info'
- Print some debug information at exit.
- `--tee=...'
- Append everything into the outfile. See interactive help (\h) also.
- Does not work in batch mode.
- `-u, --user=#'
- User for login if not current user.
- `-U, --safe-updates[=#], --i-am-a-dummy[=#]'
- Only allow `UPDATE' and `DELETE' statements that use keys. See
- below for more information about this option. You can reset this
- option if you have it in your `my.cnf' file by using
- `--safe-updates=0'.
- `-v, --verbose'
- More verbose output (-v -v -v gives the table output format).
- `-V, --version'
- Output version information and exit.
- `-w, --wait'
- Wait and retry if connection is down instead of aborting.
- You can also set the following variables with `-O' or `--set-variable':
- Variablename Default Description
- connect_timeout 0 Number of seconds before connection
- timeout.
- max_allowed_packet 16777216 Max packet length to send to or
- receive from the server
- net_buffer_length 16384 Buffer for TCP/IP and socket
- communication
- select_limit 1000 Automatic limit for SELECT when
- using --i-am-a-dummy
- max_join_size 1000000 Automatic limit for rows in a join
- when using --i-am-a-dummy.
- If you type 'help' on the command line, `mysql' will print out the
- commands that it supports:
- mysql> help
-
- MySQL commands:
- help (\h) Display this text.
- ? (\h) Synonym for `help'.
- clear (\c) Clear command.
- connect (\r) Reconnect to the server. Optional arguments are db and host.
- edit (\e) Edit command with $EDITOR.
- ego (\G) Send command to mysql server, display result vertically.
- exit (\q) Exit mysql. Same as quit.
- go (\g) Send command to mysql server.
- nopager (\n) Disable pager, print to stdout.
- notee (\t) Don't write into outfile.
- pager (\P) Set PAGER [to_pager]. Print the query results via PAGER.
- print (\p) Print current command.
- quit (\q) Quit mysql.
- rehash (\#) Rebuild completion hash.
- source (\.) Execute a SQL script file. Takes a file name as an argument.
- status (\s) Get status information from the server.
- tee (\T) Set outfile [to_outfile]. Append everything into given outfile.
- use (\u) Use another database. Takes database name as argument.
- As noted above, `pager' works only in UNIX.
- The `status' command gives you some information about the connection
- and the server you are using. If you are running in the
- `--safe-updates' mode, `status' will also print the values for the
- `mysql' variables that affect your queries.
- A useful startup option for beginners (introduced in *MySQL* Version
- 3.23.11) is `--safe-updates' (or `--i-am-a-dummy' for users who have at
- some time done a `DELETE FROM table_name' but forgot the `WHERE'
- clause). When using this option, `mysql' sends the following command to
- the *MySQL* server when opening the connection:
- SET SQL_SAFE_UPDATES=1,SQL_SELECT_LIMIT=#select_limit#,
- SQL_MAX_JOIN_SIZE=#max_join_size#
- where `#select_limit#' and `#max_join_size#' are variables that can be
- set from the `mysql' command line. *Note `SET': SET OPTION.
- The effect of the above is:
- * You are not allowed to do an `UPDATE' or `DELETE' statement if you
- don't have a key constraint in the `WHERE' part. One can, however,
- force an `UPDATE/DELETE' by using `LIMIT':
- UPDATE table_name SET not_key_column=# WHERE not_key_column=# LIMIT 1;
- * All big results are automatically limited to `#select_limit#' rows.
- * `SELECT's that would probably need to examine more than
- `#max_join_size#' row combinations will be aborted.
- Some useful hints about the `mysql' client:
- Some data is much more readable when displayed vertically instead of
- in the usual horizontal box-type output. For example, longer text that
- includes newlines is often much easier to read with vertical
- output.
- mysql> select * from mails where length(txt) < 300 limit 300,1\G
- *************************** 1. row ***************************
- msg_nro: 3068
- date: 2000-03-01 23:29:50
- time_zone: +0200
- mail_from: Monty
- reply: monty@no.spam.com
- mail_to: "Thimble Smith" <tim@no.spam.com>
- sbj: UTF-8
- txt: >>>>> "Thimble" == Thimble Smith writes:
-
- Thimble> Hi. I think this is a good idea. Is anyone familiar with UTF-8
- Thimble> or Unicode? Otherwise I'll put this on my TODO list and see what
- Thimble> happens.
-
- Yes, please do that.
-
- Regards,
- Monty
- file: inbox-jani-1
- hash: 190402944
- 1 row in set (0.09 sec)
- * For logging, you can use the `tee' option. `tee' can be
- started with the option `--tee=...', or interactively from the
- command line with the command `tee'. All the data displayed on the
- screen will also be appended to a given file. This can be very
- useful for debugging purposes as well. `tee' can be disabled from
- the command line with the command `notee'. Executing `tee' again
- starts logging again. Without a parameter, the previous file will
- be used. Note that `tee' flushes the results to the file after
- each command, just before the command line appears again waiting
- for the next command.
- * Browsing or searching the results in interactive mode with UNIX
- less, more, or any other similar program is now possible with the
- option `--pager[=...]'. Without an argument, the `mysql' client
- looks for the environment variable PAGER and sets `pager' to that.
- `pager' can be started from the interactive command line with the
- command `pager' and disabled with the command `nopager'. The
- command optionally takes an argument and `pager' will be set to
- that. The command `pager' can be called without an argument, but
- this requires that the option `--pager' was used, or `pager' will
- default to stdout. `pager' works only in UNIX, since it uses the
- popen() function, which doesn't exist in Windows. In Windows, the
- `tee' option can be used instead, although it may not be as handy
- as `pager' can be in some situations.
- * A few tips about `pager': You can use it to write to a file:
- mysql> pager cat > /tmp/log.txt
- and the results will only go to a file. You can also pass any
- options for the programs that you want to use with the `pager':
- mysql> pager less -n -i -S
- From the above, do note the option '-S'. You may find it very
- useful when browsing the results; try the option with horizontal
- output (end commands with '\g' or ';') and with vertical output
- (end commands with '\G'). Sometimes a very wide result set is hard
- to read on the screen; with the -S option to less you can browse
- the results within the interactive less from left to right,
- preventing lines longer than your screen from being continued to
- the next line. This can make the result set much more readable.
- You can switch the mode on and off within the interactive less
- with '-S'. See 'h' inside less for more help about it.
- * Last (unless you already understood this from the above examples
- ;) you can combine very complex ways to handle the results, for
- example the following would send the results to two files in two
- different directories, on two different hard-disks mounted on /dr1
- and /dr2, yet let the results still be seen on the screen via less:
- mysql> pager cat | tee /dr1/tmp/res.txt | tee /dr2/tmp/res2.txt | less -n -i -S
- * You can also combine the two functions above: have `tee'
- enabled and `pager' set to 'less', and you will be able to browse
- the results in UNIX 'less' while still having everything appended
- to a file at the same time. The difference between `UNIX tee' used
- with the `pager' and the `mysql' client's built-in `tee' is that
- the built-in `tee' works even if you don't have `UNIX tee'
- available. The built-in `tee' also logs everything that is printed
- on the screen, whereas `UNIX tee' used with `pager' doesn't log
- quite that much. Last, but not least, the interactive `tee' is
- handier to switch on and off when you want to log something
- into a file but also want to be able to turn the feature off at times.
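- The `tee' and `pager' behavior above can be sketched with plain UNIX
- tools. This is an analogy only, not the client's internal code; the
- file names under /tmp are made up for the example, and `cat' stands in
- for `less' so the pipeline runs non-interactively:

```shell
# the built-in `tee' appends each chunk of output to a log
# while still showing it, much like UNIX `tee -a':
rm -f /tmp/mysql_tee_demo.log
echo 'mysql> SELECT 1;' | tee -a /tmp/mysql_tee_demo.log
echo '1 row in set'     | tee -a /tmp/mysql_tee_demo.log

# the pager pipeline from the example above, with cat instead of less:
printf 'row 1\nrow 2\n' | tee /tmp/res.txt | tee /tmp/res2.txt | cat
diff /tmp/res.txt /tmp/res2.txt && echo 'both copies are identical'
```

- Each stage of the pipeline passes the rows through unchanged, so both
- files end up with the same contents while the output still reaches the
- screen.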
- Administering a MySQL Server
- ============================
- A utility for performing administrative operations. The syntax is:
- shell> mysqladmin [OPTIONS] command [command-option] command ...
- You can get a list of the options your version of `mysqladmin' supports
- by executing `mysqladmin --help'.
- The current `mysqladmin' supports the following commands:
- create databasename   Create a new database.
- drop databasename     Delete a database and all its tables.
- extended-status       Gives an extended status message from the server.
- flush-hosts           Flush all cached hosts.
- flush-logs            Flush all logs.
- flush-tables          Flush all tables.
- flush-privileges      Reload grant tables (same as reload).
- kill id,id,...        Kill mysql threads.
- password new-password Change old password to new-password.
- ping                  Check if mysqld is alive.
- processlist           Show list of active threads in server.
- reload                Reload grant tables.
- refresh               Flush all tables and close and open logfiles.
- shutdown              Take server down.
- slave-start           Start slave replication thread.
- slave-stop            Stop slave replication thread.
- status                Gives a short status message from the server.
- variables             Prints variables available.
- version               Get version info from server.
- All commands can be shortened to their unique prefix. For example:
- shell> mysqladmin proc stat
- +----+-------+-----------+----+-------------+------+-------+------+
- | Id | User | Host | db | Command | Time | State | Info |
- +----+-------+-----------+----+-------------+------+-------+------+
- | 6 | monty | localhost | | Processlist | 0 | | |
- +----+-------+-----------+----+-------------+------+-------+------+
- Uptime: 10077 Threads: 1 Questions: 9 Slow queries: 0 Opens: 6 Flush tables: 1 Open tables: 2 Memory in use: 1092K Max memory used: 1116K
- The `mysqladmin status' command result has the following columns:
- Uptime            Number of seconds the *MySQL* server has been up.
- Threads           Number of active threads (clients).
- Questions         Number of questions from clients since `mysqld'
-                   was started.
- Slow queries      Queries that have taken more than
-                   `long_query_time' seconds. *Note Slow query log::.
- Opens             How many tables `mysqld' has opened.
- Flush tables      Number of `flush ...', `refresh', and `reload'
-                   commands.
- Open tables       Number of tables that are open now.
- Memory in use     Memory allocated directly by the mysqld code
-                   (only available when *MySQL* is compiled with
-                   --with-debug).
- Max memory used   Maximum memory allocated directly by the mysqld
-                   code (only available when *MySQL* is compiled
-                   with --with-debug).
- If you do `mysqladmin shutdown' on a socket (in other words, on the
- computer where `mysqld' is running), `mysqladmin' will wait until the
- *MySQL* `pid-file' is removed to ensure that the `mysqld' server has
- stopped properly.
- Dumping the Structure and Data from MySQL Databases and Tables
- ==============================================================
- A utility to dump a database or a collection of databases for backup
- or for transferring the data to another SQL server (not necessarily a
- MySQL server). The dump will contain SQL statements to create the
- table and/or populate the table.
- If you are doing a backup on the server, you should consider using the
- `mysqlhotcopy' instead. *Note mysqlhotcopy::.
- shell> mysqldump [OPTIONS] database [tables]
- OR mysqldump [OPTIONS] --databases [OPTIONS] DB1 [DB2 DB3...]
- OR mysqldump [OPTIONS] --all-databases [OPTIONS]
- If you don't give any tables, or if you use the `--databases' or
- `--all-databases' option, entire databases will be dumped.
- You can get a list of the options your version of `mysqldump' supports
- by executing `mysqldump --help'.
- Note that if you run `mysqldump' without `--quick' or `--opt',
- `mysqldump' will load the whole result set into memory before dumping
- the result. This will probably be a problem if you are dumping a big
- database.
- Note that if you are using a new copy of the `mysqldump' program and
- you are going to do a dump that will be read into a very old *MySQL*
- server, you should not use the `--opt' or `-e' options.
- `mysqldump' supports the following options:
- `--add-locks'
- Add `LOCK TABLES' before and `UNLOCK TABLES' after each table dump.
- (To get faster inserts into *MySQL*.)
- `--add-drop-table'
- Add a `drop table' before each create statement.
- `-A, --all-databases'
- Dump all the databases. This is the same as `--databases' with all
- databases selected.
- `-a, --all'
- Include all *MySQL*-specific create options.
- `--allow-keywords'
- Allow creation of column names that are keywords. This works by
- prefixing each column name with the table name.
- `-c, --complete-insert'
- Use complete insert statements (with column names).
- `-C, --compress'
- Compress all information between the client and the server if both
- support compression.
- `-B, --databases'
- Dump several databases. Note the difference in usage: in this
- case no tables are given; all name arguments are regarded as
- database names. `USE db_name;' will be included in the output
- before each new database.
- `--delayed'
- Insert rows with the `INSERT DELAYED' command.
- `-e, --extended-insert'
- Use the new multiline `INSERT' syntax. (Gives more compact and
- faster insert statements.)
- `-#, --debug[=option_string]'
- Trace usage of the program (for debugging).
- `--help'
- Display a help message and exit.
- `--fields-terminated-by=...'
- `--fields-enclosed-by=...'
- `--fields-optionally-enclosed-by=...'
- `--fields-escaped-by=...'
- `--lines-terminated-by=...'
- These options are used with the `-T' option and have the same
- meaning as the corresponding clauses for `LOAD DATA INFILE'.
- *Note `LOAD DATA': LOAD DATA.
- `-F, --flush-logs'
- Flush log file in the *MySQL* server before starting the dump.
- `-f, --force'
- Continue even if we get an SQL error during a table dump.
- `-h, --host=..'
- Dump data from the *MySQL* server on the named host. The default
- host is `localhost'.
- `-l, --lock-tables'
- Lock all tables before starting the dump. The tables are locked
- with `READ LOCAL' to allow concurrent inserts in the case of
- `MyISAM' tables.
- `-n, --no-create-db'
- 'CREATE DATABASE /*!32312 IF NOT EXISTS*/ db_name;' will not be
- put in the output. (That line is added otherwise, if the
- `--databases' or `--all-databases' option was given.)
- `-t, --no-create-info'
- Don't write table creation information (The `CREATE TABLE'
- statement.)
- `-d, --no-data'
- Don't write any row information for the table. This is very
- useful if you just want to get a dump of the structure for a table!
- `--opt'
- Same as `--quick --add-drop-table --add-locks --extended-insert
- --lock-tables'. Should give you the fastest possible dump for
- reading into a *MySQL* server.
- `-pyour_pass, --password[=your_pass]'
- The password to use when connecting to the server. If you specify
- no `=your_pass' part, `mysqldump' will prompt you for a
- password.
- `-P port_num, --port=port_num'
- The TCP/IP port number to use for connecting to a host. (This is
- used for connections to hosts other than `localhost', for which
- Unix sockets are used.)
- `-q, --quick'
- Don't buffer query, dump directly to stdout. Uses
- `mysql_use_result()' to do this.
- `-S /path/to/socket, --socket=/path/to/socket'
- The socket file to use when connecting to `localhost' (which is the
- default host).
- `--tables'
- Overrides the `--databases' (`-B') option.
- `-T, --tab=path-to-some-directory'
- Creates a `table_name.sql' file, containing the SQL CREATE
- commands, and a `table_name.txt' file, containing the data, for
- each given table. *NOTE*: This only works if `mysqldump' is run on
- the same machine as the `mysqld' daemon. The format of the `.txt'
- file is determined by the `--fields-xxx' and `--lines-xxx'
- options.
- `-u user_name, --user=user_name'
- The *MySQL* user name to use when connecting to the server. The
- default value is your Unix login name.
- `-O var=option, --set-variable var=option'
- Set the value of a variable. The possible variables are listed
- below.
- `-v, --verbose'
- Verbose mode. Print out more information on what the program does.
- `-V, --version'
- Print version information and exit.
- `-w, --where='where-condition''
- Dump only selected records. Note that QUOTES are mandatory:
- "--where=user='jimf'" "-wuserid>1" "-wuserid<1"
- `-O net_buffer_length=#, where # < 16M'
- When creating multi-row-insert statements (as with option
- `--extended-insert' or `--opt'), `mysqldump' will create rows up
- to `net_buffer_length' length. If you increase this variable, you
- should also ensure that the `max_allowed_packet' variable in the
- *MySQL* server is bigger than the `net_buffer_length'.
- The most common use of `mysqldump' is probably for making a backup of
- whole databases. *Note Backup::.
- mysqldump --opt database > backup-file.sql
- You can read this back into *MySQL* with:
- mysql database < backup-file.sql
- or
- mysql -e "source /path-to-backup/backup-file.sql" database
- However, it's also very useful to populate another *MySQL* server with
- information from a database:
- mysqldump --opt database | mysql --host=remote-host -C database
- It is possible to dump several databases with one command:
- mysqldump --databases database1 [database2 database3...] > my_databases.sql
- If all the databases are wanted, one can use:
- mysqldump --all-databases > all_databases.sql
- Copying MySQL Databases and Tables
- ==================================
- `mysqlhotcopy' is a Perl script that uses `LOCK TABLES', `FLUSH
- TABLES', and `cp' or `scp' to quickly make a backup of a database.
- It's the fastest way to make a backup of the database, but it can only
- be run on the same machine where the database directories are located.
- mysqlhotcopy db_name [/path/to/new_directory]
-
- mysqlhotcopy db_name_1 ... db_name_n /path/to/new_directory
-
- mysqlhotcopy db_name./regex/
- `mysqlhotcopy' supports the following options:
- `-?, --help'
- Display a helpscreen and exit
- `-u, --user=#'
- User for database login
- `-p, --password=#'
- Password to use when connecting to server
- `-P, --port=#'
- Port to use when connecting to local server
- `-S, --socket=#'
- Socket to use when connecting to local server
- `--allowold'
- Don't abort if target already exists (rename it _old)
- `--keepold'
- Don't delete previous (now renamed) target when done
- `--noindices'
- Don't include full index files in the copy, to make the backup
- smaller and faster. The indexes can later be reconstructed with
- `myisamchk -rq'.
- `--method=#'
- Method for copy (`cp' or `scp').
- `-q, --quiet'
- Be silent except for errors
- `--debug'
- Enable debug
- `-n, --dryrun'
- Report actions without doing them
- `--regexp=#'
- Copy all databases with names matching regexp
- `--suffix=#'
- Suffix for names of copied databases
- `--checkpoint=#'
- Insert checkpoint entry into specified db.table
- `--flushlog'
- Flush logs once all tables are locked.
- `--tmpdir=#'
- Temporary directory (instead of /tmp).
- You can use 'perldoc mysqlhotcopy' to get more complete documentation
- for `mysqlhotcopy'.
- `mysqlhotcopy' reads the group `[mysqlhotcopy]' from the option files.
- Importing Data from Text Files
- ==============================
- `mysqlimport' provides a command-line interface to the `LOAD DATA
- INFILE' SQL statement. Most options to `mysqlimport' correspond
- directly to the same options to `LOAD DATA INFILE'. *Note `LOAD DATA':
- LOAD DATA.
- `mysqlimport' is invoked like this:
- shell> mysqlimport [options] database textfile1 [textfile2....]
- For each text file named on the command line, `mysqlimport' strips any
- extension from the filename and uses the result to determine which
- table to import the file's contents into. For example, files named
- `patient.txt', `patient.text', and `patient' would all be imported into
- a table named `patient'.
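- The extension stripping can be sketched directly in the shell; the
- parameter expansion `${f%%.*}' removes everything from the first dot
- onward (a sketch of the idea, not `mysqlimport''s actual code):

```shell
# derive the target table name from each file name,
# the way mysqlimport maps file names to tables
for f in patient.txt patient.text patient; do
  echo "$f -> ${f%%.*}"    # every one of these maps to table "patient"
done
```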
- `mysqlimport' supports the following options:
- `-c, --columns=...'
- This option takes a comma-separated list of field names as an
- argument. The field list is passed to the `LOAD DATA INFILE' SQL
- statement that `mysqlimport' sends to the server for execution.
- For more information, please see `LOAD DATA INFILE'. *Note `LOAD
- DATA': LOAD DATA.
- `-C, --compress'
- Compress all information between the client and the server if both
- support compression.
- `-#, --debug[=option_string]'
- Trace usage of the program (for debugging).
- `-d, --delete'
- Empty the table before importing the text file.
- `--fields-terminated-by=...'
- `--fields-enclosed-by=...'
- `--fields-optionally-enclosed-by=...'
- `--fields-escaped-by=...'
- `--lines-terminated-by=...'
- These options have the same meaning as the corresponding clauses
- for `LOAD DATA INFILE'. *Note `LOAD DATA': LOAD DATA.
- `-f, --force'
- Ignore errors. For example, if a table for a text file doesn't
- exist, continue processing any remaining files. Without `--force',
- `mysqlimport' exits if a table doesn't exist.
- `--help'
- Display a help message and exit.
- `-h host_name, --host=host_name'
- Import data to the *MySQL* server on the named host. The default
- host is `localhost'.
- `-i, --ignore'
- See the description for the `--replace' option.
- `-l, --lock-tables'
- Lock *ALL* tables for writing before processing any text files.
- This ensures that all tables are synchronized on the server.
- `-L, --local'
- Read input files from the client. By default, text files are
- assumed to be on the server if you connect to `localhost' (which
- is the default host).
- `-pyour_pass, --password[=your_pass]'
- The password to use when connecting to the server. If you specify
- no `=your_pass' part, `mysqlimport' you will be prompted for a
- password.
- `-P port_num, --port=port_num'
- The TCP/IP port number to use for connecting to a host. (This is
- used for connections to hosts other than `localhost', for which
- Unix sockets are used.)
- `-r, --replace'
- The `--replace' and `--ignore' options control handling of input
- records that duplicate existing records on unique key values. If
- you specify `--replace', new rows replace existing rows that have
- the same unique key value. If you specify `--ignore', input rows
- that duplicate an existing row on a unique key value are skipped.
- If you don't specify either option, an error occurs when a
- duplicate key value is found, and the rest of the text file is
- ignored.
- `-s, --silent'
- Silent mode. Write output only when errors occur.
- `-S /path/to/socket, --socket=/path/to/socket'
- The socket file to use when connecting to `localhost' (which is the
- default host).
- `-u user_name, --user=user_name'
- The *MySQL* user name to use when connecting to the server. The
- default value is your Unix login name.
- `-v, --verbose'
- Verbose mode. Print out more information about what the program does.
- `-V, --version'
- Print version information and exit.
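- The `--replace'/`--ignore' semantics described above can be imitated
- with awk on a plain text file, treating column 1 as the unique key.
- This is an analogy only, not what `mysqlimport' does internally:

```shell
# two rows share the key "1"; which one survives depends on the option
printf '1 old\n2 only\n1 new\n' > /tmp/dup.txt

# --ignore: the first row seen for each key wins, later duplicates skipped
awk '!seen[$1]++' /tmp/dup.txt

# --replace: the last row seen for each key wins
awk '{last[$1] = $0} END {for (k in last) print last[k]}' /tmp/dup.txt | sort
```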
- Here is a sample run using `mysqlimport':
- $ mysql --version
- mysql Ver 9.33 Distrib 3.22.25, for pc-linux-gnu (i686)
- $ uname -a
- Linux xxx.com 2.2.5-15 #1 Mon Apr 19 22:21:09 EDT 1999 i586 unknown
- $ mysql -e 'CREATE TABLE imptest(id INT, n VARCHAR(30))' test
- $ ed
- a
- 100 Max Sydow
- 101 Count Dracula
- .
- w imptest.txt
- 32
- q
- $ od -c imptest.txt
- 0000000   1   0   0  \t   M   a   x       S   y   d   o   w  \n   1   0
- 0000020   1  \t   C   o   u   n   t       D   r   a   c   u   l   a  \n
- 0000040
- $ mysqlimport --local test imptest.txt
- test.imptest: Records: 2 Deleted: 0 Skipped: 0 Warnings: 0
- $ mysql -e 'SELECT * FROM imptest' test
- +------+---------------+
- | id | n |
- +------+---------------+
- | 100 | Max Sydow |
- | 101 | Count Dracula |
- +------+---------------+
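- In the run above, `ed' is only used to type a tab-separated file
- interactively; `printf' creates the same 32-byte file non-interactively
- (tab between fields and newline between rows is the default `LOAD DATA
- INFILE' format). The /tmp path is made up for the example:

```shell
# build the same two-row, tab-separated import file without an editor
printf '100\tMax Sydow\n101\tCount Dracula\n' > /tmp/imptest.txt
od -c /tmp/imptest.txt     # shows the \t and \n bytes from the listing above
wc -c < /tmp/imptest.txt   # 32, the byte count that ed reported
```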
- Converting an error code to the corresponding error message
- ===========================================================
- `perror' can be used to print error message(s). `perror' can be invoked
- like this:
- shell> perror [OPTIONS] [ERRORCODE [ERRORCODE...]]
-
- For example:
-
- shell> perror 64 79
- Error code 64: Machine is not on the network
- Error code 79: Can not access a needed shared library
- `perror' can be used to display a description for a system error code,
- or a MyISAM/ISAM table handler error code. The error messages are
- mostly system dependent.
- Showing Databases, Tables, and Columns
- ======================================
- `mysqlshow' can be used to quickly look at which databases exist, their
- tables, and the tables' columns.
- With the `mysql' program you can get the same information with the
- `SHOW' commands. *Note SHOW::.
- `mysqlshow' is invoked like this:
- shell> mysqlshow [OPTIONS] [database [table [column]]]
- * If no database is given, all matching databases are shown.
- * If no table is given, all matching tables in the database are
- shown.
- * If no column is given, all matching columns and column types in
- the table are shown.
- Note that in newer *MySQL* versions, you only see those
- databases/tables/columns for which you have some privileges.
- If the last argument contains a shell or SQL wild card (`*', `?', `%',
- or `_'), then only what's matched by the wild card is shown. This may
- cause some confusion when you try to display the columns for a table
- with a `_' in its name, because in this case `mysqlshow' only shows you
- the table names that match the pattern. This is easily fixed by adding
- an extra `%' as the last, separate argument on the command line.
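- The SQL wild cards roughly map to shell globs (`%' to `*', `_' to
- `?'), which is why a `_' in the last argument turns it into a pattern
- rather than a literal name. A small sketch of that mapping (the
- `sql_to_glob' helper is made up for illustration):

```shell
# translate SQL LIKE wild cards into shell glob wild cards
sql_to_glob() { printf '%s' "$1" | sed -e 's/%/*/g' -e 's/_/?/g'; }

pat=$(sql_to_glob 'my_table')
echo "$pat"                     # my?table
case my_table in
  $pat) echo 'my_table is matched as a pattern, not taken literally' ;;
esac
```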
- The MySQL Compressed Read-only Table Generator
- ==============================================
- `myisampack' is used to compress MyISAM tables, and `pack_isam' is used
- to compress ISAM tables. Because ISAM tables are deprecated, we will
- only discuss `myisampack' here, but everything said about `myisampack'
- should also be true for `pack_isam'.
- `myisampack' works by compressing each column in the table separately.
- The information needed to decompress columns is read into memory when
- the table is opened. This results in much better performance when
- accessing individual records, because you only have to uncompress
- exactly one record, not a much larger disk block as when using Stacker
- on MS-DOS. Usually, `myisampack' packs the data file by 40%-70%.
- *MySQL* uses memory mapping (`mmap()') on compressed tables and falls
- back to normal read/write file usage if `mmap()' doesn't work.
- There are currently two limitations with `myisampack':
- * After packing, the table is read-only.
- * `myisampack' can also pack `BLOB' or `TEXT' columns. The older
- `pack_isam' could not do this.
- Fixing these limitations is on our TODO list but with low priority.
- `myisampack' is invoked like this:
- shell> myisampack [options] filename ...
- Each filename should be the name of an index (`.MYI') file. If you are
- not in the database directory, you should specify the pathname to the
- file. It is permissible to omit the `.MYI' extension.
- `myisampack' supports the following options:
- `-b, --backup'
- Make a backup of the table as `tbl_name.OLD'.
- `-#, --debug=debug_options'
- Output debug log. The `debug_options' string often is
- `'d:t:o,filename''.
- `-f, --force'
- Force packing of the table even if it becomes bigger or if the
- temporary file exists. `myisampack' creates a temporary file
- named `tbl_name.TMD' while it compresses the table. If you kill
- `myisampack', the `.TMD' file may not be deleted. Normally,
- `myisampack' exits with an error if it finds that `tbl_name.TMD'
- exists. With `--force', `myisampack' packs the table anyway.
- `-?, --help'
- Display a help message and exit.
- `-j big_tbl_name, --join=big_tbl_name'
- Join all tables named on the command line into a single table
- `big_tbl_name'. All tables that are to be combined MUST be
- identical (same column names and types, same indexes, etc.).
- `-p #, --packlength=#'
- Specify the record length storage size, in bytes. The value
- should be 1, 2, or 3. (`myisampack' stores all rows with length
- pointers of 1, 2, or 3 bytes. In most normal cases, `myisampack'
- can determine the right length value before it begins packing the
- file, but it may notice during the packing process that it could
- have used a shorter length. In this case, `myisampack' will print
- a note that the next time you pack the same file, you could use a
- shorter record length.)
- `-s, --silent'
- Silent mode. Write output only when errors occur.
- `-t, --test'
- Don't actually pack table, just test packing it.
- `-T dir_name, --tmp_dir=dir_name'
- Use the named directory as the location in which to write the
- temporary table.
- `-v, --verbose'
- Verbose mode. Write information about progress and packing result.
- `-V, --version'
- Display version information and exit.
- `-w, --wait'
- Wait and retry if table is in use. If the `mysqld' server was
- invoked with the `--skip-locking' option, it is not a good idea to
- invoke `myisampack' if the table might be updated during the
- packing process.
- The sequence of commands shown below illustrates a typical table
- compression session:
- shell> ls -l station.*
- -rw-rw-r-- 1 monty my 994128 Apr 17 19:00 station.MYD
- -rw-rw-r-- 1 monty my 53248 Apr 17 19:00 station.MYI
- -rw-rw-r-- 1 monty my 5767 Apr 17 19:00 station.frm
-
- shell> myisamchk -dvv station
-
- MyISAM file: station
- Isam-version: 2
- Creation time: 1996-03-13 10:08:58
- Recover time: 1997-02-02 3:06:43
- Data records: 1192 Deleted blocks: 0
- Datafile: Parts: 1192 Deleted data: 0
- Datafile pointer (bytes): 2 Keyfile pointer (bytes): 2
- Max datafile length: 54657023 Max keyfile length: 33554431
- Recordlength: 834
- Record format: Fixed length
-
- table description:
- Key Start Len Index Type Root Blocksize Rec/key
- 1 2 4 unique unsigned long 1024 1024 1
- 2 32 30 multip. text 10240 1024 1
-
- Field Start Length Type
- 1 1 1
- 2 2 4
- 3 6 4
- 4 10 1
- 5 11 20
- 6 31 1
- 7 32 30
- 8 62 35
- 9 97 35
- 10 132 35
- 11 167 4
- 12 171 16
- 13 187 35
- 14 222 4
- 15 226 16
- 16 242 20
- 17 262 20
- 18 282 20
- 19 302 30
- 20 332 4
- 21 336 4
- 22 340 1
- 23 341 8
- 24 349 8
- 25 357 8
- 26 365 2
- 27 367 2
- 28 369 4
- 29 373 4
- 30 377 1
- 31 378 2
- 32 380 8
- 33 388 4
- 34 392 4
- 35 396 4
- 36 400 4
- 37 404 1
- 38 405 4
- 39 409 4
- 40 413 4
- 41 417 4
- 42 421 4
- 43 425 4
- 44 429 20
- 45 449 30
- 46 479 1
- 47 480 1
- 48 481 79
- 49 560 79
- 50 639 79
- 51 718 79
- 52 797 8
- 53 805 1
- 54 806 1
- 55 807 20
- 56 827 4
- 57 831 4
-
- shell> myisampack station.MYI
- Compressing station.MYI: (1192 records)
- - Calculating statistics
-
- normal: 20 empty-space: 16 empty-zero: 12 empty-fill: 11
- pre-space: 0 end-space: 12 table-lookups: 5 zero: 7
- Original trees: 57 After join: 17
- - Compressing file
- 87.14%
-
- shell> ls -l station.*
- -rw-rw-r-- 1 monty my 127874 Apr 17 19:00 station.MYD
- -rw-rw-r-- 1 monty my 55296 Apr 17 19:04 station.MYI
- -rw-rw-r-- 1 monty my 5767 Apr 17 19:00 station.frm
-
- shell> myisamchk -dvv station
-
- MyISAM file: station
- Isam-version: 2
- Creation time: 1996-03-13 10:08:58
- Recover time: 1997-04-17 19:04:26
- Data records: 1192 Deleted blocks: 0
- Datafile: Parts: 1192 Deleted data: 0
- Datafilepointer (bytes): 3 Keyfile pointer (bytes): 1
- Max datafile length: 16777215 Max keyfile length: 131071
- Recordlength: 834
- Record format: Compressed
-
- table description:
- Key Start Len Index Type Root Blocksize Rec/key
- 1 2 4 unique unsigned long 10240 1024 1
- 2 32 30 multip. text 54272 1024 1
-
- Field Start Length Type Huff tree Bits
- 1 1 1 constant 1 0
- 2 2 4 zerofill(1) 2 9
- 3 6 4 no zeros, zerofill(1) 2 9
- 4 10 1 3 9
- 5 11 20 table-lookup 4 0
- 6 31 1 3 9
- 7 32 30 no endspace, not_always 5 9
- 8 62 35 no endspace, not_always, no empty 6 9
- 9 97 35 no empty 7 9
- 10 132 35 no endspace, not_always, no empty 6 9
- 11 167 4 zerofill(1) 2 9
- 12 171 16 no endspace, not_always, no empty 5 9
- 13 187 35 no endspace, not_always, no empty 6 9
- 14 222 4 zerofill(1) 2 9
- 15 226 16 no endspace, not_always, no empty 5 9
- 16 242 20 no endspace, not_always 8 9
- 17 262 20 no endspace, no empty 8 9
- 18 282 20 no endspace, no empty 5 9
- 19 302 30 no endspace, no empty 6 9
- 20 332 4 always zero 2 9
- 21 336 4 always zero 2 9
- 22 340 1 3 9
- 23 341 8 table-lookup 9 0
- 24 349 8 table-lookup 10 0
- 25 357 8 always zero 2 9
- 26 365 2 2 9
- 27 367 2 no zeros, zerofill(1) 2 9
- 28 369 4 no zeros, zerofill(1) 2 9
- 29 373 4 table-lookup 11 0
- 30 377 1 3 9
- 31 378 2 no zeros, zerofill(1) 2 9
- 32 380 8 no zeros 2 9
- 33 388 4 always zero 2 9
- 34 392 4 table-lookup 12 0
- 35 396 4 no zeros, zerofill(1) 13 9
- 36 400 4 no zeros, zerofill(1) 2 9
- 37 404 1 2 9
- 38 405 4 no zeros 2 9
- 39 409 4 always zero 2 9
- 40 413 4 no zeros 2 9
- 41 417 4 always zero 2 9
- 42 421 4 no zeros 2 9
- 43 425 4 always zero 2 9
- 44 429 20 no empty 3 9
- 45 449 30 no empty 3 9
- 46 479 1 14 4
- 47 480 1 14 4
- 48 481 79 no endspace, no empty 15 9
- 49 560 79 no empty 2 9
- 50 639 79 no empty 2 9
- 51 718 79 no endspace 16 9
- 52 797 8 no empty 2 9
- 53 805 1 17 1
- 54 806 1 3 9
- 55 807 20 no empty 3 9
- 56 827 4 no zeros, zerofill(2) 2 9
- 57 831 4 no zeros, zerofill(1) 2 9
- The information printed by `myisampack' is described below:
- `normal'
- The number of columns for which no extra packing is used.
- `empty-space'
- The number of columns containing values that are only spaces;
- these will occupy 1 bit.
- `empty-zero'
- The number of columns containing values that are only binary 0's;
- these will occupy 1 bit.
- `empty-fill'
- The number of integer columns that don't occupy the full byte
- range of their type; these are changed to a smaller type (for
- example, an `INTEGER' column may be changed to `MEDIUMINT').
- `pre-space'
- The number of decimal columns that are stored with leading spaces.
- In this case, each value will contain a count for the number of
- leading spaces.
- `end-space'
- The number of columns that have a lot of trailing spaces. In this
- case, each value will contain a count for the number of trailing
- spaces.
- `table-lookup'
- The column had only a small number of different values, which were
- converted to an `ENUM' before Huffman compression.
- `zero'
- The number of columns for which all values are zero.
- `Original trees'
- The initial number of Huffman trees.
- `After join'
- The number of distinct Huffman trees left after joining trees to
- save some header space.
- After a table has been compressed, `myisamchk -dvv' prints additional
- information about each field:
- `Type'
- The field type may contain the following descriptors:
- `constant'
- All rows have the same value.
- `no endspace'
- Don't store endspace.
- `no endspace, not_always'
- Don't store endspace and don't do end space compression for
- all values.
- `no endspace, no empty'
- Don't store endspace. Don't store empty values.
- `table-lookup'
- The column was converted to an `ENUM'.
- `zerofill(n)'
- The most significant `n' bytes in the value are always 0 and
- are not stored.
- `no zeros'
- Don't store zeros.
- `always zero'
- 0 values are stored in 1 bit.
- `Huff tree'
- The Huffman tree associated with the field.
- `Bits'
- The number of bits used in the Huffman tree.
- After you have run `pack_isam'/`myisampack' you must run
- `isamchk'/`myisamchk' to re-create the index. At this time you can
- also sort the index blocks and create statistics needed for the *MySQL*
- optimizer to work more efficiently:
- myisamchk -rq --analyze --sort-index table_name.MYI
- isamchk -rq --analyze --sort-index table_name.ISM
- After you have installed the packed table into the *MySQL* database
- directory you should do `mysqladmin flush-tables' to force `mysqld' to
- start using the new table.
- Maintaining a MySQL Installation
- ********************************
- Using `myisamchk' for Table Maintenance and Crash Recovery
- ==========================================================
- Starting with *MySQL* Version 3.23.13, you can check MyISAM tables with
- the `CHECK TABLE' command. *Note CHECK TABLE::. You can repair tables
- with the `REPAIR TABLE' command. *Note REPAIR TABLE::.
- To check/repair MyISAM tables (`.MYI' and `.MYD') you should use the
- `myisamchk' utility. To check/repair ISAM tables (`.ISM' and `.ISD')
- you should use the `isamchk' utility. *Note Table types::.
- In the following text we will talk about `myisamchk', but everything
- also applies to the old `isamchk'.
- You can use the `myisamchk' utility to get information about your
- database tables, check and repair them, or optimize them. The following
- sections describe how to invoke `myisamchk' (including a description of
- its options), how to set up a table maintenance schedule, and how to
- use `myisamchk' to perform its various functions.
- You can, in most cases, also use the command `OPTIMIZE TABLE' to
- optimize and repair tables, but this is not as fast or reliable (in case
- of real fatal errors) as `myisamchk'. On the other hand, `OPTIMIZE
- TABLE' is easier to use and you don't have to worry about flushing
- tables. *Note `OPTIMIZE TABLE': OPTIMIZE TABLE.
- Even though the repair in `myisamchk' is quite safe, it's always a good
- idea to make a backup BEFORE doing a repair (or anything that could
- make a lot of changes to a table).
- `myisamchk' Invocation Syntax
- -----------------------------
- `myisamchk' is invoked like this:
- shell> myisamchk [options] tbl_name
- The `options' specify what you want `myisamchk' to do. They are
- described below. (You can also get a list of options by invoking
- `myisamchk --help'.) With no options, `myisamchk' simply checks your
- table. To get more information or to tell `myisamchk' to take
- corrective action, specify options as described below and in the
- following sections.
- `tbl_name' is the database table you want to check/repair. If you run
- `myisamchk' somewhere other than in the database directory, you must
- specify the path to the file, because `myisamchk' has no idea where your
- database is located. Actually, `myisamchk' doesn't care whether or not
- the files you are working on are located in a database directory; you
- can copy the files that correspond to a database table into another
- location and perform recovery operations on them there.
- You can name several tables on the `myisamchk' command line if you
- wish. You can also specify a name as an index file name (with the
- `.MYI' suffix), which allows you to specify all tables in a directory
- by using the pattern `*.MYI'. For example, if you are in a database
- directory, you can check all the tables in the directory like this:
- shell> myisamchk *.MYI
- If you are not in the database directory, you can check all the tables
- there by specifying the path to the directory:
- shell> myisamchk /path/to/database_dir/*.MYI
- You can even check all tables in all databases by specifying a wild card
- with the path to the *MySQL* data directory:
- shell> myisamchk /path/to/datadir/*/*.MYI
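- Since patterns like `*.MYI' are expanded by the shell, not by
- `myisamchk', the tool simply receives one argument per matching index
- file. A minimal sketch of that expansion, using made-up table names in
- a temporary directory (no MySQL installation is assumed):

```shell
# Illustrates only the shell glob expansion that feeds myisamchk;
# the directory and table names are invented for this example.
dir=$(mktemp -d)
touch "$dir/customers.MYI" "$dir/orders.MYI"
set -- "$dir"/*.MYI   # same expansion the shell performs for: myisamchk *.MYI
count=$#
echo "$count index files matched"   # prints "2 index files matched"
rm -rf "$dir"
```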
- The recommended way to quickly check all tables is:
- myisamchk --silent --fast /path/to/datadir/*/*.MYI
- isamchk --silent /path/to/datadir/*/*.ISM
- If you want to check all tables and repair all tables that are
- corrupted, you can use the following line:
- myisamchk --silent --force --fast --update-state -O key_buffer=64M -O sort_buffer=64M -O read_buffer=1M -O write_buffer=1M /path/to/datadir/*/*.MYI
- isamchk --silent --force -O key_buffer=64M -O sort_buffer=64M -O read_buffer=1M -O write_buffer=1M /path/to/datadir/*/*.ISM
- The above assumes that you have more than 64M of memory free.
- Note that if you get an error like:
- myisamchk: warning: 1 clients is using or hasn't closed the table properly
- it means that you are trying to check a table that has been updated by
- another program (such as the `mysqld' server) that hasn't yet closed
- the file or that died without closing the file properly.
- If `mysqld' is running, you must force a sync/close of all tables
- with `FLUSH TABLES' and ensure that no one is using the tables while
- you are running `myisamchk'. In *MySQL* Version 3.23 the easiest way
- to avoid this problem is to use `CHECK TABLE' instead of `myisamchk' to
- check tables.
- General Options for `myisamchk'
- ...............................
- `myisamchk' supports the following options.
- `-# or --debug=debug_options'
- Output debug log. The `debug_options' string often is
- `'d:t:o,filename''.
- `-? or --help'
- Display a help message and exit.
- `-O var=option, --set-variable var=option'
- Set the value of a variable. The possible variables and their
- default values for myisamchk can be examined with `myisamchk
- --help':
- key_buffer_size 523264
- read_buffer_size 262136
- write_buffer_size 262136
- sort_buffer_size 2097144
- sort_key_blocks 16
- decode_bits 9
- `sort_buffer_size' is used when the keys are repaired by sorting
- keys, which is the normal case when you use `--recover'.
- `key_buffer_size' is used when you are checking the table with
- `--extend-check' or when the keys are repaired by inserting keys
- row by row into the table (as when doing normal inserts).
- Repairing through the key buffer is used in the following cases:
- * If you use `--safe-recover'.
- * If you are using a `FULLTEXT' index.
- * If the temporary files needed to sort the keys would be more
- than twice as big as when creating the key file directly.
- This is often the case when you have big `CHAR', `VARCHAR' or
- `TEXT' keys as the sort needs to store the whole keys during
- sorting. If you have plenty of temporary space, you can
- force `myisamchk' to repair by sorting by using the
- `--sort-recover' option.
- Repairing through the key buffer takes much less disk space than
- using sorting, but is also much slower.
- If you want a faster repair, set the above variables to about 1/4
- of your available memory. You can set both variables to big
- values, as only one of the above buffers will be used at a time.
- `-s or --silent'
- Silent mode. Write output only when errors occur. You can use `-s'
- twice (`-ss') to make `myisamchk' very silent.
- `-v or --verbose'
- Verbose mode. Print more information. This can be used with `-d'
- and `-e'. Use `-v' multiple times (`-vv', `-vvv') for more
- verbosity!
- `-V or --version'
- Print the `myisamchk' version and exit.
- `-w or --wait'
- Instead of giving an error if the table is locked, wait until the
- table is unlocked before continuing. Note that if you are running
- `mysqld' on the table with `--skip-locking', the table can only be
- locked by another `myisamchk' command.
- Check Options for `myisamchk'
- .............................
- `-c or --check'
- Check the table for errors. This is the default operation if you
- give `myisamchk' no options that override it.
- `-e or --extend-check'
- Check the table VERY thoroughly (which is quite slow if you have
- many indexes). This option should only be used in extreme cases.
- Normally, `myisamchk' or `myisamchk --medium-check' should, in most
- cases, be able to find out if there are any errors in the table.
- If you are using `--extend-check' and have plenty of memory, you
- should increase the value of `key_buffer_size' a lot!
- `-F or --fast'
- Check only tables that haven't been closed properly.
- `-C or --check-only-changed'
- Check only tables that have changed since the last check.
- `-f or --force'
- Restart `myisamchk' with `-r' (repair) on the table, if
- `myisamchk' finds any errors in the table.
- `-i or --information'
- Print informational statistics about the table that is checked.
- `-m or --medium-check'
- Faster than extended-check, but only finds 99.99% of all errors.
- Should, however, be good enough for most cases.
- `-U or --update-state'
- Store in the `.MYI' file when the table was checked and if the
- table crashed. This should be used to get full benefit of the
- `--check-only-changed' option, but you shouldn't use this option
- if the `mysqld' server is using the table and you are running
- `mysqld' with `--skip-locking'.
- `-T or --read-only'
- Don't mark table as checked. This is useful if you use `myisamchk'
- to check a table that is in use by some other application that
- doesn't use locking (like `mysqld --skip-locking').
- Repair Options for myisamchk
- ............................
- The following options are used if you start `myisamchk' with `-r' or
- `-o':
- `-D # or --data-file-length=#'
- Max length of the data file (used when re-creating the data file
- when it's 'full').
- `-e or --extend-check'
- Try to recover every possible row from the data file. Normally
- this will also find a lot of garbage rows. Don't use this option
- if you are not totally desperate.
- `-f or --force'
- Overwrite old temporary files (`table_name.TMD') instead of
- aborting.
- `-k # or --keys-used=#'
- If you are using ISAM, tells the ISAM table handler to update only
- the first `#' indexes. If you are using `MyISAM', tells which keys
- to use, where each binary bit stands for one key (the first key is
- bit 0). This can be used to get faster inserts! Deactivated
- indexes can be reactivated by using `myisamchk -r'.
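- The `MyISAM' bit mask can be computed with ordinary shell arithmetic.
- A sketch, assuming you want to keep only the first two keys active
- (the table path below is hypothetical):

```shell
# Key 1 is bit 0, key 2 is bit 1, and so on.
mask=$(( (1 << 0) | (1 << 1) ))   # keep keys 1 and 2 -> mask value 3
echo "myisamchk --keys-used=$mask /path/to/tbl_name.MYI"   # command shown, not run
```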
- `-l or --no-symlinks'
- Do not follow symbolic links. Normally `myisamchk' repairs the
- table a symlink points at.
- `-r or --recover'
- Can fix almost anything except unique keys that aren't unique
- (which is an extremely unlikely error with ISAM/MyISAM tables).
- If you want to recover a table, this is the option to try first.
- Only if `myisamchk' reports that the table can't be recovered by
- `-r' should you then try `-o'. (Note that in the unlikely case
- that `-r' fails, the data file is still intact.) If you have lots
- of memory, you should increase the size of `sort_buffer_size'!
- `-o or --safe-recover'
- Uses an old recovery method (reads through all rows in order and
- updates all index trees based on the rows found); this is an
- order of magnitude slower than `-r', but can handle a couple of
- very unlikely cases that `-r' cannot. This recovery method also
- uses much less disk space than `-r'. Normally you should always
- first repair with `-r', and only if that fails use `-o'.
- If you have lots of memory, you should increase the size of
- `key_buffer_size'!
- `-n or --sort-recover'
- Force `myisamchk' to use sorting to resolve the keys even if the
- temporary files should be very big. This will not have any effect
- if you have fulltext keys in the table.
- `--character-sets-dir=...'
- Directory where character sets are stored.
- `--set-character-set=name'
- Change the character set used by the index.
- `-t or --tmpdir=path'
- Path for storing temporary files. If this is not set, `myisamchk'
- will use the environment variable `TMPDIR' for this.
- `-q or --quick'
- Faster repair that doesn't modify the data file. You can give a
- second `-q' to force `myisamchk' to modify the original data file
- in case of duplicate keys.
- `-u or --unpack'
- Unpack file packed with myisampack.
- Other Options for `myisamchk'
- .............................
- Other actions that `myisamchk' can do, besides repair and check tables:
- `-a or --analyze'
- Analyze the distribution of keys. This improves join performance by
- enabling the join optimizer to better choose in which order it
- should join the tables and which keys it should use. You can check
- the calculated key distribution with `myisamchk --describe
- --verbose table_name' or by using `SHOW KEYS' in *MySQL*.
- `-d or --description'
- Prints some information about the table.
- `-A or --set-auto-increment[=value]'
- Force auto_increment to start at this or a higher value. If no
- value is given, sets the next auto_increment value to the highest
- used value for the auto key, plus 1.
- `-S or --sort-index'
- Sort the index tree blocks in high-low order. This will optimize
- seeks and will make table scanning by key faster.
- `-R or --sort-records=#'
- Sorts records according to an index. This makes your data much
- more localized and may speed up ranged `SELECT' and `ORDER BY'
- operations on this index. (It may be VERY slow to do a sort the
- first time!) To find out a table's index numbers, use `SHOW
- INDEX', which shows a table's indexes in the same order that
- `myisamchk' sees them. Indexes are numbered beginning with 1.
- `myisamchk' Memory Usage
- ------------------------
- Memory allocation is important when you run `myisamchk'. `myisamchk'
- uses no more memory than you specify with the `-O' options. If you are
- going to use `myisamchk' on very large files, you should first decide
- how much memory you want it to use. The default is to use only about
- 3M to fix things. By using larger values, you can get `myisamchk' to
- operate faster. For example, if you have more than 32M RAM, you could
- use options such as these (in addition to any other options you might
- specify):
- shell> myisamchk -O sort=16M -O key=16M -O read=1M -O write=1M ...
- Using `-O sort=16M' should probably be enough for most cases.
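- As a rough upper-bound calculation for the example above (a sketch
- only; since only one of the sort/key buffers is actually in use at a
- time, real usage will be lower than this sum):

```shell
# Sum of the four example buffer sizes, in megabytes.
total_mb=$(( 16 + 16 + 1 + 1 ))
echo "at most about ${total_mb}M of buffer memory"
```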
- Be aware that `myisamchk' uses temporary files in `TMPDIR'. If `TMPDIR'
- points to a memory file system, you may easily get out of memory
- errors. If this happens, set `TMPDIR' to point at some directory with
- more space and restart `myisamchk'.
- When repairing, `myisamchk' will also need a lot of disk space:
- * Double the size of the record file (the original one and a copy).