7. Column Index Type Definitions
• Column Index
– An index on a single column
• Compound Index
– An index on multiple columns
• Covering Index
– “Covers” all columns in a query
• Partial Index
– Indexes only part of a column (a prefix) rather than the full value
• E.g., only the first 10 characters of a person’s name
8. Compound Index
• CREATE TABLE test (
id INT NOT NULL,
last_name CHAR(30) NOT NULL,
first_name CHAR(30) NOT NULL,
PRIMARY KEY (id),
INDEX name(last_name,first_name) );
• The name index is an index over the
last_name and first_name columns
9. What does that mean?
• Queries like:
– SELECT * FROM test WHERE first_name='Aris' AND
last_name='Zakinthinos';
– SELECT * FROM test WHERE last_name='York';
– Will use the index
• But a query like:
– SELECT * FROM test WHERE first_name='Zak';
– Will not
10. How Compound Indexes are Used
• If you have an index on (col1, col2,
col3)
• This Index will be used on queries for (col1),
(col1, col2) and (col1, col2, col3)
– Notice that the leftmost prefix must exist
• Only the col1 part of the index will be used for
queries for (col1, col3)
• This Index will not be used for queries for
(col2), (col3) and (col2, col3)
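The leftmost-prefix rule above can be sketched as a small Python function. This is only a toy model of the rule for equality conditions, not MySQL's optimizer; the function name usable_prefix is made up for illustration.

```python
def usable_prefix(index_columns, filtered_columns):
    """Return the leading index columns that equality filters can use.

    Toy model of the leftmost-prefix rule: walk the index columns in
    order and stop at the first one the query does not filter on.
    """
    prefix = []
    for column in index_columns:
        if column not in filtered_columns:
            break  # the first gap ends the usable prefix
        prefix.append(column)
    return tuple(prefix)


index = ("col1", "col2", "col3")
print(usable_prefix(index, {"col1", "col2"}))  # ('col1', 'col2')
print(usable_prefix(index, {"col1", "col3"}))  # ('col1',)
print(usable_prefix(index, {"col2", "col3"}))  # ()
```

An empty result corresponds to the cases where the index is not used at all.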
11. One Thing to Keep in Mind
• Remember that the index stores things in
sorted order so an n-field compound index is
the equivalent of sorting the data on n fields
• For example, for a 2 column index:
COL1 COL2
A 4
Z 3
A 5
Z 1
Index
(A,4)
(A,5)
(Z,1)
(Z,3)
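The same four rows can be modelled in Python to show why the sort order matters. This is a rough sketch that uses a sorted list in place of MySQL's B-tree; the helper names lookup_col1 and lookup_col2 are assumptions made for the example.

```python
from bisect import bisect_left, bisect_right

rows = [("A", 4), ("Z", 3), ("A", 5), ("Z", 1)]
index = sorted(rows)  # ordered by COL1 first, then by COL2
print(index)  # [('A', 4), ('A', 5), ('Z', 1), ('Z', 3)]


def lookup_col1(index, value):
    """Entries with COL1 == value, found by binary search on the first field."""
    keys = [entry[0] for entry in index]
    return index[bisect_left(keys, value):bisect_right(keys, value)]


def lookup_col2(index, value):
    """COL2 alone has no useful ordering, so every entry must be checked."""
    return [entry for entry in index if entry[1] == value]


print(lookup_col1(index, "Z"))  # [('Z', 1), ('Z', 3)]
print(lookup_col2(index, 3))    # [('Z', 3)]
```

Both lookups return correct rows, but only the first can skip straight to the matching range; the second is the toy equivalent of scanning the whole index.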
12. Pro Tip
• InnoDB always adds the primary key columns to the
end of every secondary index
– You never have to add them to the end of your
compound key
13. Can you have too many indexes?
• YES!
• They take up space
– You want all your indexes to fit in memory
• They make inserts/deletes slower
– Remember that you need to insert/delete
into/from each index
16. What does that mean?
• How many tables are used
• How tables are joined
• How data is looked-up
• Possible and actual index use
• Length of index used
• Approximate number of rows examined
17. Why should I care?
• Ultimately, less server load leads to a better
user experience
– Could be the difference between usable and
bankrupt
• Impress your friends
• Get better jobs
18. Which queries should I examine?
• Every single one!
• If you have never used EXPLAIN:
– Start by looking at the items in the slow query log
– Or, execute SHOW FULL PROCESSLIST every
once in a while and grab a query that you see very
often
19. Pro Tip
• Using EXPLAIN during QA is better than in
production
• It avoids you having to say:
– “It’s not a problem—we call it the coffee break
feature.”
20. Can I run it on any query?
• Before MySQL 5.6, it only worked on SELECT
queries
• As of MySQL 5.6.3, it also works on DELETE,
INSERT, REPLACE and UPDATE
21. Anything else I should know?
• It doesn’t execute your query but MAY
execute portions of it. CAUTION!
– Nested subqueries are executed
23. Test Database
• We will be using MySQL’s sakila test
database, available at:
http://dev.MySQL.com/doc/index-other.html
• It is a sample database for a DVD rental store
24. How do you use Explain?
• To get MySQL’s execution plan, simply put the
word ‘EXPLAIN’ in front of your select
statement:
EXPLAIN SELECT * FROM film f
JOIN film_actor fa ON fa.film_id = f.film_id
JOIN actor a ON a.actor_id = fa.actor_id;
28. Column Definitions – id
A sequential ID identifying the SELECT within the query.
29. Column Definitions – select_type
The type of SELECT you are doing. Common values:
• SIMPLE – Simple SELECT (not using UNION or subqueries)
• PRIMARY – Outermost SELECT
• UNION – Second or later SELECT in a UNION
• SUBQUERY – First SELECT in a subquery
• DEPENDENT SUBQUERY – First SELECT in a subquery, dependent on the outer query
• DERIVED – Derived table SELECT (subquery in FROM clause)
30. Column Definitions – table
The table or alias this row refers to.
31. Column Definitions – type
The join type. Lots more to come on this.
32. Column Definitions – possible_keys
The possible indexes that could be used.
If NULL, then no appropriate index was found.
33. Column Definitions – key
The index that was actually used.
34. Column Definitions – key_len
The number of bytes MySQL uses from the index.
35. Column Definitions – ref
The columns (or constants) that are compared against the index that is used.
36. Column Definitions – rows
The approximate number of rows examined.
Note: this is only an estimate.
37. Column Definitions – Extra
Additional information on how the tables are joined and the query is resolved.
Lots more to come on this!
39. Join Type – system/const
• At most one row returned
– E.g., … WHERE id=1
– Constant can be propagated through query
– Index must be either PRIMARY or UNIQUE
EXPLAIN SELECT * FROM rental WHERE rental_id = 10;
40. Join Type – eq_ref
• Index lookup returns exactly one row
– E.g., WHERE a.id=b.id
– Requires unique index on all parts of the key used
by the join
EXPLAIN SELECT * FROM customer c
JOIN address a ON c.address_id = a.address_id;
41. Join Type – ref
• Index lookup that can return more than one
row
– Used when
• Either only a leftmost prefix of the key is used
• Or the key is not PRIMARY or UNIQUE
EXPLAIN SELECT * FROM rental WHERE rental_id IN (10,11,12)
AND rental_date = '2006-02-01';
42. Join Type – ref_or_null
• Similar to ref but allows for null values or null
conditions
– Essentially an extra pass to look for nulls
EXPLAIN SELECT * FROM film WHERE
release_year = 2006 OR release_year IS NULL;
43. Join Type – index_merge
• Uses two or more separate indexes on the same table
– The Extra column shows which algorithm was used
– Can be one of:
• sort_union – OR of range conditions where the union algorithm
does not apply
• union – OR of equality conditions covering all parts of each index,
or range conditions on the primary key
• intersection – AND of equality conditions covering all parts of each
index, or range conditions on the primary key
EXPLAIN SELECT * FROM rental WHERE rental_id IN (10,11,12)
OR rental_date = '2006-02-01';
44. Join Type – range
• Access method for a range condition in the WHERE
clause (<, <=, >, >=, LIKE, BETWEEN, IN())
– Performs a partial index scan
– Lots of optimizations for this type of query
EXPLAIN SELECT * FROM rental WHERE rental_date
BETWEEN '2006-01-01' AND '2006-07-01';
45. Join Type – index
• Does an index scan
– You are doing a full scan of every record in the
index
– Better than ‘ALL’ but still requires a LOT of
resources
– Note: This is not the same as ‘Using index’ in the
Extra column
EXPLAIN SELECT rental_date FROM rental;
46. Join Type – ALL
• Full table scan
– It will look at every record in the table
– Unless you want the whole table, it should be
avoided for all but the smallest of tables
EXPLAIN SELECT * FROM rental;
47. Join Type Summary
• From best to worst
– system/const
– eq_ref
– ref
– ref_or_null
– index_merge
– range
– index
– ALL
49. A More Complicated Query
• All identical id values are part of the same select
• This query has a bunch of UNIONs
• You can also see all of the join types used by this Query
51. Execution Order
• First thing to notice is that the order of execution is not the same as the table order in the query
• MySQL will rearrange your query to do what it thinks is optimal
• You will often disagree!
EXPLAIN SELECT * FROM film f
JOIN film_actor fa ON fa.film_id = f.film_id
JOIN actor a ON a.actor_id = fa.actor_id;
52. Query Processing
The above can be read as:
for (each row in table a [actor]) {
  for (each row in fa [film_actor] matching a.actor_id) {
    for (the row in f [film] matching fa.film_id) {
      Add row to output
    }
  }
}
EXPLAIN SELECT * FROM film f
JOIN film_actor fa ON fa.film_id = f.film_id
JOIN actor a ON a.actor_id = fa.actor_id;
53. Query Processing In General
for (each row in outer table matching where clause){
…
for (each row in inner table matching key value
and where clause) {
Add to output
}
}
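The nested loops above can be sketched in runnable Python. This is a simplified model, assuming tiny made-up tables and using dictionaries to stand in for index lookups; the sample rows and the function name nested_loop_join are invented for the example.

```python
from collections import defaultdict

# Made-up sample rows shaped like actor, film_actor and film.
actors = [{"actor_id": 1, "name": "Ann"}, {"actor_id": 2, "name": "Bob"}]
film_actor = [
    {"actor_id": 1, "film_id": 10},
    {"actor_id": 1, "film_id": 11},
    {"actor_id": 2, "film_id": 10},
]
films = [{"film_id": 10, "title": "Film X"}, {"film_id": 11, "title": "Film Y"}]


def nested_loop_join(actors, film_actor, films):
    # Stand-in for a non-unique index on film_actor.actor_id (a ref lookup).
    fa_by_actor = defaultdict(list)
    for fa in film_actor:
        fa_by_actor[fa["actor_id"]].append(fa)
    # Stand-in for the primary key on film.film_id (an eq_ref lookup).
    film_by_id = {f["film_id"]: f for f in films}

    output = []
    for a in actors:                                  # outer table
        for fa in fa_by_actor.get(a["actor_id"], []): # rows matching a.actor_id
            f = film_by_id[fa["film_id"]]             # the one row matching fa.film_id
            output.append((a["name"], f["title"]))
    return output


print(nested_loop_join(actors, film_actor, films))
# [('Ann', 'Film X'), ('Ann', 'Film Y'), ('Bob', 'Film X')]
```

The work done is driven by how many rows each level of the loop visits, which is why the rows column matters so much.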
54. rows is a critical performance indicator
• Total approximate rows is the PRODUCT of all
the rows with the same ID
• It is only an estimate
• This query estimated a total of 68,640 rows
• Actual rows examined 2,969,639
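The product rule can be illustrated with a few lines of Python. The plan rows below are hypothetical numbers chosen for the example, not the output of the query discussed on this slide; the dictionary keys follow the EXPLAIN column names id, table and rows.

```python
from math import prod


def estimated_rows(plan_rows):
    """Multiply the rows estimates of all plan rows that share the same id."""
    per_select = {}
    for row in plan_rows:
        per_select.setdefault(row["id"], []).append(row["rows"])
    return {select_id: prod(values) for select_id, values in per_select.items()}


# Hypothetical EXPLAIN output for a three-table join.
plan = [
    {"id": 1, "table": "a", "rows": 200},
    {"id": 1, "table": "fa", "rows": 13},
    {"id": 1, "table": "f", "rows": 1},
]
print(estimated_rows(plan))  # {1: 2600}
```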
55. Remember: it is an estimate
Remember this query: Here is the rest:
This query executes in less than
200 ms because there are LIMIT
clauses that only return a small
number of rows.
56. key_len
• This will tell you how much of the index it will use
– Really only useful if you have compound keys
• For a complete list of type to byte mapping, see:
– http://dev.mysql.com/doc/refman/5.6/en/storage-requirements.html
• What’s missing from that page:
– VARCHAR(n) always adds a 2-byte length field in the index
– If a column can be NULL it takes one extra byte in the
index
• Warning! Multibyte characters make byte != character
– utf8 takes up to 3 bytes per character, and key_len counts the maximum
57. key_len example
EXPLAIN SELECT film_id, title FROM film WHERE description LIKE
'A Epic%' AND release_year=2006 AND language_id = 1;
CREATE TABLE `film` (
...
`description` text,
`release_year` year(4) DEFAULT NULL,
`language_id` tinyint(3) unsigned NOT NULL,
...
KEY `idx_compound_key` (`language_id`,`release_year`,`description`(20))
) ENGINE=InnoDB AUTO_INCREMENT=1001 DEFAULT CHARSET=utf8;
• language_id (tinyint) – 1 byte
• release_year (year) – 1 byte + 1 byte for possible NULL
• description(20) – 20*3 bytes + 2 byte length + 1 byte for possible NULL
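The byte counts above can be added up in a short Python sketch. It only applies the rules stated on the key_len slide (2 bytes for a variable-length field, 1 byte if the column can be NULL, 3 bytes per utf8 character); the helper name key_part_len is an assumption made for the example.

```python
UTF8_MAX_BYTES = 3  # maximum bytes per character for MySQL's utf8 charset


def key_part_len(data_bytes, nullable=False, variable_length=False):
    """Bytes one index part contributes to key_len under the rules above."""
    return data_bytes + (2 if variable_length else 0) + (1 if nullable else 0)


language_id = key_part_len(1)                     # tinyint NOT NULL
release_year = key_part_len(1, nullable=True)     # year, may be NULL
description = key_part_len(20 * UTF8_MAX_BYTES,   # first 20 characters of a text column
                           nullable=True, variable_length=True)

print(language_id)                               # 1
print(language_id + release_year)                # 3
print(language_id + release_year + description)  # 66
```

Under these assumptions, a key_len of 1 would mean only language_id is used, 3 would mean the first two parts are used, and 66 would mean all three parts are used.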
59. Pay Close Attention
• This column will give you a good sense of
what is going to happen during execution
60. Extra Column – The Bad
• Using temporary
– During execution, a temporary table was required
– Not horrible if the temporary table is small since it
will sit in memory
61. Extra Column – The Bad
• Using filesort
– Sorting was needed rather than using an index
– Not horrible if the result set is small
62. Extra Column – The Good
• Using index
– The query was satisfied using only an index
63. Extra Column – The Whatever
• Using where
– A WHERE clause is used to restrict rows
– Unless you also see ‘Using index’, this means that
MySQL had to read the full row from the table to
apply the WHERE clause
65. Signs That Your Query Stinks
• No index is used – NULL in key column
• Large number of rows estimated
• Using temporary
• Using filesort
• Using derived tables – DERIVED in
select_type column
• Joining on derived tables
• Having dependent subqueries on a large result
set
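A checklist like the one above is easy to automate once the EXPLAIN rows are available as dictionaries. The following Python sketch assumes each row is keyed by the EXPLAIN column names (select_type, key, rows, Extra); the function name warning_signs and the max_rows threshold are arbitrary choices for illustration, and the sample rows are made up.

```python
def warning_signs(plan_row, max_rows=10_000):
    """Return the warning signs found in one EXPLAIN row (a dict)."""
    extra = plan_row.get("Extra") or ""
    signs = []
    if plan_row.get("key") is None:
        signs.append("no index used")
    if (plan_row.get("rows") or 0) > max_rows:
        signs.append("large row estimate")
    for marker in ("Using temporary", "Using filesort"):
        if marker in extra:
            signs.append(marker)
    if plan_row.get("select_type") in ("DERIVED", "DEPENDENT SUBQUERY"):
        signs.append(plan_row["select_type"])
    return signs


# Made-up row resembling a full table scan with grouping.
bad_row = {"select_type": "SIMPLE", "table": "film", "type": "ALL",
           "key": None, "rows": 1000,
           "Extra": "Using where; Using temporary; Using filesort"}
print(warning_signs(bad_row))
# ['no index used', 'Using temporary', 'Using filesort']
```

A flagged row is a prompt to look closer, not proof of a problem; a small table with no index can be perfectly fine.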
66. Examples:
EXPLAIN SELECT * FROM film f WHERE release_year = 2006;
ALTER TABLE `sakila`.`film`
ADD INDEX `idx_release_year` (`release_year` ASC) ;
67. Examples:
EXPLAIN SELECT rating, COUNT(*) FROM film
WHERE rental_rate <1.00
GROUP BY rating;
• This is a terrible query.
• A full table scan that creates a temporary table and then sorts it.
• Why?
68. Grouping
• To execute this query, MySQL will build a
temporary table with all the rows that have
rental_rate <1.00
• It will then sort them by rating
• It will then count all of the items that have the
same rating
• To make this query fast you need all of the
ratings to be processed in order.
– That is, you have to avoid the build and sort step
69. Adding an Index
ALTER TABLE `sakila`.`film` ADD INDEX `idx_rating`
(`rating` ASC, `rental_rate` ASC) ;
EXPLAIN SELECT rating, COUNT(*) FROM film
WHERE rental_rate <1.00
GROUP BY rating;
• This works because MySQL can process all the rows with the same ‘rating’ sequentially.
• The second part of the index allows it to check the where clause from the index directly.
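The difference between the two execution strategies can be mimicked in Python. This is a toy comparison with made-up (rating, rental_rate) pairs; a sorted list stands in for the (rating, rental_rate) index, and the function names are invented for the example.

```python
from itertools import groupby

# Made-up (rating, rental_rate) pairs in arbitrary table order.
films = [("PG", 0.99), ("G", 4.99), ("PG", 0.99), ("G", 0.99), ("R", 2.99)]


def count_with_temp_table(rows):
    """No suitable index: filter, collect the matches, sort, then count."""
    temp = [rating for rating, rate in rows if rate < 1.00]  # "Using temporary"
    temp.sort()                                              # "Using filesort"
    return [(rating, len(list(group))) for rating, group in groupby(temp)]


def count_with_index(index_entries):
    """Index on (rating, rental_rate): entries already arrive grouped by rating."""
    result = []
    for rating, group in groupby(index_entries, key=lambda entry: entry[0]):
        matches = sum(1 for _, rate in group if rate < 1.00)  # WHERE checked from the index
        if matches:
            result.append((rating, matches))
    return result


index = sorted(films)  # stand-in for the (rating, rental_rate) index
print(count_with_temp_table(films))  # [('G', 1), ('PG', 2)]
print(count_with_index(index))       # [('G', 1), ('PG', 2)]
```

Both functions return the same counts, but the second one never builds or sorts an intermediate list; it streams over entries that are already in rating order.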
70. Optimizing GROUP BY
• All columns used in the GROUP BY must
come from the same index and the index must
store the keys in the order specified in the
GROUP BY
• If this isn’t true you will see a “Using
filesort” and/or “Using temporary”
71. Optimizing ORDER BY
• An index can be used with ORDER BY even
if the index doesn’t match the ORDER BY
columns exactly as long as the “missing” keys
are constants in the WHERE clause
–The order of the keys must match the order
of the ORDER BY clause
• If this isn’t true you will see a “Using
filesort” and/or “Using temporary”
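The "missing keys are constants" rule can be checked on a toy index in Python. The sketch below assumes a hypothetical three-column index on (a, b, c) with made-up values, again using a sorted list in place of the real index.

```python
def is_sorted(values):
    """True if the values are in non-decreasing order."""
    return all(x <= y for x, y in zip(values, values[1:]))


# Hypothetical index on (a, b, c); sorted() gives the index order.
index = sorted([(1, 2006, "B"), (1, 2005, "Z"), (2, 2004, "C"), (1, 2006, "A")])

# WHERE a = 1: the leading column is a constant.
matching = [entry for entry in index if entry[0] == 1]

# ORDER BY b can read the index as is: b is in order within a = 1.
print(is_sorted([b for _, b, _ in matching]))  # True
# ORDER BY c skips b, so the index order does not help.
print(is_sorted([c for _, _, c in matching]))  # False
# ORDER BY b without the constant on a: b is not in order across the whole index.
print(is_sorted([b for _, b, _ in index]))     # False
```

The two False cases are the situations in which a sort step would be needed.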
72. Both GROUP BY and ORDER BY
EXPLAIN SELECT rating, COUNT(*) AS c FROM film
WHERE rental_rate <1.00
GROUP BY rating
ORDER BY c;
• There is no way to avoid the ‘Using temporary’ and the ‘Using filesort’
• You are asking MySQL to first sort by rating to process the GROUP BY, which it does using
an index.
• Then you are asking it to take that result and re-sort it by a computed field.
• An ORDER BY column list that is in a different order than the GROUP BY list will cause
the same problem.
73. What if I can’t make it better?
• Never give up!
– There is always a solution. You might not want to
do it, but there is always a solution.
• You might need to:
– Denormalize your data to allow you to construct a
better compound index
– Cache outside the database
– Pull some of the processing into your application
– Break up the query into smaller faster chunks
75. Test with Production Data
• Do not optimize with a subset of your data
• MySQL uses table statistics to determine its
execution plan
• If you optimize with test data your real world
performance might be completely different
76. PRACTICE!
• Like everything, you need to do it, to fully
understand it
• Don’t expect to be a MySQL ninja overnight
• It takes hard work but the benefits are worth
it
– 10 second searches optimized to 100 milliseconds