Indexes and Query Optimization
Why Indexes Matter
Without indexes, InnoDB performs a full table scan — reading every row to find matches. Indexes create sorted B-tree structures (and other types) that let MySQL locate rows in O(log n) time.
Indexes speed reads but slow writes (each INSERT/UPDATE must update index pages). Balance index count against query patterns.
Index Types
-- B-tree index (default)
CREATE INDEX idx_users_email ON users(email);
-- Composite index (column order matters)
CREATE INDEX idx_orders_user_date ON orders(user_id, created_at, status);
-- Unique index
CREATE UNIQUE INDEX uk_users_email ON users(email);
-- Full-text index
CREATE FULLTEXT INDEX ft_products ON products(name, description);
-- Functional index (MySQL 8.0+)
CREATE INDEX idx_lower_email ON users ((LOWER(email)));
Composite Index Column Order
The leftmost prefix rule: a composite index (a, b, c) supports queries filtering on a, (a, b), or (a, b, c) — but not b alone or (b, c).
Optimal order:
- Equality filters (
=,IN) first - Range filters (
>,<,BETWEEN) next - Sort columns (
ORDER BY) last if compatible
CREATE INDEX idx_orders ON orders(user_id, status, created_at);
-- Uses index fully
SELECT * FROM orders WHERE user_id = 5 AND status = 'completed' ORDER BY created_at DESC;
-- Uses index partially (user_id only)
SELECT * FROM orders WHERE user_id = 5 ORDER BY created_at DESC;
EXPLAIN and EXPLAIN ANALYZE
EXPLAIN SELECT * FROM users WHERE email = '[email protected]'\G
EXPLAIN ANALYZE
SELECT o.id, o.total FROM orders o
WHERE o.user_id = 100 AND o.created_at > '2024-01-01';
| Column | Good Value | Bad Value |
|---|---|---|
| type | const, eq_ref, ref, range | ALL, index |
| key | index name | NULL |
| rows | Low estimate | High estimate |
| Extra | Using index | Using filesort, Using temporary |
-- Using index = covering index (no table lookup needed)
-- Using filesort = sort not satisfied by index
-- Using temporary = temp table created (often GROUP BY/DISTINCT)
Covering Indexes
When all queried columns exist in the index, MySQL reads only the index:
CREATE INDEX idx_cover ON orders(user_id, created_at, total, status);
SELECT created_at, total, status FROM orders
WHERE user_id = 5 AND created_at > '2024-01-01';
-- EXPLAIN Extra: Using index
Index Anti-Patterns
-- Bad: function on indexed column
SELECT * FROM orders WHERE YEAR(created_at) = 2024;
-- Good: range query
SELECT * FROM orders
WHERE created_at >= '2024-01-01' AND created_at < '2025-01-01';
-- Bad: leading wildcard LIKE
SELECT * FROM users WHERE email LIKE '%@example.com';
-- Good: full-text or prefix
SELECT * FROM users WHERE email LIKE 'alice%';
-- Bad: implicit conversion
SELECT * FROM users WHERE id = '123abc'; -- string to int conversion
Index Maintenance
-- Find unused indexes
SELECT * FROM sys.schema_unused_indexes WHERE object_schema = 'myapp';
-- Find redundant indexes
SELECT * FROM sys.schema_redundant_indexes WHERE table_schema = 'myapp';
-- Rebuild index (InnoDB online DDL)
ALTER TABLE orders DROP INDEX idx_old, ADD INDEX idx_new (user_id, created_at);
ANALYZE TABLE orders;
OPTIMIZE TABLE orders; -- rebuilds table — use during maintenance windows
Query Rewriting Patterns
-- OR on different columns — often no index
SELECT * FROM users WHERE email = '[email protected]' OR phone = '555-1234';
-- Rewrite as UNION
SELECT * FROM users WHERE email = '[email protected]'
UNION
SELECT * FROM users WHERE phone = '555-1234';
-- N+1 in application — fix with JOIN
SELECT u.name, o.total
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.status = 'active';
Slow Query Log
[mysqld]
slow_query_log = 1
slow_query_log_file = /var/log/mysql/slow.log
long_query_time = 0.5
log_queries_not_using_indexes = 1
min_examined_row_limit = 1000
pt-query-digest /var/log/mysql/slow.log --since '2024-06-01' --until '2024-06-02'
Best Practices
- Index foreign keys and columns in WHERE, JOIN, ORDER BY
- Keep indexes narrow — fewer columns, smaller types
- Review index usage quarterly — drop unused indexes
- Test with production-like data volumes — indexes behave differently at scale
- Use
EXPLAIN ANALYZEbefore and after index changes
Common Mistakes
| Mistake | Impact |
|---|---|
| Index every column | Write amplification, buffer pool waste |
| Wrong composite column order | Index not used for common queries |
| Over-reliance on query cache (removed in 8.0) | Misunderstanding caching layers |
| Adding index without measuring | May not help; always verify with EXPLAIN |
Troubleshooting
Index not used despite existing:
- Statistics stale →
ANALYZE TABLE - Implicit type conversion → match column types
- Optimizer chooses full scan (small table) →
FORCE INDEXonly as last resort
SELECT * FROM orders FORCE INDEX (idx_orders_user_date)
WHERE user_id = 5 ORDER BY created_at DESC;
Production Scenario
A checkout query scanning 2M orders timed out. EXPLAIN showed type=ALL. Adding (user_id, status, created_at) composite index reduced execution from 4.2s to 3ms. The team added an CI check that runs EXPLAIN on critical queries against a staging dataset.
Proper indexing is the highest-ROI MySQL optimization — measure first, index second, verify always.