Why Indexes Matter

Without indexes, InnoDB performs a full table scan — reading every row to find matches. Indexes create sorted B-tree structures (and other types) that let MySQL locate rows in O(log n) time.

Indexes speed reads but slow writes (each INSERT/UPDATE must update index pages). Balance index count against query patterns.

Index Types

  -- B-tree index (default)
CREATE INDEX idx_users_email ON users(email);

-- Composite index (column order matters)
CREATE INDEX idx_orders_user_date ON orders(user_id, created_at, status);

-- Unique index
CREATE UNIQUE INDEX uk_users_email ON users(email);

-- Full-text index
CREATE FULLTEXT INDEX ft_products ON products(name, description);

-- Functional index (MySQL 8.0+)
CREATE INDEX idx_lower_email ON users ((LOWER(email)));
  

Composite Index Column Order

The leftmost prefix rule: a composite index (a, b, c) supports queries filtering on a, (a, b), or (a, b, c) — but not b alone or (b, c).

Optimal order:

  1. Equality filters (=, IN) first
  2. Range filters (>, <, BETWEEN) next
  3. Sort columns (ORDER BY) last if compatible
  CREATE INDEX idx_orders ON orders(user_id, status, created_at);

-- Uses index fully
SELECT * FROM orders WHERE user_id = 5 AND status = 'completed' ORDER BY created_at DESC;

-- Uses index partially (user_id only)
SELECT * FROM orders WHERE user_id = 5 ORDER BY created_at DESC;
  

EXPLAIN and EXPLAIN ANALYZE

  EXPLAIN SELECT * FROM users WHERE email = '[email protected]'\G

EXPLAIN ANALYZE
SELECT o.id, o.total FROM orders o
WHERE o.user_id = 100 AND o.created_at > '2024-01-01';
  
Column Good Value Bad Value
type const, eq_ref, ref, range ALL, index
key index name NULL
rows Low estimate High estimate
Extra Using index Using filesort, Using temporary
  -- Using index = covering index (no table lookup needed)
-- Using filesort = sort not satisfied by index
-- Using temporary = temp table created (often GROUP BY/DISTINCT)
  

Covering Indexes

When all queried columns exist in the index, MySQL reads only the index:

  CREATE INDEX idx_cover ON orders(user_id, created_at, total, status);

SELECT created_at, total, status FROM orders
WHERE user_id = 5 AND created_at > '2024-01-01';
-- EXPLAIN Extra: Using index
  

Index Anti-Patterns

  -- Bad: function on indexed column
SELECT * FROM orders WHERE YEAR(created_at) = 2024;

-- Good: range query
SELECT * FROM orders
WHERE created_at >= '2024-01-01' AND created_at < '2025-01-01';

-- Bad: leading wildcard LIKE
SELECT * FROM users WHERE email LIKE '%@example.com';

-- Good: full-text or prefix
SELECT * FROM users WHERE email LIKE 'alice%';

-- Bad: implicit conversion
SELECT * FROM users WHERE id = '123abc';  -- string to int conversion
  

Index Maintenance

  -- Find unused indexes
SELECT * FROM sys.schema_unused_indexes WHERE object_schema = 'myapp';

-- Find redundant indexes
SELECT * FROM sys.schema_redundant_indexes WHERE table_schema = 'myapp';

-- Rebuild index (InnoDB online DDL)
ALTER TABLE orders DROP INDEX idx_old, ADD INDEX idx_new (user_id, created_at);

ANALYZE TABLE orders;
OPTIMIZE TABLE orders;  -- rebuilds table — use during maintenance windows
  

Query Rewriting Patterns

  -- OR on different columns — often no index
SELECT * FROM users WHERE email = '[email protected]' OR phone = '555-1234';

-- Rewrite as UNION
SELECT * FROM users WHERE email = '[email protected]'
UNION
SELECT * FROM users WHERE phone = '555-1234';

-- N+1 in application — fix with JOIN
SELECT u.name, o.total
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.status = 'active';
  

Slow Query Log

  [mysqld]
slow_query_log = 1
slow_query_log_file = /var/log/mysql/slow.log
long_query_time = 0.5
log_queries_not_using_indexes = 1
min_examined_row_limit = 1000
  
  pt-query-digest /var/log/mysql/slow.log --since '2024-06-01' --until '2024-06-02'
  

Best Practices

  1. Index foreign keys and columns in WHERE, JOIN, ORDER BY
  2. Keep indexes narrow — fewer columns, smaller types
  3. Review index usage quarterly — drop unused indexes
  4. Test with production-like data volumes — indexes behave differently at scale
  5. Use EXPLAIN ANALYZE before and after index changes

Common Mistakes

Mistake Impact
Index every column Write amplification, buffer pool waste
Wrong composite column order Index not used for common queries
Over-reliance on query cache (removed in 8.0) Misunderstanding caching layers
Adding index without measuring May not help; always verify with EXPLAIN

Troubleshooting

Index not used despite existing:

  • Statistics stale → ANALYZE TABLE
  • Implicit type conversion → match column types
  • Optimizer chooses full scan (small table) → FORCE INDEX only as last resort
  SELECT * FROM orders FORCE INDEX (idx_orders_user_date)
WHERE user_id = 5 ORDER BY created_at DESC;
  

Production Scenario

A checkout query scanning 2M orders timed out. EXPLAIN showed type=ALL. Adding (user_id, status, created_at) composite index reduced execution from 4.2s to 3ms. The team added an CI check that runs EXPLAIN on critical queries against a staging dataset.

Proper indexing is the highest-ROI MySQL optimization — measure first, index second, verify always.