Table Partitioning
Why Partition?
Partitioning splits one logical table into multiple physical segments. MySQL can prune partitions — skip entire segments — when queries filter on the partition key. This improves query speed, simplifies archival (drop old partitions), and helps manage tables that exceed billions of rows.
Partitioning is not a substitute for indexes. It helps when queries consistently filter on the partition column (usually date or region).
Partition Types
| Type | Partition Key | Use Case |
|---|---|---|
| RANGE | Numeric or date ranges | Time-series logs, monthly data |
| LIST | Explicit value lists | Regional data (US, EU, APAC) |
| HASH | Hash of column | Even distribution without range queries |
| KEY | MySQL internal hash | Similar to HASH, supports multi-column |
RANGE Partitioning by Date
CREATE TABLE access_logs (
id BIGINT NOT NULL AUTO_INCREMENT,
logged_at DATETIME NOT NULL,
user_id INT,
path VARCHAR(512),
status_code SMALLINT,
PRIMARY KEY (id, logged_at)
)
PARTITION BY RANGE (TO_DAYS(logged_at)) (
PARTITION p202401 VALUES LESS THAN (TO_DAYS('2024-02-01')),
PARTITION p202402 VALUES LESS THAN (TO_DAYS('2024-03-01')),
PARTITION p202403 VALUES LESS THAN (TO_DAYS('2024-04-01')),
PARTITION p_future VALUES LESS THAN MAXVALUE
);
Critical rule: every unique index (including PRIMARY KEY) must include all columns in the partition expression.
LIST Partitioning
CREATE TABLE sales (
id INT NOT NULL,
region ENUM('US', 'EU', 'APAC') NOT NULL,
amount DECIMAL(10,2),
PRIMARY KEY (id, region)
)
PARTITION BY LIST COLUMNS(region) (
PARTITION p_us VALUES IN ('US'),
PARTITION p_eu VALUES IN ('EU'),
PARTITION p_apac VALUES IN ('APAC')
);
HASH Partitioning
CREATE TABLE sessions (
session_id CHAR(64) NOT NULL,
user_id INT,
data JSON,
PRIMARY KEY (session_id)
)
PARTITION BY HASH(session_id)
PARTITIONS 8;
HASH spreads rows evenly but does not support efficient range queries on session_id. Use when access is always by exact key.
Partition Pruning
EXPLAIN SELECT * FROM access_logs
WHERE logged_at >= '2024-02-01' AND logged_at < '2024-03-01';
Look for partitions: p202402 in EXPLAIN — only one partition is scanned. Without a filter on logged_at, all partitions are scanned (same cost as an unpartitioned table).
-- Bad: function on partition column prevents pruning
SELECT * FROM access_logs WHERE YEAR(logged_at) = 2024;
-- Good: range on partition column
SELECT * FROM access_logs
WHERE logged_at >= '2024-01-01' AND logged_at < '2025-01-01';
Adding and Dropping Partitions
-- Add next month
ALTER TABLE access_logs REORGANIZE PARTITION p_future INTO (
PARTITION p202404 VALUES LESS THAN (TO_DAYS('2024-05-01')),
PARTITION p_future VALUES LESS THAN MAXVALUE
);
-- Drop old data instantly (no DELETE scan)
ALTER TABLE access_logs DROP PARTITION p202401;
Dropping a partition is metadata-only — millions of rows disappear in seconds. This is the primary operational win for log retention.
Subpartitioning
CREATE TABLE events (
id BIGINT NOT NULL,
event_date DATE NOT NULL,
region VARCHAR(10) NOT NULL,
payload JSON,
PRIMARY KEY (id, event_date, region)
)
PARTITION BY RANGE (TO_DAYS(event_date))
SUBPARTITION BY KEY(region)
SUBPARTITIONS 4 (
PARTITION p2024q1 VALUES LESS THAN (TO_DAYS('2024-04-01')),
PARTITION p2024q2 VALUES LESS THAN (TO_DAYS('2024-07-01'))
);
Subpartitioning adds complexity — use only when both time-range pruning and regional distribution are required.
Migrating Existing Tables
MySQL cannot partition an existing table in place with a simple ALTER. Common approaches:
-- Option 1: ALTER with full rebuild (locks table)
ALTER TABLE logs PARTITION BY RANGE (TO_DAYS(created_at)) (...);
-- Option 2: Online migration via new table
CREATE TABLE logs_new (...) PARTITION BY RANGE (...);
INSERT INTO logs_new SELECT * FROM logs;
RENAME TABLE logs TO logs_old, logs_new TO logs;
DROP TABLE logs_old;
For zero-downtime migrations on large tables, use pt-online-schema-change (Percona Toolkit) or native ALGORITHM=INPLACE when supported.
Monitoring Partitions
SELECT
TABLE_NAME,
PARTITION_NAME,
TABLE_ROWS,
DATA_LENGTH,
INDEX_LENGTH
FROM information_schema.PARTITIONS
WHERE TABLE_SCHEMA = 'myapp' AND TABLE_NAME = 'access_logs'
ORDER BY PARTITION_ORDINAL_POSITION;
Watch for uneven row counts in HASH partitions (hot spots) or empty future partitions filling unexpectedly fast.
Best Practices
- Choose the partition key based on how queries filter, not just how data is inserted
- Plan partition count — too many partitions increase metadata overhead (keep under ~1000)
- Automate monthly partition creation with a cron job or event scheduler
- Include the partition column in covering indexes for common queries
- Test
EXPLAINon every critical query to confirm pruning
Common Mistakes
| Mistake | Impact |
|---|---|
| PRIMARY KEY missing partition column | CREATE TABLE fails |
| Partitioning without query filters on key | No benefit, added complexity |
| Using MAXVALUE only partition | All new data hits one partition — rebalance regularly |
| DELETE instead of DROP for retention | Hours of locks and binlog bloat |
Troubleshooting
“A PRIMARY KEY must include all columns in the table’s partitioning function” — redesign the primary key to include the partition column, or use a composite PK.
Queries still slow after partitioning — run EXPLAIN; if partitions lists all partitions, the WHERE clause does not match the partition key.
REORGANIZE fails — ensure no rows violate new boundary values before reorganizing.
Production Scenario: Audit Log Retention
A fintech platform stores seven years of audit logs in monthly RANGE partitions:
# Cron job — first day of each month
mysql -e "
ALTER TABLE audit_logs REORGANIZE PARTITION p_future INTO (
PARTITION p$(date +%Y%m) VALUES LESS THAN (TO_DAYS('$(date -d '+1 month' +%Y-%m-01)')),
PARTITION p_future VALUES LESS THAN MAXVALUE
);
ALTER TABLE audit_logs DROP PARTITION p$(date -d '-7 years' +%Y%m);
"
Compliance requires seven-year retention. Dropping monthly partitions replaces DELETE FROM audit_logs WHERE created_at < ... that previously locked the table for hours.
Performance Tips
- Smaller partitions mean smaller indexes — faster maintenance and backups per segment
- Combine partitioning with archival: detach old partitions to cold storage tables
- Avoid foreign keys on partitioned tables — MySQL restrictions make them impractical
- Use
innodb_file_per_table = ONso each partition file can be managed independently
Partitioning shines for time-series and retention workloads. For general-purpose OLTP tables under 50 million rows, proper indexing usually suffices.