Why Partition?

Partitioning splits one logical table into multiple physical segments. MySQL can prune partitions — skip entire segments — when queries filter on the partition key. This improves query speed, simplifies archival (drop old partitions), and helps manage tables that exceed billions of rows.

Partitioning is not a substitute for indexes. It helps when queries consistently filter on the partition column (usually date or region).

Partition Types

Type Partition Key Use Case
RANGE Numeric or date ranges Time-series logs, monthly data
LIST Explicit value lists Regional data (US, EU, APAC)
HASH Hash of column Even distribution without range queries
KEY MySQL internal hash Similar to HASH, supports multi-column

RANGE Partitioning by Date

  CREATE TABLE access_logs (
    id BIGINT NOT NULL AUTO_INCREMENT,
    logged_at DATETIME NOT NULL,
    user_id INT,
    path VARCHAR(512),
    status_code SMALLINT,
    PRIMARY KEY (id, logged_at)
)
PARTITION BY RANGE (TO_DAYS(logged_at)) (
    PARTITION p202401 VALUES LESS THAN (TO_DAYS('2024-02-01')),
    PARTITION p202402 VALUES LESS THAN (TO_DAYS('2024-03-01')),
    PARTITION p202403 VALUES LESS THAN (TO_DAYS('2024-04-01')),
    PARTITION p_future VALUES LESS THAN MAXVALUE
);
  

Critical rule: every unique index (including PRIMARY KEY) must include all columns in the partition expression.

LIST Partitioning

  CREATE TABLE sales (
    id INT NOT NULL,
    region ENUM('US', 'EU', 'APAC') NOT NULL,
    amount DECIMAL(10,2),
    PRIMARY KEY (id, region)
)
PARTITION BY LIST COLUMNS(region) (
    PARTITION p_us VALUES IN ('US'),
    PARTITION p_eu VALUES IN ('EU'),
    PARTITION p_apac VALUES IN ('APAC')
);
  

HASH Partitioning

  CREATE TABLE sessions (
    session_id CHAR(64) NOT NULL,
    user_id INT,
    data JSON,
    PRIMARY KEY (session_id)
)
PARTITION BY HASH(session_id)
PARTITIONS 8;
  

HASH spreads rows evenly but does not support efficient range queries on session_id. Use when access is always by exact key.

Partition Pruning

  EXPLAIN SELECT * FROM access_logs
WHERE logged_at >= '2024-02-01' AND logged_at < '2024-03-01';
  

Look for partitions: p202402 in EXPLAIN — only one partition is scanned. Without a filter on logged_at, all partitions are scanned (same cost as an unpartitioned table).

  -- Bad: function on partition column prevents pruning
SELECT * FROM access_logs WHERE YEAR(logged_at) = 2024;

-- Good: range on partition column
SELECT * FROM access_logs
WHERE logged_at >= '2024-01-01' AND logged_at < '2025-01-01';
  

Adding and Dropping Partitions

  -- Add next month
ALTER TABLE access_logs REORGANIZE PARTITION p_future INTO (
    PARTITION p202404 VALUES LESS THAN (TO_DAYS('2024-05-01')),
    PARTITION p_future VALUES LESS THAN MAXVALUE
);

-- Drop old data instantly (no DELETE scan)
ALTER TABLE access_logs DROP PARTITION p202401;
  

Dropping a partition is metadata-only — millions of rows disappear in seconds. This is the primary operational win for log retention.

Subpartitioning

  CREATE TABLE events (
    id BIGINT NOT NULL,
    event_date DATE NOT NULL,
    region VARCHAR(10) NOT NULL,
    payload JSON,
    PRIMARY KEY (id, event_date, region)
)
PARTITION BY RANGE (TO_DAYS(event_date))
SUBPARTITION BY KEY(region)
SUBPARTITIONS 4 (
    PARTITION p2024q1 VALUES LESS THAN (TO_DAYS('2024-04-01')),
    PARTITION p2024q2 VALUES LESS THAN (TO_DAYS('2024-07-01'))
);
  

Subpartitioning adds complexity — use only when both time-range pruning and regional distribution are required.

Migrating Existing Tables

MySQL cannot partition an existing table in place with a simple ALTER. Common approaches:

  -- Option 1: ALTER with full rebuild (locks table)
ALTER TABLE logs PARTITION BY RANGE (TO_DAYS(created_at)) (...);

-- Option 2: Online migration via new table
CREATE TABLE logs_new (...) PARTITION BY RANGE (...);
INSERT INTO logs_new SELECT * FROM logs;
RENAME TABLE logs TO logs_old, logs_new TO logs;
DROP TABLE logs_old;
  

For zero-downtime migrations on large tables, use pt-online-schema-change (Percona Toolkit) or native ALGORITHM=INPLACE when supported.

Monitoring Partitions

  SELECT
    TABLE_NAME,
    PARTITION_NAME,
    TABLE_ROWS,
    DATA_LENGTH,
    INDEX_LENGTH
FROM information_schema.PARTITIONS
WHERE TABLE_SCHEMA = 'myapp' AND TABLE_NAME = 'access_logs'
ORDER BY PARTITION_ORDINAL_POSITION;
  

Watch for uneven row counts in HASH partitions (hot spots) or empty future partitions filling unexpectedly fast.

Best Practices

  1. Choose the partition key based on how queries filter, not just how data is inserted
  2. Plan partition count — too many partitions increase metadata overhead (keep under ~1000)
  3. Automate monthly partition creation with a cron job or event scheduler
  4. Include the partition column in covering indexes for common queries
  5. Test EXPLAIN on every critical query to confirm pruning

Common Mistakes

Mistake Impact
PRIMARY KEY missing partition column CREATE TABLE fails
Partitioning without query filters on key No benefit, added complexity
Using MAXVALUE only partition All new data hits one partition — rebalance regularly
DELETE instead of DROP for retention Hours of locks and binlog bloat

Troubleshooting

“A PRIMARY KEY must include all columns in the table’s partitioning function” — redesign the primary key to include the partition column, or use a composite PK.

Queries still slow after partitioning — run EXPLAIN; if partitions lists all partitions, the WHERE clause does not match the partition key.

REORGANIZE fails — ensure no rows violate new boundary values before reorganizing.

Production Scenario: Audit Log Retention

A fintech platform stores seven years of audit logs in monthly RANGE partitions:

  # Cron job — first day of each month
mysql -e "
ALTER TABLE audit_logs REORGANIZE PARTITION p_future INTO (
    PARTITION p$(date +%Y%m) VALUES LESS THAN (TO_DAYS('$(date -d '+1 month' +%Y-%m-01)')),
    PARTITION p_future VALUES LESS THAN MAXVALUE
);
ALTER TABLE audit_logs DROP PARTITION p$(date -d '-7 years' +%Y%m);
"
  

Compliance requires seven-year retention. Dropping monthly partitions replaces DELETE FROM audit_logs WHERE created_at < ... that previously locked the table for hours.

Performance Tips

  • Smaller partitions mean smaller indexes — faster maintenance and backups per segment
  • Combine partitioning with archival: detach old partitions to cold storage tables
  • Avoid foreign keys on partitioned tables — MySQL restrictions make them impractical
  • Use innodb_file_per_table = ON so each partition file can be managed independently

Partitioning shines for time-series and retention workloads. For general-purpose OLTP tables under 50 million rows, proper indexing usually suffices.