MongoDB Sharding Deep Dive
Sharding distributes data across multiple replica sets (shards) for horizontal scale. Getting the shard key wrong is the most expensive mistake in MongoDB — resharding is disruptive and slow. This guide covers advanced sharding operations.
When to Shard
Shard when a single replica set cannot meet requirements:
| Signal | Threshold |
|---|---|
| Working set exceeds RAM | Hot data + indexes > available memory |
| Write throughput | Sustained writes exceeding single primary capacity |
| Storage | Approaching single-node disk limits |
| I/O saturation | Disk or network bottleneck on primary |
Do not shard prematurely — a well-indexed replica set handles terabytes and thousands of ops/sec.
Sharded Cluster Components
┌──────────────┐
│ mongos │ ← clients connect here
│ (query router)│
└──────┬───────┘
│
┌───────────────┼───────────────┐
▼ ▼ ▼
┌────────────┐ ┌────────────┐ ┌────────────┐
│ Shard A │ │ Shard B │ │ Shard C │
│ (repl set) │ │ (repl set) │ │ (repl set) │
└────────────┘ └────────────┘ └────────────┘
│ │ │
└───────────────┼───────────────┘
▼
┌──────────────┐
│Config Servers│ ← metadata (repl set of 3)
└──────────────┘
Shard Key Selection
The shard key is immutable for a collection (pre-resharding). Choose carefully.
Good Shard Key Properties
- High cardinality — many distinct values
- Even distribution — writes spread across shards
- Query isolation — common queries target single shard
Shard Key Strategies
Hashed — even write distribution
sh.shardCollection("analytics.events", { userId: "hashed" })
// Pros: even distribution
// Cons: range queries scatter to all shards
Compound — range queries + distribution
sh.shardCollection("analytics.events", { userId: 1, timestamp: 1 })
// Pros: user-scoped queries hit one shard
// Cons: hot users create hotspots
Hashed compound (MongoDB 5.0+)
sh.shardCollection("analytics.events", { userId: "hashed", timestamp: 1 })
Anti-Patterns
| Bad Key | Problem |
|---|---|
{ _id: 1 } (default) |
Monotonic ObjectIds → all writes to last chunk |
{ createdAt: 1 } |
Time-based hotspot on latest shard |
{ status: 1 } |
Low cardinality — 3 values, 3 chunks max |
{ isActive: 1 } |
Boolean — two shards at best |
Enabling Sharding
// Connect to mongos
mongosh --host mongos:27017
// Add shards (each is a replica set)
sh.addShard("rs-shard1/mongo1:27017,mongo2:27017,mongo3:27017")
sh.addShard("rs-shard2/mongo4:27017,mongo5:27017,mongo6:27017")
// Enable sharding on database
sh.enableSharding("myapp")
// Shard collections
sh.shardCollection("myapp.users", { userId: "hashed" })
sh.shardCollection("myapp.orders", { userId: 1, orderDate: 1 })
sh.status()
Chunk Management
Data is divided into chunks (default 128 MB range per shard key):
// View chunks for a collection
use config
db.chunks.find({ ns: "myapp.orders" }).pretty()
// Split a chunk manually (rare — balancer handles this)
sh.splitAt("myapp.orders", { userId: ObjectId("..."), orderDate: ISODate("2024-06-01") })
// Move a chunk to specific shard
sh.moveChunk("myapp.orders", { userId: MinKey }, "rs-shard2")
Chunk Migration
The balancer migrates chunks between shards for even distribution:
sh.getBalancerState() // true = enabled
sh.isBalancerRunning()
sh.setBalancerState(false) // disable during maintenance
// Configure migration window (off-peak only)
use config
db.settings.updateOne(
{ _id: "balancer" },
{ $set: { activeWindow: { start: "01:00", stop: "05:00" } } },
{ upsert: true }
)
Excessive migrations indicate poor shard key distribution.
Zone Sharding
Pin data ranges to specific shards for compliance and locality:
// Define zones
sh.addShardToZone("rs-shard-us", "US")
sh.addShardToZone("rs-shard-eu", "EU")
// Assign ranges
sh.updateZoneKeyRange(
"myapp.users",
{ country: MinKey, country: MaxKey },
"US" // default zone
)
sh.updateZoneKeyRange(
"myapp.users",
{ country: "DE", country: "DE" },
"EU"
)
sh.updateZoneKeyRange(
"myapp.users",
{ country: "FR", country: "FR" },
"EU"
)
Pre-split chunks in each zone before loading data to avoid migration storms.
Query Routing
// Targeted query — single shard
db.orders.find({ userId: ObjectId("..."), orderDate: { $gte: date } })
// Scatter-gather — all shards
db.orders.find({ status: "pending" })
// Check which shards a query hits
db.orders.find({ userId: ObjectId("...") }).explain("executionStats")
// shardedDataDistribution shows targeted vs scatter
Design queries to include the shard key prefix for targeted routing.
Resharding (MongoDB 4.4+)
Change shard key without dump/restore:
db.adminCommand({
reshardingCollection: "myapp.orders",
key: { orderId: "hashed" }
})
Resharding:
- Runs in background but consumes I/O
- Requires free disk space (~2x collection size)
- Brief performance impact during cutover
- Plan during maintenance window
Hotspot Diagnosis
Symptoms: One shard at 90% CPU while others idle.
// Check per-shard stats
db.orders.getShardDistribution()
// Check chunk distribution
use config
db.chunks.aggregate([
{ $match: { ns: "myapp.orders" } },
{ $group: { _id: "$shard", count: { $sum: 1 } } }
])
// Current operations per shard
db.adminCommand({ currentOp: 1, $all: true })
Fixes:
- Reshard with better key (hashed vs range)
- Pre-split chunks for known hot ranges
- Application-level sharding (separate collections per tenant)
Production Scenario: Multi-Tenant SaaS
// Option 1: Tenant ID as prefix of shard key
sh.shardCollection("myapp.data", { tenantId: 1, createdAt: 1 })
// Option 2: Database per tenant (small tenants)
// myapp_tenant_abc, myapp_tenant_xyz — no cross-tenant queries
// Option 3: Hashed tenantId for even distribution
sh.shardCollection("myapp.data", { tenantId: "hashed" })
For large tenants, consider dedicated shards via zone sharding.
Config Server Operations
Config servers store cluster metadata — protect them:
// Config server replica set (always 3 members)
rs.initiate({
_id: "configReplSet",
configsvr: true,
members: [
{ _id: 0, host: "cfg1:27019" },
{ _id: 1, host: "cfg2:27019" },
{ _id: 2, host: "cfg3:27019" }
]
})
Never run production sharded clusters with fewer than 3 config servers.
Common Mistakes
- Sharding before exhausting replica set capacity
- Monotonic shard key causing write hotspots
- Queries without shard key prefix — scatter-gather on every request
- Disabling balancer permanently — uneven data distribution
- Not pre-splitting chunks before bulk load — migration storm
- Resharding without disk space planning
Troubleshooting
Balancer not running
sh.getBalancerState()
// Check config server connectivity
// Check for active migrations blocking balancer
db.getSiblingDB("config").locks.find()
Chunk too large (> maxChunkSize)
// Force split
sh.splitFind("myapp.orders", { userId: ObjectId("...") })
Jumbo chunk (cannot migrate)
// Identify jumbo chunks
db.getSiblingDB("config").chunks.find({ ns: "myapp.orders", jumbo: true })
// Split manually or redesign shard key
Best Practices
- Choose shard key before loading production data
- Include shard key in all high-frequency queries
- Pre-split chunks for bulk imports
- Schedule balancer during off-peak hours
- Monitor per-shard metrics independently
- Test with production-scale data in staging
- Plan resharding strategy before going live
What Comes Next
Production operations covers backup, monitoring, upgrades, and incident response for sharded and replica set deployments.