Redis Cluster
Why Redis Cluster?
Single-instance Redis is limited by one machine’s RAM and CPU. Cluster provides:
- Horizontal sharding across 16,384 hash slots
- Automatic failover when masters fail
- Linear scale-out — add nodes to increase capacity
- No single point of failure (with proper replica count)
Use Cluster when data exceeds one node’s RAM or write throughput exceeds one CPU core’s capacity.
Architecture
┌─────────────┐  ┌─────────────┐  ┌─────────────┐
│  Master 1   │  │  Master 2   │  │  Master 3   │
│    slots    │  │    slots    │  │    slots    │
│   0-5460    │  │ 5461-10922  │  │ 10923-16383 │
└──────┬──────┘  └──────┬──────┘  └──────┬──────┘
       │                │                │
┌──────▼──────┐  ┌──────▼──────┐  ┌──────▼──────┐
│   Replica   │  │   Replica   │  │   Replica   │
└─────────────┘  └─────────────┘  └─────────────┘
- 16384 hash slots distributed across master nodes
- Each master should have ≥1 replica for failover
- Minimum production setup: 3 masters + 3 replicas (6 nodes)
- Minimum cluster: 3 masters (no replicas — not production-ready)
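As a rough illustration of how the 16384 slots divide among masters, the sketch below reproduces the three ranges shown in the diagram. The function name even_slot_ranges is hypothetical, and the rounding rule is an assumption chosen to match that three-master layout; redis-cli --cluster create may split slots differently for other node counts.

```python
TOTAL_SLOTS = 16384  # fixed by Redis Cluster


def even_slot_ranges(masters: int) -> list[tuple[int, int]]:
    """Split slots 0..16383 into contiguous, near-equal ranges (illustrative only)."""
    per_node = TOTAL_SLOTS / masters
    ranges = []
    first = 0
    cursor = 0.0
    for i in range(masters):
        if i == masters - 1:
            last = TOTAL_SLOTS - 1  # the last master takes whatever remains
        else:
            last = round(cursor + per_node - 1)
        ranges.append((first, last))
        first = last + 1
        cursor += per_node
    return ranges


for index, (first, last) in enumerate(even_slot_ranges(3), start=1):
    print(f"Master {index}: slots {first}-{last} ({last - first + 1} slots)")
```

Because 16384 is not divisible by 3, one master ends up owning one slot more than the other two.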
Create a Cluster (Development)
# Start 6 instances (ports 7000-7005)
for port in 7000 7001 7002 7003 7004 7005; do
  mkdir -p /tmp/redis-cluster/$port
  # Heredoc body and EOF stay unindented so the config is written verbatim
  cat > /tmp/redis-cluster/$port/redis.conf <<EOF
port $port
cluster-enabled yes
cluster-config-file nodes-$port.conf
cluster-node-timeout 5000
appendonly yes
dir /tmp/redis-cluster/$port
daemonize yes
EOF
  redis-server /tmp/redis-cluster/$port/redis.conf
done
# Form cluster — 3 masters, 1 replica each
redis-cli --cluster create \
127.0.0.1:7000 127.0.0.1:7001 127.0.0.1:7002 \
127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005 \
--cluster-replicas 1
Cluster Commands
redis-cli -c -p 7000 # -c enables cluster mode (follows MOVED/ASK redirects)
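CLUSTER INFO # cluster state, slot coverage, known node count
CLUSTER NODES # every node with its ID, role, and owned slot ranges
CLUSTER SLOTS # slot-to-node mapping (deprecated since Redis 7.0 in favor of CLUSTER SHARDS)
CLUSTER KEYSLOT mykey # hash slot that a given key maps to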
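CLUSTER INFO replies with field:value lines, which makes it easy to check from a script. The following is a minimal sketch, assuming the raw reply text has already been fetched; parse_cluster_info and is_healthy are hypothetical helper names, and the sample reply is made up for illustration.

```python
def parse_cluster_info(raw: str) -> dict[str, str]:
    """Turn the 'field:value' lines of a CLUSTER INFO reply into a dict."""
    info = {}
    for line in raw.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue
        field, _, value = line.partition(":")
        info[field] = value
    return info


def is_healthy(info: dict[str, str], total_slots: int = 16384) -> bool:
    """Healthy here means: state ok, every slot assigned, no slot marked failed."""
    return (
        info.get("cluster_state") == "ok"
        and int(info.get("cluster_slots_assigned", 0)) == total_slots
        and int(info.get("cluster_slots_fail", 0)) == 0
    )


# Illustrative reply, not captured from a real cluster
sample_reply = (
    "cluster_state:ok\r\n"
    "cluster_slots_assigned:16384\r\n"
    "cluster_slots_ok:16384\r\n"
    "cluster_slots_pfail:0\r\n"
    "cluster_slots_fail:0\r\n"
    "cluster_known_nodes:6\r\n"
    "cluster_size:3\r\n"
)
print(is_healthy(parse_cluster_info(sample_reply)))
```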
Key Hashing and Hash Tags
Each key maps to a slot as CRC16(key) mod 16384:
CLUSTER KEYSLOT user:1001:name
# (integer) 9189
SET user:1001:name "Alice" # routed to appropriate node
Hash tags force related keys to the same slot — required for multi-key operations:
SET user:{1001}:name "Alice"
SET user:{1001}:email "alice@example.com"
SET user:{1001}:cart "[...]"
# All keys with {1001} hash to same slot
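Only the substring between the first { and the first } after it is hashed. If the key has no such pair, or nothing sits between the braces, the whole key is hashed.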
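The slot calculation can be sketched in a few lines of Python. This is an illustrative reimplementation, not the server's code: the names crc16_xmodem and key_slot are hypothetical, and the CRC parameters (polynomial 0x1021, initial value 0, no bit reflection) follow the XMODEM variant described in the Redis Cluster specification.

```python
def crc16_xmodem(data: bytes) -> int:
    """Bitwise CRC16, polynomial 0x1021, initial value 0 (XMODEM variant)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Return the hash slot for a key, honoring hash tags."""
    raw = key.encode("utf-8")
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        # Use the tag only when a closing brace exists and the tag is non-empty
        if end != -1 and end > start + 1:
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % 16384


# The specification lists 0x31C3 as the CRC16 of "123456789"
assert crc16_xmodem(b"123456789") == 0x31C3

# Keys that share the {1001} tag land in the same slot
print(key_slot("user:{1001}:name") == key_slot("user:{1001}:email"))
```

Comparing the result with CLUSTER KEYSLOT on a live node is a quick way to validate a client-side implementation like this one.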
Multi-Key Operations
These require all keys in the same slot:
- MGET, MSET, DEL (multiple keys)
- SUNION, SINTER, SDIFF
- RENAME (source and destination)
- Lua scripts touching multiple keys
- Transactions (MULTI/EXEC)
Design key names with hash tags when atomic multi-key ops are needed.
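When keys cannot all share a hash tag, one option is to group them by slot on the client and issue one multi-key command per group. The sketch below shows only the grouping step. The name group_by_slot is hypothetical, and it assumes that binascii.crc_hqx with an initial value of 0 matches the CRC16 variant Redis uses.

```python
from binascii import crc_hqx
from collections import defaultdict


def key_slot(key: str) -> int:
    """Hash slot for a key; hashes only the hash tag when one is present."""
    raw = key.encode("utf-8")
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        if end > start + 1:  # closing brace found and tag is non-empty
            raw = raw[start + 1:end]
    return crc_hqx(raw, 0) % 16384


def group_by_slot(keys: list[str]) -> dict[int, list[str]]:
    """Bucket keys by slot so each bucket is safe for one multi-key command."""
    groups = defaultdict(list)
    for key in keys:
        groups[key_slot(key)].append(key)
    return dict(groups)


keys = ["user:{1001}:name", "user:{1001}:email", "user:{2002}:name"]
for slot, members in group_by_slot(keys).items():
    print(slot, members)
```

Each group can then be sent as its own MGET or DEL. The combined result is no longer atomic across groups, which is why hash tags remain the better choice when atomicity matters.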
Failover
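Failover is automatic when a master becomes unreachable. A replica can also be promoted manually, and cluster health checked from the command line:
CLUSTER FAILOVER # manual graceful failover (run on the replica to promote)
CLUSTER FAILOVER TAKEOVER # force immediate failover without agreement from masters (dangerous)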
redis-cli --cluster check 127.0.0.1:7000
redis-cli --cluster info 127.0.0.1:7000
Quorum: a majority of masters must agree that a node is down before one of its replicas is promoted.
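The majority rule translates into simple arithmetic. The helper names below are hypothetical and the sketch only counts votes; it does not model timeouts or replica election.

```python
def failover_majority(masters: int) -> int:
    """Smallest number of masters that forms a strict majority."""
    return masters // 2 + 1


def tolerable_master_failures(masters: int) -> int:
    """How many masters can be unreachable while a majority still exists."""
    return masters - failover_majority(masters)


for count in (3, 4, 5):
    print(
        f"{count} masters: majority = {failover_majority(count)}, "
        f"tolerates {tolerable_master_failures(count)} unreachable"
    )
```

Going from 3 to 4 masters does not raise the number of simultaneous master failures the cluster can vote through, which is one reason odd master counts are common.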
Adding and Removing Nodes
# Add new master
redis-cli --cluster add-node 127.0.0.1:7006 127.0.0.1:7000
# Reshard slots to new node
redis-cli --cluster reshard 127.0.0.1:7000
# Follow prompts: source nodes, destination, slot count
# Remove node (drain slots first)
redis-cli --cluster reshard 127.0.0.1:7000
redis-cli --cluster del-node 127.0.0.1:7006 <node-id>
Resharding migrates slots online — plan during maintenance windows for large datasets.
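The reshard prompt asks for a slot count, so it helps to work the number out beforehand. The sketch below assumes slots are currently spread evenly and that the goal is an even spread afterwards; reshard_plan is a hypothetical helper, not part of redis-cli.

```python
TOTAL_SLOTS = 16384


def reshard_plan(existing_masters: int, added_masters: int) -> dict[str, int]:
    """Estimate slot counts for rebalancing onto newly added, empty masters."""
    target = TOTAL_SLOTS // (existing_masters + added_masters)
    total_to_move = target * added_masters
    return {
        "target_per_master": target,       # slots each new master should receive
        "total_to_move": total_to_move,    # slots leaving the existing masters
        "from_each_existing": total_to_move // existing_masters,  # rounded down
    }


# Growing from 3 masters to 4
print(reshard_plan(3, 1))
```

The division rounds down, so a handful of slots may remain unevenly placed; redis-cli --cluster check shows the resulting distribution.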
Client Requirements
Clients must support cluster mode — handle MOVED and ASK redirects:
# Python (redis-py)
from redis.cluster import RedisCluster
rc = RedisCluster(
host="127.0.0.1",
port=7000,
decode_responses=True
)
rc.set("key", "value")
rc.get("key")
// Node.js (ioredis)
const Redis = require('ioredis');
const cluster = new Redis.Cluster([
{ host: '127.0.0.1', port: 7000 },
{ host: '127.0.0.1', port: 7001 },
]);
Non-cluster clients fail with MOVED errors whenever a key lives on a node other than the one they are connected to.
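To make the redirects concrete: a redirect error carries the slot and the address of the node to contact, in the form MOVED 3999 127.0.0.1:6381 (ASK uses the same layout). The parser below is a minimal sketch of the first step a cluster-aware client performs; Redirect and parse_redirect are hypothetical names, not part of any client library.

```python
from typing import NamedTuple, Optional


class Redirect(NamedTuple):
    kind: str  # "MOVED" (slot has a new owner) or "ASK" (one-off redirect during migration)
    slot: int
    host: str
    port: int


def parse_redirect(error_message: str) -> Optional[Redirect]:
    """Parse '<MOVED|ASK> <slot> <host>:<port>'; return None for anything else."""
    parts = error_message.split()
    if len(parts) != 3 or parts[0] not in ("MOVED", "ASK"):
        return None
    # rpartition keeps any colons inside the host part intact
    host, _, port = parts[2].rpartition(":")
    if not host or not port.isdigit() or not parts[1].isdigit():
        return None
    return Redirect(parts[0], int(parts[1]), host, int(port))


print(parse_redirect("MOVED 3999 127.0.0.1:6381"))
```

After a MOVED, a client normally updates its slot map and retries against the new node. After an ASK, it sends ASKING to the target node and retries only that one command, leaving its slot map unchanged.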
Limitations
| Limitation | Workaround |
|---|---|
| Multi-key ops need same slot | Hash tags {id} |
| No cross-slot transactions | Redesign keys or use Lua with same slot |
| Pub/Sub broadcasts to all nodes | Sharded Pub/Sub (Redis 7+) |
| SELECT database not supported | Use key prefixes instead |
| KEYS across cluster | Scan each node individually |
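Scanning each node individually can be sketched as a generator that walks one connection per master. The code assumes each connection exposes scan_iter(match=..., count=...) as redis-py's Redis client does; scan_all_nodes is a hypothetical helper, and FakeNode is a stand-in used only so the example runs without a cluster.

```python
from fnmatch import fnmatch
from typing import Iterable, Iterator


def scan_all_nodes(node_clients: Iterable, pattern: str, count: int = 1000) -> Iterator[str]:
    """Yield keys matching the pattern from every node, one node at a time."""
    for client in node_clients:
        # SCAN is incremental, so it avoids blocking a node the way KEYS can
        yield from client.scan_iter(match=pattern, count=count)


class FakeNode:
    """Stand-in for a per-master connection (illustration only)."""

    def __init__(self, keys: list[str]) -> None:
        self._keys = keys

    def scan_iter(self, match: str = "*", count: int = 1000) -> Iterator[str]:
        return (key for key in self._keys if fnmatch(key, match))


nodes = [FakeNode(["user:1", "cart:1"]), FakeNode(["user:2"])]
print(sorted(scan_all_nodes(nodes, "user:*")))
```

Only masters need to be scanned, since replicas hold copies of the same keys.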
Cluster vs Sentinel
| Feature | Cluster | Sentinel |
|---|---|---|
| Sharding | Yes | No |
| HA failover | Yes | Yes |
| Complexity | Higher | Lower |
| Min nodes (prod) | 6 | 3 Sentinel + 1 master + replicas |
| Max RAM | Cluster total | Single node |
Choose Cluster for scale + HA. Choose Sentinel for HA on a single large instance that fits in RAM.
Best Practices
- Run 3 masters + 3 replicas minimum in production
- Use hash tags deliberately for related multi-key data
- Deploy nodes across availability zones
- Monitor slot coverage — all 16384 slots must be assigned
- Test failover quarterly
- Use cluster-aware clients — verify library cluster support
Common Mistakes
| Mistake | Impact |
|---|---|
| 3 nodes without replicas | No failover — node death loses slot range |
| Multi-key ops without hash tags | CROSSSLOT errors |
| Non-cluster client library | MOVED errors, app failures |
| Uneven slot distribution | Hot nodes, imbalanced load |
| Resharding during peak traffic | Latency spikes |
Troubleshooting
CLUSTERDOWN errors:
CLUSTER INFO
# cluster_state:fail — not all slots covered
CLUSTER NODES
# Find failed nodes, restore or failover
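Slot coverage can also be computed from the CLUSTER NODES text. The sketch below assumes the standard line layout (ID, address, flags, master ID, ping, pong, epoch, link state, then slot tokens) and skips the bracketed migrating/importing markers. The function names and the sample output are hypothetical.

```python
TOTAL_SLOTS = 16384


def slots_per_master(cluster_nodes_output: str) -> dict[str, int]:
    """Count the slots owned by each master that is not flagged as failed."""
    counts = {}
    for line in cluster_nodes_output.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue
        address = fields[1].split("@")[0]  # drop the cluster bus port suffix
        flags = fields[2].split(",")
        if "master" not in flags or "fail" in flags:
            continue
        owned = 0
        for token in fields[8:]:
            if token.startswith("["):
                continue  # slot being migrated or imported
            first, _, last = token.partition("-")
            owned += int(last or first) - int(first) + 1
        counts[address] = owned
    return counts


def uncovered_slots(counts: dict[str, int]) -> int:
    """Slots not served by any healthy master."""
    return TOTAL_SLOTS - sum(counts.values())


# Made-up output: the third master is marked as failed
sample = (
    "a1 127.0.0.1:7000@17000 myself,master - 0 0 1 connected 0-5460\n"
    "b2 127.0.0.1:7001@17001 master - 0 1 2 connected 5461-10922\n"
    "c3 127.0.0.1:7002@17002 master,fail - 0 1 3 disconnected 10923-16383\n"
    "d4 127.0.0.1:7003@17003 slave a1 0 1 1 connected\n"
)
counts = slots_per_master(sample)
print(counts, "uncovered:", uncovered_slots(counts))
```

The same per-master counts are a quick way to spot uneven slot distribution.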
CROSSSLOT errors:
# Keys in different slots — add hash tags or split operations
CLUSTER KEYSLOT key1
CLUSTER KEYSLOT key2
Uneven memory across nodes:
redis-cli --cluster check 127.0.0.1:7000
# Reshard slots from overloaded nodes
Performance Tips
- Batch operations to same slot via hash tags and pipelining
- Keep values similar size across slots for even memory use
- Set cluster-node-timeout appropriately (default 15s): lower values give faster failover but more false positives
- Use replicas for read scaling with READONLY + replica reads (client support required)
Production Scenario
A social media feed service sharded 180 GB of timeline data across 6 Redis Cluster nodes (3 masters, 3 replicas) in three AZs. Hash tag {user_id} colocated user profile, feed, and follower count keys. Resharding added two masters during 6-month growth without downtime. Failover test: killing a master promoted replica in 6 seconds; client retry logic handled brief errors. Monitoring tracked per-node memory, slot distribution, and cluster_stats_messages_sent for imbalance.
Redis Cluster scales memory and throughput horizontally — design key patterns with hash tags from day one to avoid painful migrations later.