Caching Strategies
Caching Fundamentals
Caching stores frequently accessed data in fast memory to reduce load on slower backing stores. Redis is the most common application-level cache because of its speed, TTL support, and rich data structures.
Key metrics to track:
| Metric | Target |
|---|---|
| Hit ratio | > 90% for read-heavy workloads |
| p99 cache latency | < 2 ms |
| Eviction rate | Stable, not spiking |
| Memory usage | < 80% of maxmemory |
INFO stats
# keyspace_hits, keyspace_misses
# hit_ratio = hits / (hits + misses)
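The formula above can be applied directly to the counters Redis reports. A minimal sketch, assuming the stats dictionary comes from redis-py's `r.info("stats")`; the function name `hit_ratio` is hypothetical:

```python
def hit_ratio(stats: dict) -> float:
    """Return keyspace_hits / (keyspace_hits + keyspace_misses), or 0.0 with no lookups."""
    hits = int(stats.get("keyspace_hits", 0))
    misses = int(stats.get("keyspace_misses", 0))
    total = hits + misses
    return hits / total if total else 0.0


# Hand-written sample; in an application this would be stats = r.info("stats").
sample = {"keyspace_hits": 9200, "keyspace_misses": 800}
print(f"hit ratio: {hit_ratio(sample):.1%}")  # hit ratio: 92.0%
```

Both counters are cumulative since the server started (or since the stats were last reset), so a recent drop is easier to see by comparing two samples taken a few minutes apart than by reading the lifetime ratio.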
Cache-Aside (Lazy Loading)
The application checks Redis first; on miss, reads from the database and populates the cache.
import json
import redis

r = redis.Redis(decode_responses=True)

def get_user(user_id):
    key = f"user:{user_id}"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)
    # Cache miss: read from the database (db is the application's DB handle)
    user = db.query("SELECT * FROM users WHERE id = %s", user_id)
    if user:
        r.setex(key, 3600, json.dumps(user))  # cache for 1 hour
    return user
Pros: Simple, cache only what’s requested, survives cache failures (degraded to DB). Cons: Cache miss penalty, possible stale data until TTL expires.
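The "survives cache failures" property holds only if cache errors are caught on the read path; otherwise an unreachable Redis raises straight through to the caller. A sketch under stated assumptions: `cache` is any client with redis-py style `get` and `setex`, `load_from_db` is a hypothetical loader callback, and the broad `Exception` catch stands in for `redis.RedisError`.

```python
import json
import logging

log = logging.getLogger("cache")


def get_with_fallback(cache, key, load_from_db, ttl=3600):
    """Cache-aside read that degrades to the database when the cache is unavailable."""
    try:
        cached = cache.get(key)
        if cached is not None:
            return json.loads(cached)
    except Exception:  # with redis-py, narrow this to redis.RedisError
        log.warning("cache read failed for %s; falling back to DB", key)

    value = load_from_db()
    if value is not None:
        try:
            cache.setex(key, ttl, json.dumps(value))
        except Exception:
            log.warning("cache write failed for %s", key)
    return value


class BrokenCache:
    """Stand-in for an unreachable Redis: every call raises."""

    def get(self, key):
        raise ConnectionError("cache down")

    def setex(self, key, ttl, value):
        raise ConnectionError("cache down")


user = get_with_fallback(BrokenCache(), "user:1", lambda: {"id": 1, "name": "Ada"})
print(user)  # {'id': 1, 'name': 'Ada'}
```

During an outage every request reaches the database, so this fallback is only safe if the database can absorb the uncached load or the caller also applies rate limiting.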
Read-Through
The cache layer itself loads data on miss (often via a library or sidecar). Application always talks to cache.
App → Cache → (miss) → Database → populate cache → return
Less common in custom apps; more typical in dedicated cache proxies.
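To make the difference from cache-aside concrete, here is a minimal read-through sketch in which the cache object owns the loader. The class name `ReadThroughCache` is hypothetical, and an in-memory dict stands in for Redis so the example runs on its own.

```python
import json
import time


class ReadThroughCache:
    """Cache layer that owns the loader, so callers never query the database directly."""

    def __init__(self, loader, ttl=300, clock=time.monotonic):
        self._loader = loader      # called with the key on a miss
        self._ttl = ttl
        self._clock = clock
        self._store = {}           # key -> (expires_at, serialized value); stands in for Redis

    def get(self, key):
        entry = self._store.get(key)
        if entry is not None and entry[0] > self._clock():
            return json.loads(entry[1])          # hit
        value = self._loader(key)                # miss: the cache loads from the backing store
        if value is not None:
            self._store[key] = (self._clock() + self._ttl, json.dumps(value))
        return value


db_calls = []

def load_user(key):
    db_calls.append(key)                         # record each backing-store read
    return {"key": key, "name": "Ada"}

cache = ReadThroughCache(load_user)
first = cache.get("user:1")    # miss, loader runs
second = cache.get("user:1")   # hit, loader does not run
print(first == second, len(db_calls))  # True 1
```

The application code only ever calls `cache.get(...)`; where the data comes from on a miss is a detail of the cache layer.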
Write-Through
Updates go to cache and database together — cache always reflects DB state.
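def update_user(user_id, data):
    db.update("users", user_id, data)  # write the database first
    # data must be the full user record, since readers get back whatever is cached
    r.setex(f"user:{user_id}", 3600, json.dumps(data))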
Pros: Cache consistency. Cons: Write latency includes cache update; unused keys still cached.
Write-Behind (Write-Back)
Write to cache immediately; asynchronously flush to database.
def update_user_async(user_id, data):
    r.setex(f"user:{user_id}", 3600, json.dumps(data))
    # queue is the application's job queue; a worker later runs persist_user
    queue.enqueue("persist_user", user_id, data)
Pros: Fast writes. Cons: Data loss risk if cache fails before DB persist — use only when acceptable.
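The window in which data can be lost is easier to see in a self-contained sketch. The class `WriteBehindCache` and the `persist` callback are hypothetical; a dict stands in for Redis and `queue.Queue` stands in for a durable job queue.

```python
import json
import queue


class WriteBehindCache:
    """Writes land in the cache immediately; a queue carries them to the database later."""

    def __init__(self, persist):
        self._persist = persist            # hypothetical callable: persist(key, value)
        self._cache = {}                   # stands in for Redis
        self._pending = queue.Queue()      # stands in for a durable job queue

    def set(self, key, value):
        self._cache[key] = json.dumps(value)
        self._pending.put((key, value))    # not yet durable: lost if the process dies here

    def get(self, key):
        raw = self._cache.get(key)
        return json.loads(raw) if raw is not None else None

    def flush(self):
        """Drain pending writes to the database; returns how many were persisted."""
        count = 0
        while True:
            try:
                key, value = self._pending.get_nowait()
            except queue.Empty:
                return count
            self._persist(key, value)
            count += 1


database = {}
wb = WriteBehindCache(persist=database.__setitem__)
wb.set("user:1", {"name": "Ada"})
print(wb.get("user:1"), database)   # {'name': 'Ada'} {}
print(wb.flush(), database)         # 1 {'user:1': {'name': 'Ada'}}
```

Between `set` and `flush` the new value exists only in the cache and the queue, which is exactly the data-loss window described above.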
Cache Invalidation
“There are only two hard things in Computer Science: cache invalidation and naming things.”
def delete_user(user_id):
    db.delete("users", user_id)
    r.delete(f"user:{user_id}")
    r.delete("users:list")   # invalidate list cache
    r.delete("users:count")  # invalidate aggregate cache
Invalidation Strategies
| Strategy | When to Use |
|---|---|
| TTL only | Stale data acceptable (product catalog) |
| Delete on write | Strong consistency needed (user profile) |
| Versioned keys | Bulk invalidation without scanning and deleting keys |
| Pub/Sub broadcast | Multi-instance cache invalidation |
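# Versioned keys: invalidate without deleting
def user_key(user_id):
    # Read the version on every call so a bump takes effect immediately.
    # Default to "0" because INCR on a missing key yields 1.
    version = r.get("users:version") or "0"
    return f"user:{version}:{user_id}"

def invalidate_all_users():
    r.incr("users:version")  # old keys expire via TTL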
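For the Pub/Sub broadcast row in the table above, the idea is that each application instance keeps a local in-memory cache and evicts a key when any instance announces a write. The sketch below simulates this in one process; `InvalidationBus` and `AppInstance` are hypothetical names, and with Redis the bus role would be played by a Pub/Sub channel.

```python
class InvalidationBus:
    """In-process stand-in for a Redis Pub/Sub channel (hypothetical helper)."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, key):
        for callback in self._subscribers:
            callback(key)


class AppInstance:
    """One application process holding its own local in-memory cache."""

    def __init__(self, bus):
        self.local_cache = {}
        self._bus = bus
        bus.subscribe(self._on_invalidate)

    def _on_invalidate(self, key):
        self.local_cache.pop(key, None)   # drop the stale local copy, if any

    def update(self, key):
        # ... write the new value to the database here ...
        self._bus.publish(key)            # every instance, including this one, evicts the key


bus = InvalidationBus()
a, b = AppInstance(bus), AppInstance(bus)
a.local_cache["user:1"] = {"name": "old"}
b.local_cache["user:1"] = {"name": "old"}
a.update("user:1")
print("user:1" in a.local_cache, "user:1" in b.local_cache)  # False False
```

Redis Pub/Sub delivers only to subscribers connected at publish time, so an instance that was briefly disconnected can miss an invalidation. Keeping a TTL on local entries as well bounds how long such a stale copy can live.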
TTL Strategies
# Short TTL for frequently changing data
SETEX stock:sku:42 60 "150"
# Long TTL for static reference data
SETEX config:feature_flags 86400 "{...}"
# Jitter — avoid synchronized expiry
import random
ttl = 3600 + random.randint(0, 300)
r.setex(key, ttl, value)
| Data Type | Typical TTL |
|---|---|
| Session | 30 min – 24 hr |
| API response | 1 – 5 min |
| User profile | 15 – 60 min |
| Static config | Hours to days |
| Stock/price | 10 – 60 sec |
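One way to keep these choices in a single place is a small TTL policy table with jitter applied on lookup. This is a sketch: the base values are assumptions picked from within the ranges in the table above, and `ttl_for` is a hypothetical helper.

```python
import random

# Base TTLs in seconds, chosen from within the table's ranges (assumed values).
BASE_TTL = {
    "session": 30 * 60,
    "api_response": 60,
    "user_profile": 15 * 60,
    "static_config": 24 * 3600,
    "stock_price": 10,
}


def ttl_for(data_type: str, jitter_pct: int = 10, rng=random) -> int:
    """Return the base TTL plus a random extra of up to jitter_pct percent."""
    base = BASE_TTL[data_type]
    return base + rng.randint(0, base * jitter_pct // 100)


print(ttl_for("user_profile"))  # a value from 900 to 990
```

A call such as `r.setex(key, ttl_for("user_profile"), value)` then spreads expiry times without each call site repeating the jitter arithmetic.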
Cache Stampede Prevention
When a hot key expires, thousands of requests may hit the database simultaneously.
Lock-Based Recomputation
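import time

def get_popular_article(article_id, retries=20):
    key = f"article:{article_id}"
    lock_key = f"{key}:lock"
    for _ in range(retries):
        cached = r.get(key)
        if cached is not None:
            return json.loads(cached)
        # SET NX EX: only one caller gets the lock, and it auto-expires after 10 s
        if r.set(lock_key, "1", nx=True, ex=10):
            try:
                data = fetch_from_db(article_id)
                r.setex(key, 300, json.dumps(data))
                return data
            finally:
                r.delete(lock_key)
        time.sleep(0.05)  # another caller is rebuilding; wait, then re-check the cache
    # Lock holder is slow or failed: read the database rather than wait forever
    return fetch_from_db(article_id)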
Probabilistic Early Expiration
Refresh cache slightly before TTL expires under load:
import random

def get_with_early_refresh(key, fetch_fn, base_ttl=300):
    cached = r.get(key)
    if cached is not None:
        ttl = r.ttl(key)
        # In the last 60 s of the TTL, about 10% of requests refresh the key early
        if 0 <= ttl < 60 and random.random() < 0.1:
            fresh = fetch_fn()
            r.setex(key, base_ttl, json.dumps(fresh))
            return fresh
        return json.loads(cached)
    data = fetch_fn()
    r.setex(key, base_ttl, json.dumps(data))
    return data
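The fixed 10% chance used above is simple, but the refresh probability does not rise as expiry approaches. A commonly described variant (often referred to as XFetch) scales the probability with how long a recomputation takes and how little TTL remains. The sketch below shows only the decision rule; the function name and parameters are hypothetical, and `recompute_time` is assumed to be measured by the caller.

```python
import math
import random


def should_refresh_early(remaining_ttl: float, recompute_time: float,
                         beta: float = 1.0, rng=random.random) -> bool:
    """Decide whether this request should refresh the key before it expires.

    remaining_ttl  -- seconds until the key expires (for example from r.ttl(key))
    recompute_time -- seconds the last recomputation took
    beta           -- values above 1.0 refresh earlier, below 1.0 later
    """
    u = 1.0 - rng()                                   # uniform in (0, 1], avoids log(0)
    early_by = -recompute_time * beta * math.log(u)   # random head start, >= 0
    return early_by >= remaining_ttl


# Empirical refresh rate at different points in the key's lifetime.
for remaining in (60, 5, 1, 0.2):
    refreshed = sum(
        should_refresh_early(remaining, recompute_time=0.5) for _ in range(10_000)
    )
    print(f"{remaining:>5}s left -> {refreshed / 10_000:.1%} of requests refresh")
```

With this rule, requests far from expiry almost never refresh, while the chance climbs steeply in the final moments, and slower recomputations start refreshing earlier.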
Negative Caching
Cache “not found” results to protect the database from repeated lookups for missing keys:
def get_user(user_id):
    key = f"user:{user_id}"
    cached = r.get(key)
    if cached == "__NOT_FOUND__":
        return None  # known-missing user: skip the database
    if cached is not None:
        return json.loads(cached)
    user = db.query(...)
    if user:
        r.setex(key, 3600, json.dumps(user))
    else:
        r.setex(key, 60, "__NOT_FOUND__")  # short TTL for misses
    return user
Key Design
Use namespaced, predictable keys:
{service}:{entity}:{id}:{attribute}
app:users:1001:profile
app:products:42:details
app:cache:homepage:v3
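Building keys through one helper keeps the pattern consistent and catches malformed segments early. A minimal sketch; `cache_key` is a hypothetical helper, and rejecting colons and whitespace inside segments is an assumption about what the application wants, not a Redis requirement.

```python
from typing import Optional

SEPARATOR = ":"


def cache_key(service: str, entity: str, entity_id,
              attribute: Optional[str] = None) -> str:
    """Build a '{service}:{entity}:{id}:{attribute}' key; attribute is optional."""
    parts = [service, entity, str(entity_id)]
    if attribute is not None:
        parts.append(attribute)
    for part in parts:
        # Empty segments or embedded separators would make keys ambiguous.
        if not part or SEPARATOR in part or any(c.isspace() for c in part):
            raise ValueError(f"invalid key segment: {part!r}")
    return SEPARATOR.join(parts)


print(cache_key("app", "users", 1001, "profile"))   # app:users:1001:profile
print(cache_key("app", "products", 42, "details"))  # app:products:42:details
```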
Best Practices
- Design cache keys with namespaces for safe bulk invalidation
- Add TTL jitter to prevent synchronized expiry
- Monitor hit ratio — below 80% suggests wrong keys or TTLs
- Cache aggregates carefully — invalidation complexity grows fast
- Document which pattern (cache-aside, write-through) each entity uses
Common Mistakes
| Mistake | Impact |
|---|---|
| No TTL on any keys | Memory exhaustion, stale data forever |
| Caching everything | Low hit ratio wastes memory |
| Same TTL for all keys | Stampede on synchronized expiry |
| Ignoring cache on write paths | Stale reads after updates |
| Caching errors/exceptions | Propagates failures to all users |
Troubleshooting
Hit ratio suddenly drops:
INFO stats
# Check deployment (cache flush?), TTL changes, or traffic pattern shift
MONITOR # dev only — watch key patterns
Database load unchanged after adding cache:
# Verify cache is actually hit — log misses in application
# Check if keys are unique per request (cache bypass)
Memory growing despite TTL:
INFO keyspace
# Keys without expiry? SCAN for TTL=-1 keys
redis-cli --scan --pattern 'user:*' | head | xargs -I{} redis-cli TTL {}
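The same check can be done from application code. This sketch assumes a redis-py style client whose `scan_iter()` walks the keyspace incrementally and whose `ttl()` returns -1 for a key with no expiry (the behavior of recent redis-py versions); `keys_without_ttl` and `FakeClient` are hypothetical names, and the fake exists only so the example runs without a server.

```python
import fnmatch


def keys_without_ttl(client, pattern="user:*", limit=100):
    """Return up to `limit` keys matching `pattern` that have no expiry set."""
    found = []
    for key in client.scan_iter(match=pattern, count=1000):
        if client.ttl(key) == -1:          # -1: key exists but has no TTL
            found.append(key)
            if len(found) >= limit:
                break
    return found


class FakeClient:
    """Tiny stand-in exposing the two methods used above."""

    def __init__(self, ttls):
        self._ttls = ttls                  # key -> TTL in seconds, -1 = no expiry

    def scan_iter(self, match="*", count=None):
        return (k for k in self._ttls if fnmatch.fnmatchcase(k, match))

    def ttl(self, key):
        return self._ttls.get(key, -2)     # -2: key does not exist


fake = FakeClient({"user:1": 3600, "user:2": -1, "session:9": -1})
print(keys_without_ttl(fake))  # ['user:2']
```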
Performance Tips
- Use pipelining for bulk cache warming after deploy
- Prefer hashes for object caches with field-level invalidation needs
- Set `maxmemory-policy allkeys-lru` for pure cache workloads
- Warm critical caches before traffic spikes (flash sales, launches)
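The pipelining tip can be sketched as a batch writer. It assumes a redis-py style client whose `pipeline()` returns an object that buffers commands until `execute()` is called; `warm_cache` and the fake classes are hypothetical and exist only to make the example self-contained.

```python
import json


def warm_cache(client, items, ttl=3600, batch_size=500):
    """Write (key, value) pairs in batches, one network round trip per batch."""
    written = 0
    pipe = client.pipeline(transaction=False)   # plain batching, no MULTI/EXEC
    for key, value in items:
        pipe.setex(key, ttl, json.dumps(value))
        written += 1
        if written % batch_size == 0:
            pipe.execute()
    pipe.execute()                               # flush the final partial batch
    return written


class FakePipeline:
    """Buffers writes and counts non-empty execute() calls as round trips."""

    def __init__(self, store, stats):
        self._store, self._stats, self._buffer = store, stats, []

    def setex(self, key, ttl, value):
        self._buffer.append((key, value))
        return self

    def execute(self):
        if self._buffer:
            self._stats["round_trips"] += 1
        self._store.update(self._buffer)
        self._buffer = []


class FakeClient:
    def __init__(self):
        self.store, self.stats = {}, {"round_trips": 0}

    def pipeline(self, transaction=True):
        return FakePipeline(self.store, self.stats)


fake = FakeClient()
n = warm_cache(fake, ((f"product:{i}", {"id": i}) for i in range(1200)), batch_size=500)
print(n, fake.stats["round_trips"])  # 1200 3
```

Batching in chunks rather than one giant pipeline keeps client memory bounded and avoids blocking the server on a single very large request.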
Production Scenario
An e-commerce site cached product detail pages with cache-aside and a 5-minute TTL plus 10% jitter. A flash sale on one SKU caused a stampede when the TTL expired, and database p99 latency spiked to 2 seconds. Adding lock-based recomputation and probabilistic early refresh reduced p99 to 45 ms. Negative caching for discontinued products eliminated 30K daily DB queries for invalid SKUs.
Effective caching is measured, not assumed — track hit ratio, design invalidation explicitly, and plan for stampedes before they happen in production.