Caching Strategies

Caching Fundamentals

Caching stores frequently accessed data in fast memory to reduce load on slower backing stores. Redis is a common choice for an application-level cache because of its low latency, per-key TTL support, and rich data structures.

Key metrics to track:

Metric	Target
Hit ratio	> 90% for read-heavy workloads
p99 cache latency	< 2 ms
Eviction rate	Stable, not spiking
Memory usage	< 80% of maxmemory

INFO stats
# Returns keyspace_hits and keyspace_misses, among other counters
# hit_ratio = keyspace_hits / (keyspace_hits + keyspace_misses)
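
As a rough sketch, the same ratio can be computed in Python from the counters that INFO stats reports. The helper name `hit_ratio` is an assumption made for illustration; with redis-py the input dictionary would typically come from `r.info("stats")`.

```python
def hit_ratio(stats):
    """Return hits / (hits + misses) from an INFO stats mapping.

    `stats` is assumed to be a dict such as the one redis-py returns
    from r.info("stats"); only the two keyspace counters are read.
    """
    hits = int(stats.get("keyspace_hits", 0))
    misses = int(stats.get("keyspace_misses", 0))
    total = hits + misses
    # A freshly started server has no lookups yet; report 0.0 instead of dividing by zero.
    return hits / total if total else 0.0


# Example with hard-coded counters (illustrative values, not real measurements).
print(hit_ratio({"keyspace_hits": 900, "keyspace_misses": 100}))
```

The counters are cumulative since the server started (or since the last CONFIG RESETSTAT), so sampling them periodically and comparing the differences gives a more current picture than a single reading.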

Cache-Aside (Lazy Loading)

The application checks Redis first; on miss, reads from the database and populates the cache.

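import json
import redis

r = redis.Redis(decode_responses=True)

def get_user(user_id):
    # `db` is the application's database handle, defined elsewhere.
    key = f"user:{user_id}"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)

    # Cache miss: read from the database, then populate the cache.
    user = db.query("SELECT * FROM users WHERE id = %s", user_id)
    if user:
        r.setex(key, 3600, json.dumps(user))  # expire after 1 hour
    return user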

Pros: Simple; caches only what is actually requested; the application can degrade to database reads when Redis is unavailable, provided it handles Redis errors.

Cons: Every miss pays an extra round trip, and data can be stale until the TTL expires.
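
Surviving a cache failure is not automatic: the read path has to catch cache errors and fall through to the database. The following is a minimal sketch of that idea, not a definitive implementation. The names `get_or_load`, `client`, and `load` are assumptions; `client` stands for a `redis.Redis`-like object created with `decode_responses=True`, and `load` for whatever function reads the record from the database.

```python
import json

try:
    from redis.exceptions import RedisError as CacheError
except ImportError:
    # Fallback so the sketch still runs where redis-py is not installed.
    CacheError = ConnectionError


def get_or_load(client, key, load, ttl=3600):
    """Cache-aside read that degrades to the loader when the cache is down.

    `client` is assumed to expose get/setex like redis.Redis;
    `load` is a hypothetical zero-argument callable that reads from the database.
    """
    try:
        cached = client.get(key)
        if cached is not None:
            return json.loads(cached)
    except CacheError:
        # Cache unavailable: skip it and serve straight from the database.
        return load()

    value = load()
    if value is not None:
        try:
            client.setex(key, ttl, json.dumps(value))
        except CacheError:
            pass  # Failing to populate the cache must not fail the request.
    return value
```

In this mode every request reaches the database, so the database still needs enough headroom to absorb the traffic the cache normally serves.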

Read-Through

The cache layer itself loads data on a miss (often via a library or sidecar). The application always talks to the cache.

  App → Cache → (miss) → Database → populate cache → return

Less common in custom apps; more typical in dedicated cache proxies.
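
To make the difference from cache-aside concrete, here is a small sketch of a read-through wrapper in application code. The class name `ReadThroughCache` and the `loader` callable are assumptions for illustration; `client` is assumed to behave like `redis.Redis` with `decode_responses=True`.

```python
import json


class ReadThroughCache:
    """Wraps a cache client and a loader so callers never query the database directly."""

    def __init__(self, client, loader, ttl=3600):
        self.client = client    # assumed redis.Redis-like: get / setex
        self.loader = loader    # hypothetical callable: key -> value or None
        self.ttl = ttl

    def get(self, key):
        cached = self.client.get(key)
        if cached is not None:
            return json.loads(cached)
        # Miss: the cache layer, not the caller, fetches and stores the value.
        value = self.loader(key)
        if value is not None:
            self.client.setex(key, self.ttl, json.dumps(value))
        return value
```

Callers only ever use `cache.get(key)`; how the value is loaded on a miss is configured once, when the wrapper is constructed.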

Write-Through

Updates go to the database and the cache in the same operation, so the cache reflects the latest written state as long as both writes succeed.

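def update_user(user_id, data):
    # Write the database first so a failed DB write never leaves the cache ahead of it.
    db.update("users", user_id, data)
    # `data` must be the full user record; caching a partial update would drop fields.
    r.setex(f"user:{user_id}", 3600, json.dumps(data))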

Pros: Reads see fresh data immediately after a write.

Cons: Every write also pays for the cache update, and keys that are never read still occupy memory.

Write-Behind (Write-Back)

Write to cache immediately; asynchronously flush to database.

def update_user_async(user_id, data):
    # The caller returns once the cache is updated.
    r.setex(f"user:{user_id}", 3600, json.dumps(data))
    # A background worker persists the change later.
    queue.enqueue("persist_user", user_id, data)

Pros: Fast writes.

Cons: Writes that have not yet been persisted are lost if the cache or queue fails first. Use this pattern only when that loss is acceptable.
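
The flush side of write-behind can be sketched as below. This is a simplified, in-process illustration: `WriteBehindBuffer`, `persist`, and the use of Python's standard `queue.Queue` are assumptions, and a production setup would more likely use a durable job queue so pending writes survive a process restart.

```python
import json
import queue


class WriteBehindBuffer:
    def __init__(self, client, persist, ttl=3600):
        self.client = client      # assumed redis.Redis-like: setex
        self.persist = persist    # hypothetical callable: (key, value) -> None, writes to the DB
        self.ttl = ttl
        self.pending = queue.Queue()

    def write(self, key, value):
        # The caller returns as soon as the cache is updated.
        self.client.setex(key, self.ttl, json.dumps(value))
        self.pending.put((key, value))

    def flush(self):
        """Drain queued writes to the database; returns how many were persisted."""
        count = 0
        while True:
            try:
                key, value = self.pending.get_nowait()
            except queue.Empty:
                return count
            try:
                self.persist(key, value)
            except Exception:
                # Put the write back so the next flush retries it, then stop.
                self.pending.put((key, value))
                raise
            count += 1
```

A worker thread or scheduled job would call `flush()` periodically. Anything still sitting in `pending` when the process dies is exactly the data-loss window described above.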

Cache Invalidation

“There are only two hard things in Computer Science: cache invalidation and naming things.” (attributed to Phil Karlton)

def delete_user(user_id):
    db.delete("users", user_id)
    # Drop the entity key plus every derived cache that includes this user.
    r.delete(
        f"user:{user_id}",
        "users:list",    # list cache
        "users:count",   # aggregate cache
    )

Invalidation Strategies

Strategy	When to Use
TTL only	Stale data acceptable (product catalog)
Delete on write	Fresh reads needed right after a write (user profile)
Versioned keys	Bulk invalidation without scanning or deleting keys
Pub/Sub broadcast	Multi-instance cache invalidation

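# Versioned keys: invalidate a whole namespace without deleting anything
def user_key(user_id):
    # Read the version on every call; a value read once at import time goes stale.
    # Default to "0" so the first INCR (which yields 1) actually changes the key.
    version = r.get("users:version") or "0"
    return f"user:{version}:{user_id}"

def invalidate_all_users():
    r.incr("users:version")   # old keys are never read again and expire via TTL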
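
The Pub/Sub broadcast row in the table above applies when each application instance keeps its own in-process cache in front of Redis. A minimal sketch follows, under stated assumptions: the channel name `cache:invalidate`, both function names, and the `local_cache` dict are hypothetical, while the message shape (a dict with `type` and `data`) follows what redis-py delivers to subscribers.

```python
CHANNEL = "cache:invalidate"   # hypothetical channel name


def broadcast_invalidation(client, key):
    """Delete the shared entry and tell every instance to drop its local copy."""
    client.delete(key)
    client.publish(CHANNEL, key)


def apply_invalidation(message, local_cache):
    """Handle one pub/sub message; returns True if it was an invalidation."""
    # redis-py delivers messages as dicts; subscribe confirmations have another "type".
    if message.get("type") != "message":
        return False
    key = message["data"]
    if isinstance(key, bytes):
        key = key.decode()
    local_cache.pop(key, None)
    return True
```

With redis-py, a subscriber would typically create `pubsub = r.pubsub()`, call `pubsub.subscribe(CHANNEL)`, and pass each item from `pubsub.listen()` to `apply_invalidation`. Pub/Sub delivery is fire-and-forget, so an instance that is disconnected misses the message; keeping a TTL on local entries bounds how long it can serve stale data.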

TTL Strategies

# Short TTL for frequently changing data (redis-cli)
SETEX stock:sku:42 60 "150"

# Long TTL for static reference data (redis-cli)
SETEX config:feature_flags 86400 "{...}"

# Jitter: avoid synchronized expiry (Python)
import random
ttl = 3600 + random.randint(0, 300)   # 60 min plus up to 5 min of jitter
r.setex(key, ttl, value)

Data Type	Typical TTL
Session	30 min – 24 hr
API response	1 – 5 min
User profile	15 – 60 min
Static config	Hours to days
Stock/price	10 – 60 sec
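
Because the typical TTLs above span seconds to days, a fixed jitter range does not suit every data type. One option is to scale the jitter with the base TTL, as in this small sketch. The helper name `jittered_ttl` and the 10% default are assumptions chosen for illustration.

```python
import random


def jittered_ttl(base_ttl, fraction=0.1, rng=random):
    """Return base_ttl plus a random extra of up to `fraction` of it, in whole seconds."""
    spread = int(base_ttl * fraction)
    return base_ttl + rng.randint(0, spread)


# A 5-minute base TTL with 10% jitter lands between 300 and 330 seconds.
print(jittered_ttl(300))
```

The result can be passed directly as the TTL argument of `r.setex(key, jittered_ttl(300), value)`.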

Cache Stampede Prevention

When a hot key expires, thousands of requests may hit the database simultaneously.

Lock-Based Recomputation

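import time

def get_popular_article(article_id, retries=20):
    key = f"article:{article_id}"
    lock_key = f"{key}:lock"
    for _ in range(retries):
        cached = r.get(key)
        if cached is not None:
            return json.loads(cached)

        # SET NX EX: only one caller acquires the lock; it auto-expires after 10 s.
        if r.set(lock_key, "1", nx=True, ex=10):
            try:
                data = fetch_from_db(article_id)
                r.setex(key, 300, json.dumps(data))
                return data
            finally:
                r.delete(lock_key)

        # Another caller is rebuilding the entry; wait briefly and re-check.
        time.sleep(0.05)

    # Give up waiting and read the database directly rather than loop forever.
    return fetch_from_db(article_id)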

Probabilistic Early Expiration

Refresh the entry shortly before its TTL expires, letting only a small random fraction of requests trigger the recomputation:

import random

def get_with_early_refresh(key, fetch_fn, base_ttl=300):
    cached = r.get(key)
    if cached is not None:
        ttl = r.ttl(key)
        # In the last 60 s before expiry, about 1 in 10 requests refreshes early.
        if ttl < 60 and random.random() < 0.1:
            fresh = fetch_fn()
            r.setex(key, base_ttl, json.dumps(fresh))
            return fresh
        return json.loads(cached)

    # Cache miss: load and store as usual.
    data = fetch_fn()
    r.setex(key, base_ttl, json.dumps(data))
    return data
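
The function above uses a fixed 60-second window and a fixed 10% chance. A commonly described alternative, often called XFetch, makes the refresh probability rise as expiry approaches and scales it by how long the value takes to recompute. The sketch below shows only the decision rule and is not the approach used in the code above; the function name and parameters are assumptions, and it presumes the application stores the expiry timestamp and the last recompute duration (`delta`) alongside the cached value.

```python
import math
import random


def should_refresh_early(now, expiry, delta, beta=1.0, rand=None):
    """XFetch-style check: True when this request should recompute the value.

    now, expiry: timestamps in seconds; delta: how long the last recompute took;
    beta > 1 refreshes earlier, beta < 1 later.
    """
    if rand is None:
        rand = 1.0 - random.random()   # uniform in (0, 1], so log() is defined
    # -log(rand) is >= 0 and occasionally large, so a few requests refresh early.
    return now - delta * beta * math.log(rand) >= expiry
```

Once `now` has passed `expiry` the check always returns True, so it also covers the ordinary expired case.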

Negative Caching

Cache “not found” results to protect the database from repeated lookups for missing keys:

NOT_FOUND = "__NOT_FOUND__"   # sentinel stored in place of a missing row

def get_user(user_id):
    key = f"user:{user_id}"
    cached = r.get(key)
    if cached == NOT_FOUND:
        return None
    if cached is not None:
        return json.loads(cached)

    user = db.query(...)
    if user:
        r.setex(key, 3600, json.dumps(user))
    else:
        r.setex(key, 60, NOT_FOUND)   # short TTL for misses
    return user

Key Design

Use namespaced, predictable keys:

{service}:{entity}:{id}:{attribute}
app:users:1001:profile
app:products:42:details
app:cache:homepage:v3
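
Building keys through one helper keeps the pattern consistent across a codebase. The following is a small sketch; the function name `make_key` and its validation rule are assumptions, not an established convention.

```python
def make_key(service, entity, entity_id, attribute=None):
    """Build '{service}:{entity}:{id}' with an optional ':{attribute}' suffix."""
    parts = [service, entity, str(entity_id)]
    if attribute is not None:
        parts.append(attribute)
    for part in parts:
        # A ':' inside a segment would make the key ambiguous to parse or pattern-match.
        if not part or ":" in part:
            raise ValueError(f"invalid key segment: {part!r}")
    return ":".join(parts)


print(make_key("app", "users", 1001, "profile"))   # prints app:users:1001:profile
```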

Best Practices

Design cache keys with namespaces for safe bulk invalidation
Add TTL jitter to prevent synchronized expiry
Monitor hit ratio — below 80% suggests wrong keys or TTLs
Cache aggregates carefully — invalidation complexity grows fast
Document which pattern (cache-aside, write-through) each entity uses

Common Mistakes

Mistake	Impact
No TTL on any keys	Memory exhaustion, stale data forever
Caching everything	Low hit ratio wastes memory
Same TTL for all keys	Stampede on synchronized expiry
Ignoring cache on write paths	Stale reads after updates
Caching errors/exceptions	Propagates failures to all users

Troubleshooting

Hit ratio suddenly drops:

INFO stats
# Check for a recent deployment (cache flush?), TTL changes, or a traffic pattern shift
MONITOR   # dev only: streams every command and slows the server; watch key patterns

Database load unchanged after adding cache:

# Verify the cache is actually being hit: log misses in the application
# Check whether keys are unique per request (which bypasses the cache)

Memory growing despite TTL:

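INFO keyspace
# Compare keys with expires per database; a large gap means many keys have no expiry
# Sample a few keys: TTL returns -1 for a key without expiry
redis-cli --scan --pattern 'user:*' | head | xargs -I{} redis-cli TTL {}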
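
The same check can be done from Python when more than a handful of keys need inspecting. This is a sketch under assumptions: `keys_without_ttl` is a hypothetical helper, and `client` is assumed to be a `redis.Redis`-like object providing `scan_iter` and `ttl`.

```python
def keys_without_ttl(client, pattern, limit=100):
    """Return up to `limit` keys matching `pattern` that have no expiry set."""
    found = []
    # scan_iter walks the keyspace incrementally instead of blocking like KEYS.
    for key in client.scan_iter(match=pattern, count=1000):
        if client.ttl(key) == -1:      # TTL returns -1 for a key with no expiry
            found.append(key)
            if len(found) >= limit:
                break
    return found
```

Each key costs one extra round trip for the TTL call, so on a large keyspace it is worth keeping `limit` small or running the scan against a replica.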

Performance Tips

Use pipelining for bulk cache warming after deploy
Prefer hashes for object caches with field-level invalidation needs
Set maxmemory-policy allkeys-lru for pure cache workloads
Warm critical caches before traffic spikes (flash sales, launches)
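
The pipelining tip for cache warming can be sketched as follows. The function name `warm_cache`, the batch size, and the `items` iterable of `(key, value)` pairs are assumptions; `client.pipeline(transaction=False)`, `setex`, and `execute` are the redis-py calls this sketch relies on.

```python
import json


def warm_cache(client, items, ttl=3600, batch_size=500):
    """Write (key, value) pairs with pipelined SETEX calls; returns the number written."""
    written = 0
    pipe = client.pipeline(transaction=False)   # batching only, no MULTI/EXEC
    for key, value in items:
        pipe.setex(key, ttl, json.dumps(value))
        written += 1
        if written % batch_size == 0:
            pipe.execute()                      # one round trip per batch
    pipe.execute()                              # flush the final partial batch
    return written
```

Sending commands in batches keeps each round trip bounded in size while still avoiding one network round trip per key.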

Production Scenario

An e-commerce site cached product detail pages using cache-aside with a 5-minute TTL plus 10% jitter. During a flash sale, the TTL on one hot SKU expired and the resulting stampede pushed database p99 latency to 2 seconds. Adding lock-based recomputation and probabilistic early refresh reduced p99 to 45 ms. Negative caching for discontinued products eliminated 30K daily DB queries for invalid SKUs.

Effective caching is measured, not assumed — track hit ratio, design invalidation explicitly, and plan for stampedes before they happen in production.
