Caching
Definition
A technique for storing frequently accessed data in fast-access memory to reduce latency and database load.
What is Caching?
Caching is a fundamental performance optimization technique that stores copies of frequently accessed data in fast-access storage layers. Instead of repeatedly querying a database or recomputing results, applications can serve data from cache, dramatically reducing latency and system load.
Why Caching Matters
In modern distributed systems, caching is essential for:
- Reducing Latency: Serve data from memory in microseconds instead of waiting milliseconds for a disk-backed database query
- Lowering Database Load: Prevent excessive queries that can overwhelm databases
- Improving Scalability: Handle more concurrent users without proportional infrastructure costs
- Enhancing Resilience: Serve stale data if the primary data source is temporarily unavailable
Common Caching Strategies
- Cache-Aside: The application checks the cache first and falls back to the database on a miss (see the sketch after this list)
- Write-Through: Writes go to the cache and the database at the same time
- Write-Behind: Writes go to the cache immediately and are persisted to the database asynchronously
- Read-Through: The cache itself loads data from the database on a miss
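To make the cache-aside pattern concrete, here is a minimal sketch in Python. It uses a plain in-process dict as the cache; fetch_user_from_db and the 60-second TTL are illustrative stand-ins, not a specific library API.

```python
import time

_cache = {}               # in-process cache: key -> (value, expiry timestamp)
CACHE_TTL_SECONDS = 60    # illustrative TTL; tune per workload


def fetch_user_from_db(user_id):
    """Stand-in for a real (slow) database query."""
    time.sleep(0.05)  # simulate query latency
    return {"id": user_id, "name": f"user-{user_id}"}


def get_user(user_id):
    """Cache-aside: check the cache first, fall back to the database on a miss."""
    entry = _cache.get(user_id)
    if entry is not None:
        value, expires_at = entry
        if time.time() < expires_at:
            return value          # cache hit
        del _cache[user_id]       # expired entry, treat as a miss

    value = fetch_user_from_db(user_id)  # cache miss: load from the source of truth
    _cache[user_id] = (value, time.time() + CACHE_TTL_SECONDS)
    return value
```

In production the dict would typically be replaced by a shared store such as Redis or Memcached so that every application instance sees the same entries.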
When to Use Caching
Cache when:
- Data is read frequently but updated infrequently
- Computation is expensive
- Database queries are slow
- You need to reduce backend load
Avoid caching when:
- Data changes constantly
- Cache invalidation is complex
- Memory is limited
- Data must always be fresh
Implementation Considerations
Cache Eviction Policies:
- LRU (Least Recently Used): evict the entry that has gone the longest without being accessed (see the sketch after this list)
- LFU (Least Frequently Used): evict the entry with the fewest accesses
- TTL (Time-To-Live): expire entries after a fixed duration
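To show an eviction policy in action, here is a small LRU cache built on Python's OrderedDict; the class name and the capacity of 2 used in the usage lines are arbitrary choices for the sketch.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny LRU cache: evicts the least recently used entry when capacity is exceeded."""

    def __init__(self, capacity=128):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)         # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict the least recently used entry


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")                # "a" becomes most recently used
cache.put("c", 3)             # over capacity: "b" is evicted
assert cache.get("b") is None
```

For caching function results, Python's built-in functools.lru_cache decorator applies the same policy without the hand-written class.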
Cache Layers:
- Browser cache
- CDN cache
- Application cache (Redis, Memcached), shown in the sketch after this list
- Database query cache
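As a sketch of the application-cache layer, the snippet below uses the redis-py client against a local Redis instance; the key naming scheme, the 300-second TTL, and load_product_from_db are illustrative assumptions.

```python
import json

import redis  # assumes the redis-py client is installed (pip install redis)

r = redis.Redis(host="localhost", port=6379, db=0)  # assumed local Redis instance


def load_product_from_db(product_id):
    """Stand-in for the real database query."""
    return {"id": product_id, "name": f"product-{product_id}"}


def get_product(product_id):
    """Application-layer cache: try Redis first, fall back to the database."""
    key = f"product:{product_id}"              # illustrative key naming scheme
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)              # cache hit

    product = load_product_from_db(product_id)  # cache miss
    r.setex(key, 300, json.dumps(product))      # cache for 5 minutes
    return product
```

The browser and CDN layers work the same way conceptually, but are usually controlled through HTTP headers such as Cache-Control rather than application code.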
Real-World Examples
- Amazon: Product catalog caching reduces database load by 90%
- Netflix: Content metadata cached at the CDN edge for instant loading
- Uber: Route calculations cached to handle millions of concurrent requests
Caching is not about adding complexity; it is about thoughtfully placing data closer to where it is needed.