Definition
Caching: Storing frequently accessed data in fast (expensive) memory for quick access.
Buffering: Temporary holding area for data in transit between devices at different speeds.
Related but distinct purposes: Caching for performance, buffering for speed matching.
Caching Concept
Purpose
Reduce disk I/O by keeping hot data in RAM:
First access to block X:
RAM Hit? No
Disk access? ~10ms
Second access to block X:
RAM Hit? Yes (if cached)
Instant access! (microseconds)
Cache hit = ~10,000x faster!
Cache Hit Rate
Success rate: Percentage of accesses finding data
Program accesses 1000 blocks
Cache hits: 950 accesses (fast)
Cache misses: 50 accesses (slow)
Hit rate: 95%
Average access time: 0.95 × (RAM) + 0.05 × (Disk)
= 0.95 × 0.1μs + 0.05 × 10ms
= 0.095μs + 500μs
= 500.1μs
Misses dominate: 500μs is still ~5000x slower than RAM (despite the 95% hit rate!)
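The averaging above generalizes to any hit rate; a minimal sketch (the function name is illustrative, and the 0.1μs RAM / 10ms disk figures are taken from the example):

```python
def effective_access_time(hit_rate, ram_us=0.1, disk_us=10_000):
    """Weighted average access time in microseconds."""
    return hit_rate * ram_us + (1 - hit_rate) * disk_us

print(round(effective_access_time(0.95), 3))   # ~500.1 µs: the 5% of misses dominate
print(round(effective_access_time(0.999), 3))  # ~10.1 µs: very high hit rates are what matter
```

Note how the average stays within an order of magnitude of the miss penalty until the hit rate gets extremely close to 100%.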
Cache Replacement Policies
LRU (Least Recently Used)
Remove data not used for longest time:
Cache size: 3 blocks
Current: [A, B, C]
Access B: [A, B(recent), C]
Access D: Remove C (least recent)
[A, B, D]
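The eviction above can be sketched with Python's `OrderedDict`; to match the example (C is the least recently used block), the cache is warmed in the order C, A, B:

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()  # ordered oldest -> most recent

    def access(self, block):
        if block in self.data:
            self.data.move_to_end(block)       # hit: mark as most recently used
        else:
            if len(self.data) >= self.capacity:
                self.data.popitem(last=False)  # miss + full: evict least recently used
            self.data[block] = True

cache = LRUCache(3)
for block in ["C", "A", "B", "B", "D"]:  # warm with C oldest, then access B, then D
    cache.access(block)
print(list(cache.data))  # ['A', 'B', 'D']: C was the LRU victim
```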
MRU (Most Recently Used)
Remove the most recently used page (sometimes better):
Repeatedly scanning a file larger than the cache:
Pages: 1, 2, 3, 4, 5, 1, 2, 3, ...
LRU evicts each page just before the scan loops back to it
Bad! (0% hits on every re-scan)
MRU evicts page 5 (just used, not needed again until the scan wraps around)
Keeps: 1, 2, 3, 4 (hits when the scan restarts!)
Example:
Database table scan reads blocks 1-1000 repeatedly (each pass touches every page once)
Repeated sequential read: better to discard the most recent page!
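The LRU-vs-MRU behavior on a repeated scan can be simulated; `simulate` is an illustrative sketch, not an OS API (cache of 4 pages, scanning 5 pages three times):

```python
def simulate(policy, cache_size, trace):
    """Count cache hits for an access trace under LRU or MRU eviction."""
    cache, order, hits = set(), [], 0   # order tracks recency, oldest first
    for page in trace:
        if page in cache:
            hits += 1
            order.remove(page)
            order.append(page)          # hit: page becomes most recent
        else:
            if len(cache) >= cache_size:
                victim = order.pop(0) if policy == "LRU" else order.pop()
                cache.remove(victim)    # LRU evicts oldest, MRU evicts newest
            cache.add(page)
            order.append(page)
    return hits

trace = list(range(1, 6)) * 3           # pages 1..5 scanned three times
print(simulate("LRU", 4, trace))        # 0 hits: LRU evicts each page just before reuse
print(simulate("MRU", 4, trace))        # 8 hits: MRU keeps older pages alive across re-scans
```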
Frequency-Based
LFU: Least Frequently Used
Page A: Accessed 100 times
Page B: Accessed 1 time
LFU keeps A (used more)
May mispredict: A's high count can be stale history while B becomes the hot page
Block Caching
Disk Block Cache
OS caches disk blocks in RAM:
File System reads:
Block 100 from disk → Cache in RAM (page frame)
Block 200 from disk → Cache in RAM
Block 100 again → Hit in cache! (no disk access)
Cache Size
Trade-off:
Large cache:
✓ More hits
✓ Fewer disk accesses
❌ Expensive (RAM)
❌ Slower replacement decisions
Small cache:
✓ Cheap
✓ Fast replacement
❌ More misses
❌ More disk access
Typical: 10-30% of RAM used for disk cache
Double Buffering
Disk reads into Buffer 1
Process reads from Buffer 2 (previous data)
Disk reads into Buffer 2
Process reads from Buffer 1
Decouples I/O from processing
Better throughput
Without:
Disk read 100ms
Process 50ms
Disk read 100ms
Process 50ms
...
Time per block = 100ms + 50ms = 150ms (read and process are serialized)
With double buffering:
Disk reads and processing overlap
Time per block = max(100ms, 50ms) = 100ms → throughput limited only by the slower stage
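The overlap can be sketched with a two-slot queue standing in for the two buffers; the sleep times are scaled-down stand-ins for the disk and CPU costs above:

```python
import queue
import threading
import time

def double_buffered_read(n_blocks, disk_time=0.01, process_time=0.005):
    """Overlap simulated disk reads with processing via a 2-slot buffer queue."""
    full = queue.Queue(maxsize=2)  # two buffers: reader fills one while we drain the other

    def reader():
        for i in range(n_blocks):
            time.sleep(disk_time)          # simulate the disk filling a buffer
            full.put(f"block-{i}")
        full.put(None)                     # end-of-stream marker

    threading.Thread(target=reader, daemon=True).start()
    processed = 0
    while full.get() is not None:
        time.sleep(process_time)           # process this buffer while the next one fills
        processed += 1
    return processed

n = double_buffered_read(20)
print(n)  # 20 blocks; wall time ≈ 20 × max(disk, process), not 20 × (disk + process)
```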
Write Caching
Write-Through
Write immediately to disk:
Process writes to block:
1. Update the cache
2. Write to disk immediately
3. Acknowledge success only after the disk write completes
Characteristics:
✓ Safe (data on disk)
❌ Slow (wait for disk)
❌ High latency writes
Write-Back
Write to cache, defer disk write:
Process writes to block:
1. Write to cache
2. Acknowledge success (returns immediately!)
3. Later, write to disk (asynchronous)
Characteristics:
✓ Fast (no disk wait)
❌ Risk (power loss loses data)
❌ Cache coherence issues
Strategy:
Write dirty pages when:
- Time elapsed (periodic)
- Cache full (make room)
- Explicitly requested (fsync)
Write Coalescing
Combine multiple writes:
Process writes to block 100 (mark dirty)
Process writes to block 100 again
Process writes to block 100 again
Write-through: 3 disk writes
Write-back: 1 disk write (final value)
Huge reduction!
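The coalescing above can be sketched with a dirty set; `WriteBackCache` and its in-memory "disk" dict are illustrative, not a real kernel interface:

```python
class WriteBackCache:
    """Minimal write-back sketch: writes dirty the cache; flush coalesces them."""
    def __init__(self):
        self.cache = {}        # block -> latest value
        self.dirty = set()     # blocks not yet written to "disk"
        self.disk = {}
        self.disk_writes = 0

    def write(self, block, value):
        self.cache[block] = value   # acknowledged immediately, no disk wait
        self.dirty.add(block)

    def flush(self):                # periodic / fsync-triggered writeback
        for block in self.dirty:
            self.disk[block] = self.cache[block]
            self.disk_writes += 1
        self.dirty.clear()

wb = WriteBackCache()
for v in ("v1", "v2", "v3"):
    wb.write(100, v)                 # three writes to block 100, as in the example
wb.flush()
print(wb.disk_writes, wb.disk[100])  # 1 v3: coalesced to one disk write of the final value
```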
Buffering Concept
Purpose
Smooth speed mismatches:
Input: 1KB/s (slow keyboard)
Process: 100KB/s capable
Output: 1MB/s disk
Buffer keyboard input:
Accumulate keystrokes in memory
Flush to disk in block-sized chunks (e.g., 4KB)
Disk works at optimal speed!
Without buffer:
One tiny disk write per keystroke
Wasteful! (each write pays the full device overhead)
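The batching idea can be sketched as an output buffer that flushes in fixed-size chunks; the 4KB flush size and class name are assumptions for illustration:

```python
class OutputBuffer:
    """Accumulate small writes; flush to the 'device' in large chunks."""
    def __init__(self, flush_size=4096):
        self.buf = bytearray()
        self.flush_size = flush_size
        self.device_writes = 0        # count of (expensive) device operations

    def write(self, data: bytes):
        self.buf += data
        while len(self.buf) >= self.flush_size:
            self._flush(self.flush_size)

    def _flush(self, n):
        self.device_writes += 1       # one large, efficient device write
        del self.buf[:n]

buf = OutputBuffer(flush_size=4096)
for _ in range(10_000):
    buf.write(b"k")                   # 10,000 one-byte "keystrokes"
print(buf.device_writes)              # 2 large device writes instead of 10,000 tiny ones
```

The remaining partial chunk stays buffered until the next flush (or an explicit close/sync).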
Buffer Size
Trade-off:
Large buffer:
✓ Accommodates bursts
✓ Smooth throughput
❌ Latency (wait for buffer fill)
❌ Memory usage
Small buffer:
✓ Low latency
✓ Saves memory
❌ Can overflow if producer faster
Examples
Example 1: Disk Read Caching
File system reads 1000 blocks sequentially:
Blocks: 100, 101, 102, ..., 1099
Without cache:
1000 disk accesses (10 seconds @ 10ms per access)
With read-ahead cache:
OS predicts: "After block 100, user wants 101"
Prefetch block 101 while reading 100
When read completes: 101 already in cache!
Cache size: 32 blocks
Hits: 31 of every 32 blocks (~97%)
Disk accesses: ~32 (instead of 1000!)
Time: ~0.32 seconds (~30x faster!)
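The saving can be estimated by counting disk operations, assuming each miss fetches a whole 32-block run in a single operation (an idealization of read-ahead):

```python
def sequential_read(blocks, readahead=32):
    """Count disk operations when each miss prefetches a run of `readahead` blocks."""
    cache, disk_ops = set(), 0
    for b in blocks:
        if b not in cache:
            disk_ops += 1                        # one disk op fetches the whole run
            cache.update(range(b, b + readahead))
    return disk_ops

blocks = list(range(100, 1100))                  # the 1000-block sequential scan above
print(sequential_read(blocks))                   # 32 disk operations instead of 1000
```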
Example 2: Database Buffer Pool
Database engine memory:
50% for caching frequently accessed pages
50% for working memory
Query: "SELECT * FROM users WHERE age > 30"
First run:
Scans table from disk (100 page faults)
Pages cached (most users in memory)
Second run:
Same query
95% of pages hit in cache
Only 5 disk reads
10x faster!
Optimization:
Identify hot pages (frequently accessed)
Keep in buffer indefinitely
Swap cold pages to disk
Example 3: Write-Back Optimization
Database transaction:
500 writes to different blocks
Write-through:
500 disk writes (5 seconds)
Transaction slow
Write-back:
500 writes to cache (instant!)
Background: Coalesce into 50 disk I/Os (0.5 seconds)
Transaction appears instant (responds to client)
Disk I/O batched for efficiency
Page Replacement in Cache
When cache full, evict to make room:
Cache (3 blocks): [A, B, C]
New request: Block D
Replacement:
Find victim (using LRU/LFU/etc)
If dirty: Write to disk first
Remove from cache
Load Block D
Must handle dirty pages (write-back)
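The eviction steps above can be sketched as follows (block names, values, and the helper function are illustrative):

```python
def evict_and_load(cache, order, dirty, disk, new_block, capacity=3):
    """Pick the LRU victim, write it back if dirty, then load the new block."""
    if len(cache) >= capacity:
        victim = order.pop(0)                  # least recently used block
        if victim in dirty:
            disk[victim] = cache[victim]       # dirty: write back to disk first
            dirty.discard(victim)
        del cache[victim]                      # remove victim from the cache
    cache[new_block] = "data-" + new_block     # load the requested block
    order.append(new_block)

cache = {"A": "data-A", "B": "data-B*", "C": "data-C"}  # B was modified in cache
order, dirty, disk = ["B", "A", "C"], {"B"}, {}          # B is least recently used
evict_and_load(cache, order, dirty, disk, "D")
print(sorted(cache), disk)  # ['A', 'C', 'D'] {'B': 'data-B*'}: B flushed before eviction
```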
Page Cache in Modern Systems
Linux Page Cache
Unified cache for:
- File I/O
- Memory-mapped files
- Buffer I/O
Automatically managed:
LRU approximation (clock algorithm)
Transparent to applications
Windows File Cache
Similar to Linux:
Automatic management
Write-back by default (dirty pages flushed periodically)
Configurable
Buffering vs Caching
| Feature | Buffering | Caching |
|---|---|---|
| Purpose | Speed matching | Performance |
| Lifetime | Transient (until transmitted) | Persistent (until evicted) |
| Data | In-flight data | Previously accessed data |
| Example | Keyboard→Process queue | Disk blocks in RAM |
| Overhead | Low | Can be high |
Cache Coherence
Consistency Problem
Multiple copies of data:
Block X in:
- Disk (original)
- Disk cache (RAM)
- Application buffer
- Multiple process caches
Process A modifies X in its private cached copy
Process B reads X from its own copy (or straight from disk)
Sees old data!
Inconsistency!
Solutions
1. Write-Through: always write to disk immediately → other processes reading from disk see fresh data
2. Cache Invalidation: when a write occurs, invalidate all other copies → force a re-read from disk
3. Shared Cache: a single cache for all processes → natural coherence
Performance Optimization
Cache Locality
Organize data to improve hits:
Access pattern:
for (int i = 0; i < n; i++) {
    process(array[i]);               // Sequential = good locality
}
vs
for (int j = 0; j < n; j++) {
    process(array[random_order[j]]); // random_order = shuffled indices: poor locality
}
Prefetching
Load predicted pages before needed:
Reading file A, B, C sequentially
Predict C needed next
Prefetch C while B processing
C ready when B finishes
No wait!
Summary
Caching stores frequently accessed data in fast memory (RAM) for performance. Hit rates are critical: even at 95% hits, average access time is dominated by the misses. Write-back is faster but riskier than write-through. Buffering smooths speed mismatches between devices. Cache replacement policies (LRU, MRU, LFU) determine eviction. Modern systems use page caches and buffers transparently. Understanding caching and buffering is essential for performance optimization. Trade-offs always exist: size vs latency, safety vs speed.