Caching and Buffering

Definitions

Caching: Storing frequently accessed data in fast (expensive) memory for quick access.

Buffering: Temporary holding area for data in transit between devices at different speeds.

Related but distinct purposes: caching improves performance by reusing data; buffering matches the speeds of a producer and a consumer.

Caching Concept

Purpose

Reduce disk I/O by keeping hot data in RAM:

First access to block X:
RAM hit? No
Disk access: ~10ms

Second access to block X:
RAM hit? Yes (if cached)
Near-instant access (~0.1μs)

Cache hit = ~100,000x faster (0.1μs vs 10ms)!

Cache Hit Rate

Hit rate: percentage of accesses served from the cache

Program accesses 1000 blocks
Cache hits: 950 accesses (fast)
Cache misses: 50 accesses (slow)

Hit rate: 95%
Average access time: 0.95 × (RAM) + 0.05 × (Disk)
                   = 0.95 × 0.1μs + 0.05 × 10ms
                   = 0.095μs + 500μs
                   = 500.1μs

Performance ≈ disk speed (despite 95% hit rate!)
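The arithmetic above generalizes to any hit rate. A quick helper (a sketch using the latency numbers from this section: 0.1μs RAM, 10ms disk) makes the miss dominance easy to explore:

```python
def avg_access_time_us(hit_rate, ram_us=0.1, disk_us=10_000):
    """Expected access time in microseconds: hits from RAM, misses from disk."""
    return hit_rate * ram_us + (1 - hit_rate) * disk_us

# Even at 95% hits, the 5% of misses contribute 500us of the total:
print(avg_access_time_us(0.95))   # ~500.1 us
print(avg_access_time_us(0.999))  # ~10.1 us -- hit rate must be very high
```

Note how pushing the hit rate from 95% to 99.9% improves average latency by ~50x: the miss term, not the hit term, controls performance.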

Cache Replacement Policies

LRU (Least Recently Used)

Remove data not used for longest time:

Cache size: 3 blocks
Current: [A, B, C]  (A oldest, C newest)
Access B: recency order is now A, C, B
Access D: Remove A (least recently used)
          [C, B, D]
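The eviction sequence above can be sketched with Python's OrderedDict, which keeps entries in insertion order (a minimal illustration, not how a kernel actually implements its cache):

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU block cache: evicts the least recently used entry."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()   # order: least recent ... most recent

    def access(self, key):
        """Touch a block; returns True on hit, False on miss."""
        if key in self.data:
            self.data.move_to_end(key)        # mark as most recently used
            return True
        if len(self.data) >= self.capacity:
            self.data.popitem(last=False)     # evict least recently used
        self.data[key] = True                 # dummy payload for the sketch
        return False

cache = LRUCache(3)
for block in ["A", "B", "C"]:
    cache.access(block)
cache.access("B")          # hit: B becomes most recent
cache.access("D")          # miss: A is least recent -> evicted
print(list(cache.data))    # ['C', 'B', 'D']
```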

MRU (Most Recently Used)

Remove most recently used (sometimes better):

Sequential scan through a file:
Pages: 1, 2, 3, 4, 5, ...

LRU keeps the newest pages and evicts the oldest,
but a repeated scan re-reads the oldest pages first. Bad!

MRU removes page 5 (just used, unlikely to be needed again soon)
Keeps 1-4, which the next scan pass reuses!

Example:
Scan blocks 1-1000, reading each page once per pass
For repeated sequential scans, discarding the most recent page wins
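A small simulation (illustrative trace and cache size, not from any real system) shows the effect: scanning ten blocks twice through a 4-entry cache, LRU gets zero hits on the second pass, while MRU retains the front of the file:

```python
def simulate(trace, capacity, policy):
    """Count cache hits for a reference trace under LRU or MRU eviction."""
    cache, hits = [], 0              # list order: oldest ... newest
    for block in trace:
        if block in cache:
            hits += 1
            cache.remove(block)
            cache.append(block)      # refresh recency on a hit
        else:
            if len(cache) >= capacity:
                # LRU evicts the oldest entry, MRU the newest
                cache.pop(0 if policy == "LRU" else -1)
            cache.append(block)
    return hits

scan_twice = list(range(1, 11)) * 2      # read blocks 1-10, then again
print(simulate(scan_twice, 4, "LRU"))    # 0 -- every block evicted before reuse
print(simulate(scan_twice, 4, "MRU"))    # 4 -- blocks 1-3 (and 10) survive
```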

Frequency-Based

LFU: Least Frequently Used

Page A: Accessed 100 times
Page B: Accessed 1 time

LFU keeps A (used more often)
Can be wrong: if A is no longer needed while B is about to be
re-accessed, the frequency history misleads the policy
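The LFU victim choice reduces to "evict the key with the smallest access count"; a one-line sketch (illustrative counts from the example above):

```python
from collections import Counter

def lfu_victim(counts):
    """Pick the eviction victim: the key with the fewest recorded accesses."""
    return min(counts, key=counts.get)

counts = Counter({"A": 100, "B": 1})
print(lfu_victim(counts))   # 'B' -- evicted even if it is about to be reused
```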

Block Caching

Disk Block Cache

OS caches disk blocks in RAM:

File System reads:
Block 100 from disk → Cache in RAM (page frame)
Block 200 from disk → Cache in RAM
Block 100 again → Hit in cache! (no disk access)

Cache Size

Trade-off:

Large cache:
✓ More hits
✓ Fewer disk accesses
❌ Expensive (RAM)
❌ Slower replacement decisions

Small cache:
✓ Cheap
✓ Fast replacement
❌ More misses
❌ More disk access

Typical: anywhere from ~10% of RAM to nearly all free memory
(Linux grows its page cache to fill otherwise-idle RAM)

Double Buffering

Disk reads into Buffer 1
Process reads from Buffer 2 (previous data)
Disk reads into Buffer 2
Process reads from Buffer 1

Decouples I/O from processing
Better throughput

Without:
Disk read 100ms
Process 50ms
Disk read 100ms
Process 50ms
...
Disk limited throughput

With double buffering:
Disk and process run in parallel
Per-chunk time = max(disk, process) = 100ms, not 150ms
Throughput limited only by the slower stage
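The timing difference can be computed directly (a sketch with the 100ms/50ms figures from above; real pipelines add startup and drain effects, modeled here as the first read and last process step):

```python
def total_time(n_chunks, disk_ms, process_ms, double_buffered):
    """Time to read and process n chunks, with or without I/O overlap."""
    if not double_buffered:
        return n_chunks * (disk_ms + process_ms)   # strictly serial
    # Overlapped: after the first read, disk and CPU run in parallel,
    # so each subsequent chunk costs only the slower of the two stages.
    return disk_ms + (n_chunks - 1) * max(disk_ms, process_ms) + process_ms

print(total_time(10, 100, 50, False))  # 1500 ms serial
print(total_time(10, 100, 50, True))   # 1050 ms overlapped
```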

Write Caching

Write-Through

Write immediately to disk:

Process writes to block:
1. Write to disk immediately
2. Update the cached copy
3. Acknowledge success

Characteristics:
✓ Safe (data on disk)
❌ Slow (wait for disk)
❌ High latency writes

Write-Back

Write to cache, defer disk write:

Process writes to block:
1. Write to cache
2. Acknowledge success (returns immediately!)
3. Later, write to disk (asynchronous)

Characteristics:
✓ Fast (no disk wait)
❌ Risk (power loss loses data)
❌ Cache coherence issues

Strategy:
Write dirty pages when:
- Time elapsed (periodic)
- Cache full (make room)
- Explicitly requested (fsync)

Write Coalescing

Combine multiple writes:

Process writes to block 100 (mark dirty)
Process writes to block 100 again
Process writes to block 100 again

Write-through: 3 disk writes
Write-back: 1 disk write (final value)

Huge reduction!
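Both ideas, deferred writes and coalescing, fall out of tracking a dirty set. A toy write-back cache (class and field names are hypothetical, for illustration only):

```python
class WriteBackCache:
    """Write-back sketch: writes land in RAM; dirty blocks are flushed later."""
    def __init__(self):
        self.cache = {}
        self.dirty = set()
        self.disk_writes = 0

    def write(self, block, value):
        self.cache[block] = value     # fast path: no disk I/O here
        self.dirty.add(block)         # re-dirtying the same block is a no-op

    def flush(self):
        """Called on fsync, a timer, or eviction pressure."""
        self.disk_writes += len(self.dirty)   # one I/O per dirty block
        self.dirty.clear()

wb = WriteBackCache()
for v in (1, 2, 3):
    wb.write(100, v)      # three writes to the same block
wb.flush()
print(wb.disk_writes)     # 1 -- write-through would have issued 3
```

Because `dirty` is a set, repeated writes to block 100 coalesce automatically: only the final value reaches disk.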

Buffering Concept

Purpose

Smooth speed mismatches:

Input: 1KB/s (slow keyboard)
Process: 100KB/s capable
Output: 1MB/s disk

Buffer keyboard input:
Accumulate keystrokes in memory
When a block fills (e.g. 4KB): write to disk
Disk works at its optimal transfer size!

Without buffer:
Disk waits for keyboard
Underutilized!
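The accumulate-then-flush pattern can be sketched as follows (a toy model; the 4KB block size and class name are illustrative assumptions, not a real API):

```python
class OutputBuffer:
    """Accumulate small writes; flush to 'disk' in block-sized chunks."""
    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.pending = bytearray()
        self.flushes = 0

    def write(self, data: bytes):
        self.pending += data
        while len(self.pending) >= self.block_size:
            self.pending = self.pending[self.block_size:]
            self.flushes += 1        # one large disk write per full block

buf = OutputBuffer(block_size=4096)
for _ in range(10_000):
    buf.write(b"k")      # 10,000 one-byte writes (keystrokes)
print(buf.flushes)       # 2 full-block flushes instead of 10,000 tiny writes
```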

Buffer Size

Trade-off:

Large buffer:
✓ Accommodates bursts
✓ Smooth throughput
❌ Latency (wait for buffer fill)
❌ Memory usage

Small buffer:
✓ Low latency
✓ Saves memory
❌ Can overflow if producer faster

Examples

Example 1: Disk Read Caching

File system reads 1000 blocks sequentially:
Blocks: 100, 101, 102, ..., 1099

Without cache:
1000 disk accesses (10 seconds @ 10ms per access)

With read-ahead cache:
OS predicts: "after block 100, block 101 will be wanted"
Prefetch upcoming blocks while block 100 is being processed

Read-ahead window: 32 blocks
Each prefetch fetches 32 blocks in one large disk I/O
31 of every 32 block requests hit the cache (97%)
Disk I/Os: ~32 (instead of 1000!)
Time: ~0.3 seconds (~30x faster)
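The speedup is just the reduction in I/O count. A sketch of the arithmetic (assuming, as above, ~10ms per disk I/O regardless of transfer size, which holds when seek time dominates):

```python
def scan_time_ms(n_blocks, per_io_ms=10, batch=1):
    """Sequential scan cost when blocks are fetched `batch` at a time."""
    ios = -(-n_blocks // batch)      # ceiling division: number of disk I/Os
    return ios * per_io_ms

print(scan_time_ms(1000))             # 10000 ms: one I/O per block
print(scan_time_ms(1000, batch=32))   # 320 ms: 32 read-ahead I/Os
```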

Example 2: Database Buffer Pool

Database engine memory:
50% for caching frequently accessed pages
50% for working memory

Query: "SELECT * FROM users WHERE age > 30"

First run:
Scans table from disk (100 page faults)
Pages cached (most users in memory)

Second run:
Same query
95% of pages hit in cache
Only 5 disk faults
10x faster!

Optimization:
Identify hot pages (frequently accessed)
Keep in buffer indefinitely
Swap cold pages to disk

Example 3: Write-Back Optimization

Database transaction:
500 writes to different blocks

Write-through:
500 disk writes (5 seconds)
Transaction slow

Write-back:
500 writes to cache (instant!)
Background: Coalesce into 50 disk I/Os (0.5 seconds)
Transaction appears instant (responds to client)
Disk I/O batched for efficiency

Page Replacement in Cache

When cache full, evict to make room:

Cache (3 blocks): [A, B, C]

New request: Block D

Replacement:
Find victim (using LRU/LFU/etc)
If dirty: Write to disk first
Remove from cache
Load Block D

Must handle dirty pages (write-back)
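The dirty-page rule in the eviction sequence above can be sketched as (variable names are illustrative; a real kernel works with page frames, not dicts):

```python
def evict(frames, dirty, victim, disk):
    """Evict `victim` from the cache: write it back first if dirty."""
    if victim in dirty:
        disk[victim] = frames[victim]   # persist dirty data before discarding
        dirty.discard(victim)
    del frames[victim]                  # frame is now free for the new block

frames = {"A": 1, "B": 2, "C": 3}
dirty = {"B"}
disk = {}
evict(frames, dirty, "B", disk)
print(disk)     # {'B': 2} -- dirty data persisted before eviction
```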

Page Cache in Modern Systems

Linux Page Cache

Unified cache for:
- File I/O
- Memory-mapped files
- Buffer I/O

Automatically managed:
LRU approximation (clock algorithm)
Transparent to applications

Windows File Cache

Similar to Linux:
Automatic management
Write-back by default (dirty pages flushed periodically)
Configurable

Buffering vs Caching

Feature     Buffering                   Caching
Purpose     Speed matching              Performance
Duration    Transient                   Persistent (until evicted)
Lifetime    Until data is transmitted   While memory permits
Data        In-flight data              Previously accessed data
Example     Keyboard → process queue    Disk blocks in RAM
Overhead    Low                         Can be high

Cache Coherence

Consistency Problem

Multiple copies of data:

Block X in:
- Disk (original)
- Disk cache (RAM)
- Application buffer
- Multiple process caches

Process A modifies X in cache
Process B reads X from disk cache
Sees old data!
Inconsistency!

Solutions

1. Write-Through: Always write to disk immediately.
   Other processes see fresh data.

2. Cache Invalidation: When a write occurs, invalidate other copies.
   Forces a re-read from disk.

3. Shared Cache: A single cache serves all processes.
   Natural coherence.
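Solutions 1 and 2 combine naturally: write through for durability, invalidate for coherence. A toy two-cache model (all class and method names are hypothetical; hardware coherence protocols are far more involved):

```python
class CoherentCaches:
    """Invalidation sketch: a write in one cache invalidates the others."""
    def __init__(self, n):
        self.caches = [dict() for _ in range(n)]
        self.disk = {}

    def read(self, i, block):
        if block not in self.caches[i]:
            self.caches[i][block] = self.disk.get(block)  # miss: reload
        return self.caches[i][block]

    def write(self, i, block, value):
        self.disk[block] = value          # write-through for simplicity
        self.caches[i][block] = value
        for j, c in enumerate(self.caches):
            if j != i:
                c.pop(block, None)        # invalidate stale copies

demo = CoherentCaches(2)
demo.disk["X"] = "old"
demo.read(1, "X")            # cache 1 now holds "old"
demo.write(0, "X", "new")    # invalidates cache 1's copy
print(demo.read(1, "X"))     # 'new' -- forced re-read sees fresh data
```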

Performance Optimization

Cache Locality

Organize data to improve hits:

Access pattern:
// Sequential access = good locality
for (int i = 0; i < n; i++) {
    process(array[i]);
}

vs

// Random access = poor locality
for (int k = 0; k < n; k++) {
    process(array[perm[k]]);   // perm: a random permutation of 0..n-1
}

Prefetching

Load predicted pages before needed:

Reading file A, B, C sequentially
Predict C needed next
Prefetch C while B processing
C ready when B finishes
No wait!

Summary

Caching stores frequently accessed data in fast memory (RAM) for performance. Hit rates are critical: miss cost dominates, so even 95% hits leaves average latency thousands of times slower than pure RAM. Write-back is faster but riskier than write-through. Buffering smooths speed mismatches between devices. Cache replacement policies (LRU, MRU, LFU) determine eviction. Modern systems use page caches and buffers transparently. Understanding caching and buffering is essential for performance optimization. Trade-offs always exist: size vs latency, safety vs speed.