Caching and Buffering

Definitions

Caching: Storing frequently accessed data in fast (expensive) memory for quick access.

Buffering: Temporary holding area for data in transit between devices at different speeds.

Related but distinct purposes: caching improves performance by reusing data; buffering matches the speeds of a producer and a consumer.

Caching Concept

Purpose

Reduce disk I/O by keeping hot data in RAM:

First access to block X:
RAM hit? No
Disk access: ~10ms

Second access to block X:
RAM hit? Yes (if cached)
Near-instant access (~0.1μs)

Cache hit = ~100,000x faster (0.1μs vs 10ms)!

Cache Hit Rate

Hit rate: percentage of accesses served from the cache

Program accesses 1000 blocks
Cache hits: 950 accesses (fast)
Cache misses: 50 accesses (slow)

Hit rate: 95%
Average access time: 0.95 × (RAM) + 0.05 × (Disk)
                   = 0.95 × 0.1μs + 0.05 × 10ms
                   = 0.095μs + 500μs
                   = 500.1μs

Performance ≈ disk speed (despite 95% hit rate!)
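The arithmetic above generalizes to any hit rate. A quick helper (a sketch using the latency numbers from this section: 0.1μs RAM, 10ms disk) makes the miss dominance easy to explore:

```python
def avg_access_time_us(hit_rate, ram_us=0.1, disk_us=10_000):
    """Expected access time in microseconds: hits from RAM, misses from disk."""
    return hit_rate * ram_us + (1 - hit_rate) * disk_us

# Even at 95% hits, the 5% of misses contribute 500us of the total:
print(avg_access_time_us(0.95))   # ~500.1 us
print(avg_access_time_us(0.999))  # ~10.1 us -- hit rate must be very high
```

Note how pushing the hit rate from 95% to 99.9% improves average latency by ~50x: the miss term, not the hit term, controls performance.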

Cache Replacement Policies

LRU (Least Recently Used)

Remove data not used for longest time:

Cache size: 3 blocks
Current: [A, B, C]  (A oldest, C newest)
Access B: recency order is now A, C, B
Access D: Remove A (least recently used)
          [C, B, D]
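The eviction sequence above can be sketched with Python's OrderedDict, which keeps entries in insertion order (a minimal illustration, not how a kernel actually implements its cache):

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU block cache: evicts the least recently used entry."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()   # order: least recent ... most recent

    def access(self, key):
        """Touch a block; returns True on hit, False on miss."""
        if key in self.data:
            self.data.move_to_end(key)        # mark as most recently used
            return True
        if len(self.data) >= self.capacity:
            self.data.popitem(last=False)     # evict least recently used
        self.data[key] = True                 # dummy payload for the sketch
        return False

cache = LRUCache(3)
for block in ["A", "B", "C"]:
    cache.access(block)
cache.access("B")          # hit: B becomes most recent
cache.access("D")          # miss: A is least recent -> evicted
print(list(cache.data))    # ['C', 'B', 'D']
```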

MRU (Most Recently Used)

Remove most recently used (sometimes better):

Sequential scan through a file:
Pages: 1, 2, 3, 4, 5, ...

LRU keeps the newest pages and evicts the oldest,
but a repeated scan re-reads the oldest pages first. Bad!

MRU removes page 5 (just used, unlikely to be needed again soon)
Keeps 1-4, which the next scan pass reuses!

Example:
Scan blocks 1-1000, reading each page once per pass
For repeated sequential scans, discarding the most recent page wins
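A small simulation (illustrative trace and cache size, not from any real system) shows the effect: scanning ten blocks twice through a 4-entry cache, LRU gets zero hits on the second pass, while MRU retains the front of the file:

```python
def simulate(trace, capacity, policy):
    """Count cache hits for a reference trace under LRU or MRU eviction."""
    cache, hits = [], 0              # list order: oldest ... newest
    for block in trace:
        if block in cache:
            hits += 1
            cache.remove(block)
            cache.append(block)      # refresh recency on a hit
        else:
            if len(cache) >= capacity:
                # LRU evicts the oldest entry, MRU the newest
                cache.pop(0 if policy == "LRU" else -1)
            cache.append(block)
    return hits

scan_twice = list(range(1, 11)) * 2      # read blocks 1-10, then again
print(simulate(scan_twice, 4, "LRU"))    # 0 -- every block evicted before reuse
print(simulate(scan_twice, 4, "MRU"))    # 4 -- blocks 1-3 (and 10) survive
```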

Frequency-Based

LFU: Least Frequently Used

Page A: Accessed 100 times
Page B: Accessed 1 time

LFU keeps A (used more often)
Can be wrong: if A is no longer needed while B is about to be
re-accessed, the frequency history misleads the policy
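The LFU victim choice reduces to "evict the key with the smallest access count"; a one-line sketch (illustrative counts from the example above):

```python
from collections import Counter

def lfu_victim(counts):
    """Pick the eviction victim: the key with the fewest recorded accesses."""
    return min(counts, key=counts.get)

counts = Counter({"A": 100, "B": 1})
print(lfu_victim(counts))   # 'B' -- evicted even if it is about to be reused
```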

Block Caching

Disk Block Cache

OS caches disk blocks in RAM:

File System reads:
Block 100 from disk → Cache in RAM (page frame)
Block 200 from disk → Cache in RAM
Block 100 again → Hit in cache! (no disk access)

Cache Size

Trade-off:

Large cache:
✓ More hits
✓ Fewer disk accesses
❌ Expensive (RAM)
❌ Slower replacement decisions

Small cache:
✓ Cheap
✓ Fast replacement
❌ More misses
❌ More disk access

Typical: anywhere from ~10% of RAM to nearly all free memory
(Linux grows its page cache to fill otherwise-idle RAM)

Double Buffering

Disk reads into Buffer 1
Process reads from Buffer 2 (previous data)
Disk reads into Buffer 2
Process reads from Buffer 1

Decouples I/O from processing
Better throughput

Without:
Disk read 100ms
Process 50ms
Disk read 100ms
Process 50ms
...
Disk limited throughput

With double buffering:
Disk and process run in parallel
Per-chunk time = max(disk, process) = 100ms, not 150ms
Throughput limited only by the slower stage
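The timing difference can be computed directly (a sketch with the 100ms/50ms figures from above; real pipelines add startup and drain effects, modeled here as the first read and last process step):

```python
def total_time(n_chunks, disk_ms, process_ms, double_buffered):
    """Time to read and process n chunks, with or without I/O overlap."""
    if not double_buffered:
        return n_chunks * (disk_ms + process_ms)   # strictly serial
    # Overlapped: after the first read, disk and CPU run in parallel,
    # so each subsequent chunk costs only the slower of the two stages.
    return disk_ms + (n_chunks - 1) * max(disk_ms, process_ms) + process_ms

print(total_time(10, 100, 50, False))  # 1500 ms serial
print(total_time(10, 100, 50, True))   # 1050 ms overlapped
```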

Write Caching

Write-Through

Write immediately to disk:

Process writes to block:
1. Write to disk immediately
2. Update the cached copy
3. Acknowledge success

Characteristics:
✓ Safe (data on disk)
❌ Slow (wait for disk)
❌ High latency writes

Write-Back

Write to cache, defer disk write:

Process writes to block:
1. Write to cache
2. Acknowledge success (returns immediately!)
3. Later, write to disk (asynchronous)

Characteristics:
✓ Fast (no disk wait)
❌ Risk (power loss loses data)
❌ Cache coherence issues

Strategy:
Write dirty pages when:
- Time elapsed (periodic)
- Cache full (make room)
- Explicitly requested (fsync)

Write Coalescing

Combine multiple writes:

Process writes to block 100 (mark dirty)
Process writes to block 100 again
Process writes to block 100 again

Write-through: 3 disk writes
Write-back: 1 disk write (final value)

Huge reduction!
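Both ideas, deferred writes and coalescing, fall out of tracking a dirty set. A toy write-back cache (class and field names are hypothetical, for illustration only):

```python
class WriteBackCache:
    """Write-back sketch: writes land in RAM; dirty blocks are flushed later."""
    def __init__(self):
        self.cache = {}
        self.dirty = set()
        self.disk_writes = 0

    def write(self, block, value):
        self.cache[block] = value     # fast path: no disk I/O here
        self.dirty.add(block)         # re-dirtying the same block is a no-op

    def flush(self):
        """Called on fsync, a timer, or eviction pressure."""
        self.disk_writes += len(self.dirty)   # one I/O per dirty block
        self.dirty.clear()

wb = WriteBackCache()
for v in (1, 2, 3):
    wb.write(100, v)      # three writes to the same block
wb.flush()
print(wb.disk_writes)     # 1 -- write-through would have issued 3
```

Because `dirty` is a set, repeated writes to block 100 coalesce automatically: only the final value reaches disk.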

Buffering Concept

Purpose

Smooth speed mismatches:

Input: 1KB/s (slow keyboard)
Process: 100KB/s capable
Output: 1MB/s disk

Buffer keyboard input:
Accumulate keystrokes in memory
When a block fills (e.g. 4KB): write to disk
Disk works at its optimal transfer size!

Without buffer:
Disk waits for keyboard
Underutilized!
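The accumulate-then-flush pattern can be sketched as follows (a toy model; the 4KB block size and class name are illustrative assumptions, not a real API):

```python
class OutputBuffer:
    """Accumulate small writes; flush to 'disk' in block-sized chunks."""
    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.pending = bytearray()
        self.flushes = 0

    def write(self, data: bytes):
        self.pending += data
        while len(self.pending) >= self.block_size:
            self.pending = self.pending[self.block_size:]
            self.flushes += 1        # one large disk write per full block

buf = OutputBuffer(block_size=4096)
for _ in range(10_000):
    buf.write(b"k")      # 10,000 one-byte writes (keystrokes)
print(buf.flushes)       # 2 full-block flushes instead of 10,000 tiny writes
```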

Buffer Size

Trade-off:

Large buffer:
✓ Accommodates bursts
✓ Smooth throughput
❌ Latency (wait for buffer fill)
❌ Memory usage

Small buffer:
✓ Low latency
✓ Saves memory
❌ Can overflow if producer faster

Examples

Example 1: Disk Read Caching

File system reads 1000 blocks sequentially:
Blocks: 100, 101, 102, ..., 1099

Without cache:
1000 disk accesses (10 seconds @ 10ms per access)

With read-ahead cache:
OS predicts: "after block 100, block 101 will be wanted"
Prefetch upcoming blocks while block 100 is being processed

Read-ahead window: 32 blocks
Each prefetch fetches 32 blocks in one large disk I/O
31 of every 32 block requests hit the cache (97%)
Disk I/Os: ~32 (instead of 1000!)
Time: ~0.3 seconds (~30x faster)
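The speedup is just the reduction in I/O count. A sketch of the arithmetic (assuming, as above, ~10ms per disk I/O regardless of transfer size, which holds when seek time dominates):

```python
def scan_time_ms(n_blocks, per_io_ms=10, batch=1):
    """Sequential scan cost when blocks are fetched `batch` at a time."""
    ios = -(-n_blocks // batch)      # ceiling division: number of disk I/Os
    return ios * per_io_ms

print(scan_time_ms(1000))             # 10000 ms: one I/O per block
print(scan_time_ms(1000, batch=32))   # 320 ms: 32 read-ahead I/Os
```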

Example 2: Database Buffer Pool

Database engine memory:
50% for caching frequently accessed pages
50% for working memory

Query: "SELECT * FROM users WHERE age > 30"

First run:
Scans table from disk (100 page faults)
Pages cached (most users in memory)

Second run:
Same query
95% of pages hit in cache
Only 5 disk faults
10x faster!

Optimization:
Identify hot pages (frequently accessed)
Keep in buffer indefinitely
Swap cold pages to disk

Example 3: Write-Back Optimization

Database transaction:
500 writes to different blocks

Write-through:
500 disk writes (5 seconds)
Transaction slow

Write-back:
500 writes to cache (instant!)
Background: Coalesce into 50 disk I/Os (0.5 seconds)
Transaction appears instant (responds to client)
Disk I/O batched for efficiency

Page Replacement in Cache

When cache full, evict to make room:

Cache (3 blocks): [A, B, C]

New request: Block D

Replacement:
Find victim (using LRU/LFU/etc)
If dirty: Write to disk first
Remove from cache
Load Block D

Must handle dirty pages (write-back)
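The dirty-page rule in the eviction sequence above can be sketched as (variable names are illustrative; a real kernel works with page frames, not dicts):

```python
def evict(frames, dirty, victim, disk):
    """Evict `victim` from the cache: write it back first if dirty."""
    if victim in dirty:
        disk[victim] = frames[victim]   # persist dirty data before discarding
        dirty.discard(victim)
    del frames[victim]                  # frame is now free for the new block

frames = {"A": 1, "B": 2, "C": 3}
dirty = {"B"}
disk = {}
evict(frames, dirty, "B", disk)
print(disk)     # {'B': 2} -- dirty data persisted before eviction
```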

Page Cache in Modern Systems

Linux Page Cache

Unified cache for:
- File I/O
- Memory-mapped files
- Buffer I/O

Automatically managed:
LRU approximation (clock algorithm)
Transparent to applications

Windows File Cache

Similar to Linux:
Automatic management
Write-back by default (dirty pages flushed periodically)
Configurable

Buffering vs Caching

Feature     Buffering                   Caching
Purpose     Speed matching              Performance
Duration    Transient                   Persistent (until evicted)
Lifetime    Until data is transmitted   While memory permits
Data        In-flight data              Previously accessed data
Example     Keyboard → process queue    Disk blocks in RAM
Overhead    Low                         Can be high

Cache Coherence

Consistency Problem

Multiple copies of data:

Block X in:
- Disk (original)
- Disk cache (RAM)
- Application buffer
- Multiple process caches

Process A modifies X in cache
Process B reads X from disk cache
Sees old data!
Inconsistency!

Solutions

1. Write-Through: Always write to disk immediately.
   Other processes see fresh data.

2. Cache Invalidation: When a write occurs, invalidate other copies.
   Forces a re-read from disk.

3. Shared Cache: A single cache serves all processes.
   Natural coherence.
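Solutions 1 and 2 combine naturally: write through for durability, invalidate for coherence. A toy two-cache model (all class and method names are hypothetical; hardware coherence protocols are far more involved):

```python
class CoherentCaches:
    """Invalidation sketch: a write in one cache invalidates the others."""
    def __init__(self, n):
        self.caches = [dict() for _ in range(n)]
        self.disk = {}

    def read(self, i, block):
        if block not in self.caches[i]:
            self.caches[i][block] = self.disk.get(block)  # miss: reload
        return self.caches[i][block]

    def write(self, i, block, value):
        self.disk[block] = value          # write-through for simplicity
        self.caches[i][block] = value
        for j, c in enumerate(self.caches):
            if j != i:
                c.pop(block, None)        # invalidate stale copies

demo = CoherentCaches(2)
demo.disk["X"] = "old"
demo.read(1, "X")            # cache 1 now holds "old"
demo.write(0, "X", "new")    # invalidates cache 1's copy
print(demo.read(1, "X"))     # 'new' -- forced re-read sees fresh data
```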

Performance Optimization

Cache Locality

Organize data to improve hits:

Access pattern:
// Sequential access = good locality
for (int i = 0; i < n; i++) {
    process(array[i]);
}

vs

// Random access = poor locality
for (int k = 0; k < n; k++) {
    process(array[perm[k]]);   // perm: a random permutation of 0..n-1
}

Prefetching

Load predicted pages before needed:

Reading file A, B, C sequentially
Predict C needed next
Prefetch C while B processing
C ready when B finishes
No wait!

Summary

Caching stores frequently accessed data in fast memory (RAM) for performance. Hit rates are critical: miss cost dominates, so even 95% hits leaves average latency thousands of times slower than pure RAM. Write-back is faster but riskier than write-through. Buffering smooths speed mismatches between devices. Cache replacement policies (LRU, MRU, LFU) determine eviction. Modern systems use page caches and buffers transparently. Understanding caching and buffering is essential for performance optimization. Trade-offs always exist: size vs latency, safety vs speed.