Write-Ahead Logging (WAL)

Write-Ahead Logging is a protocol that ensures database recoverability by requiring that all modifications are written to a sequential log before being applied to the actual data pages on disk. WAL is the foundation of the ARIES Recovery Algorithm and is used by virtually every production DBMS including PostgreSQL, MySQL (InnoDB), SQLite, Oracle, and SQL Server.

The WAL Rule

No data page is written to disk until all log records describing changes to that page have been flushed to stable storage.

This single invariant guarantees that after a crash, the system can reconstruct the database state by replaying or undoing log records.

How It Works

A transaction modifies data in the buffer pool (in-memory cache)
A log record describing the change is appended to the WAL buffer
Before the dirty data page can be flushed to disk, the WAL buffer must be flushed first (write-ahead guarantee)
On COMMIT, the log records up to the commit point are forced to stable storage
The actual data pages can be written to disk lazily (at checkpoint time or by the buffer manager)

Log Record Structure

Each log record typically contains:

Field	Description
LSN	Log Sequence Number — monotonically increasing identifier
Transaction ID	Which transaction generated this record
Page ID	Which data page was modified
Undo info	Previous value (for rollback)
Redo info	New value (for replay after crash)
Prev LSN	Previous log record for this transaction (forms a chain)

Recovery with WAL

After a crash, the recovery manager uses the log to:

Redo all committed transactions whose data pages may not have been flushed (ensures durability)
Undo all uncommitted transactions whose data pages may have been flushed (ensures atomicity)

This is the core of the ARIES Recovery Algorithm’s three-phase recovery (Analysis → Redo → Undo).

Checkpointing

Periodically, the DBMS writes a checkpoint record to the log and flushes all dirty pages. This bounds recovery time — only log records after the last checkpoint need to be processed.

Sharp Checkpoint — Pauses all transactions, flushes everything. Simple but causes downtime.
Fuzzy Checkpoint — Records the current state without pausing transactions. More complex but non-blocking. Used by ARIES.

SQLite WAL Mode

SQLite offers two journaling modes:

Rollback Journal (default) — The original page is copied to a journal file before modification. On crash, the journal restores original pages.
WAL Mode — Modifications go to a separate WAL file. Readers see a consistent snapshot from before the WAL. Enables concurrent reads during writes.

PRAGMA journal_mode = WAL;

WAL mode advantages in SQLite:

Readers don’t block writers
Writers don’t block readers
Faster for write-heavy workloads (sequential writes to WAL vs random writes to DB file)

MySQL / InnoDB

InnoDB uses a three-log architecture:

Log	Purpose	Structure	Used For
Redo log	WAL for crash recovery	Fixed-size circular buffer; records physical page changes	Redo phase — replay committed changes not yet flushed to data files
Undo log	Rollback + MVCC	Records old row versions (before-images)	Transaction rollback; MVCC read snapshots — readers follow undo chain to find their version
Binary log	Replication + PITR	Logical log of all DDL/DML statements	Not used for crash recovery directly; essential for primary→replica replication and point-in-time recovery

Redo Log Flow

graph LR
    TX[Transaction] -->|1. Write| RB[Redo Log Buffer]
    RB -->|2. Flush on COMMIT| RL[(Redo Log on Disk)]
    TX -->|3. Modify| BP[Buffer Pool - in memory]
    BP -->|4. Lazy flush| DF[(Data Files on Disk)]
    RL -->|Crash recovery| DF

On crash, InnoDB reads the redo log and replays any committed changes that weren’t flushed from the buffer pool to data files. The undo log then rolls back any uncommitted transactions. Long-running transactions cause the undo log to grow, which can degrade MVCC read performance — keep transactions short.

Quartz 5

Explorer

Write-Ahead Logging (WAL)

Write-Ahead Logging (WAL)

The WAL Rule

How It Works

Log Record Structure

Recovery with WAL

Checkpointing

SQLite WAL Mode

MySQL / InnoDB

Redo Log Flow

Sources

Graph View

Table of Contents

Backlinks