ARIES Recovery Algorithm

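ARIES (Algorithms for Recovery and Isolation Exploiting Semantics) is the best-known recovery algorithm built on write-ahead logging. It originated at IBM and underlies or has influenced the recovery subsystems of many DBMSs, including IBM DB2 and SQL Server; PostgreSQL and MySQL/InnoDB use related WAL-based designs rather than ARIES in full. It preserves two of the ACID properties, Atomicity and Durability, across transaction failures and system crashes. Media failures additionally require a backup (see Catastrophic (Media) Failure Recovery).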

Three Core Principles

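  1. Write-Ahead Logging — The log record describing a change must reach stable storage before the modified data page is written to disk, and all of a transaction's log records must be flushed before it commits. See Write-Ahead Logging (WAL).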
  2. Repeating History During Redo — On recovery, ARIES replays all logged actions (including those of uncommitted transactions) to reconstruct the exact pre-crash state.
  3. Logging Changes During Undo — Undo operations are themselves logged (as Compensation Log Records / CLRs), so recovery is idempotent even if a crash occurs during recovery.
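
As a minimal sketch of the write-ahead rule in principle 1, the following Python models a buffer manager that will not write a page until the log is durable up to that page's latest change. The class and method names (BufferManager, flush_log, write_page) are hypothetical, and everything lives in memory; a real system would issue disk writes and fsync calls instead.

```python
class BufferManager:
    """In-memory stand-in for a log buffer plus a data disk (illustrative only)."""

    def __init__(self):
        self.log_buffer = []   # (lsn, payload) pairs not yet on stable storage
        self.flushed_lsn = 0   # highest LSN known to be durable
        self.disk_pages = {}   # page_id -> (contents, page_lsn)

    def log(self, payload):
        # LSNs are assigned densely: 1, 2, 3, ...
        lsn = self.flushed_lsn + len(self.log_buffer) + 1
        self.log_buffer.append((lsn, payload))
        return lsn

    def flush_log(self, up_to_lsn):
        # Move records to "stable storage" in LSN order.
        while self.log_buffer and self.log_buffer[0][0] <= up_to_lsn:
            lsn, _ = self.log_buffer.pop(0)
            self.flushed_lsn = lsn

    def write_page(self, page_id, contents, page_lsn):
        # The WAL rule: the log must be durable up to page_lsn first.
        if page_lsn > self.flushed_lsn:
            self.flush_log(page_lsn)
        self.disk_pages[page_id] = (contents, page_lsn)
```

Writing a page whose latest change has LSN 1 forces the log to be flushed through LSN 1, but leaves later records in the buffer.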

Data Structures

Log

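Sequential, append-only file. Each record has a Log Sequence Number (LSN), the ID of the transaction that wrote it, and a prevLSN pointer to that transaction's previous log record. Records include: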

  • Update records — Page ID, undo data, redo data
  • Commit records — Transaction committed
  • Abort records — Transaction aborted
  • CLR (Compensation Log Record) — Generated during undo, points to the next record to undo (undoNextLSN)
  • Checkpoint records — Snapshot of active state (see below)
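
One way to picture these records is as a single Python dataclass with optional fields. The field names below (prev_lsn, undo_next_lsn, and so on) and the Log helper are illustrative assumptions, not a layout prescribed by ARIES or used by any particular DBMS.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LogRecord:
    lsn: int
    txn_id: int
    kind: str                            # "update", "commit", "abort" or "clr"
    prev_lsn: Optional[int] = None       # previous record of the same transaction
    page_id: Optional[str] = None
    undo: Optional[bytes] = None         # before-image (update records)
    redo: Optional[bytes] = None         # after-image (update records and CLRs)
    undo_next_lsn: Optional[int] = None  # CLRs only: next record to undo


class Log:
    """Append-only list; the LSN is simply the 1-based position."""

    def __init__(self):
        self.records = []

    def append(self, txn_id, kind, **fields):
        rec = LogRecord(lsn=len(self.records) + 1, txn_id=txn_id, kind=kind, **fields)
        self.records.append(rec)
        return rec
```

For example, log.append(1, "update", page_id="P1", undo=b"old", redo=b"new") returns a record with lsn 1 on an empty log.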

Transaction Table

Maintained in memory. Tracks:

  • Transaction ID
  • Status (running, committed, aborted)
  • lastLSN — Most recent log record written by this transaction

Dirty Page Table (DPT)

Tracks pages in the buffer pool that have been modified but not yet flushed to disk:

  • Page ID
  • recLSN — LSN of the first log record that dirtied this page since it was last flushed (the earliest record that may need to be redone for the page)
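
A minimal sketch of how the two tables might be maintained during normal processing, using plain dictionaries. The function names note_update and note_flush are hypothetical.

```python
transaction_table = {}  # txn_id -> {"status": ..., "last_lsn": ...}
dirty_page_table = {}   # page_id -> rec_lsn


def note_update(txn_id, page_id, lsn):
    # lastLSN always moves forward to the newest record of the transaction.
    transaction_table[txn_id] = {"status": "running", "last_lsn": lsn}
    # recLSN is set only when the page first becomes dirty.
    dirty_page_table.setdefault(page_id, lsn)


def note_flush(page_id):
    # Once a page is written to disk it is clean and leaves the DPT.
    dirty_page_table.pop(page_id, None)
```

The asymmetry is the point: lastLSN is overwritten on every update, while recLSN keeps the oldest LSN until the page is flushed.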

Three Recovery Phases

Phase 1: Analysis

Scans the log forward from the last checkpoint to reconstruct the Transaction Table and Dirty Page Table as they were at the time of the crash. Determines:

  • Which transactions were active (need undo)
  • Which pages might be dirty (need redo)
  • The starting point for the redo phase
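
The analysis pass can be sketched as follows, assuming the log is a list of dictionaries starting at the checkpoint and that the checkpoint's table snapshots are passed in. As a simplification, a commit record removes the transaction from the table straight away; full ARIES uses a separate end record for that.

```python
def analysis(log, ckpt_tt=None, ckpt_dpt=None):
    """Return (losers, dirty_pages, redo_start) for the given log suffix."""
    tt = dict(ckpt_tt or {})    # txn_id -> last_lsn
    dpt = dict(ckpt_dpt or {})  # page_id -> rec_lsn
    for rec in log:
        txn, kind = rec["txn"], rec["kind"]
        if kind == "commit":
            tt.pop(txn, None)   # committed: nothing to undo
            continue
        tt[txn] = rec["lsn"]    # still active (or aborting): remember lastLSN
        if kind in ("update", "clr"):
            dpt.setdefault(rec["page"], rec["lsn"])
    redo_start = min(dpt.values(), default=None)
    return tt, dpt, redo_start
```

Transactions left in the returned table are the ones the undo phase must roll back, and redo_start is the smallest recLSN.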

Phase 2: Redo

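Scans forward from the smallest recLSN in the Dirty Page Table. For each redoable log record (updates and CLRs):

  • Skip the record if the page is not in the DPT, or if the record's LSN < the page's recLSN
  • Otherwise read the page and redo the operation only if the pageLSN stored on the page < the record's LSN (a pageLSN ≥ the record's LSN means the change already reached disk)
  • Together, these rules re-apply ALL missing changes (including those of uncommitted transactions) to bring the database to its exact pre-crash state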

The “repeating history” principle ensures that the state is fully reconstructed before any undo begins.
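
A minimal sketch of the redo pass, assuming an in-memory log of dictionaries and a pages dictionary that stands in for the disk. Besides the DPT tests, it compares each record with the LSN stored on the page (pageLSN), a standard ARIES check that prevents a change that already reached disk from being applied twice.

```python
def redo(log, dpt, pages):
    """pages: page_id -> {"page_lsn": int, "value": ...}. Returns the LSNs redone."""
    if not dpt:
        return []
    start = min(dpt.values())
    applied = []
    for rec in log:
        if rec["lsn"] < start or rec["kind"] not in ("update", "clr"):
            continue
        pid = rec["page"]
        if pid not in dpt or rec["lsn"] < dpt[pid]:
            continue                      # page was clean for this record
        page = pages[pid]
        if page["page_lsn"] >= rec["lsn"]:
            continue                      # change is already on the page
        page["value"] = rec["redo"]       # repeat history, committed or not
        page["page_lsn"] = rec["lsn"]
        applied.append(rec["lsn"])
    return applied
```

Note that the function never looks at whether the writing transaction committed; that decision is left entirely to the undo phase.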

Phase 3: Undo

Scans backward through the log, undoing all changes made by transactions that were active at crash time (not committed). For each undone operation:

  • Writes a CLR to the log (so the undo is itself recoverable)
  • Follows the prevLSN chain to find the previous operation of the same transaction
  • Uses undoNextLSN in CLRs to skip already-undone operations
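
The undo pass can be sketched as a loop that always picks the largest outstanding LSN among the loser transactions, which amounts to a single backward sweep of the log. The record layout (dictionaries with prev and undo_next keys) is an assumption for illustration, and the CLRs here omit their own prevLSN for brevity.

```python
def undo(log, losers, pages):
    """losers: txn_id -> lastLSN. Appends CLRs to log; returns the LSNs undone."""
    by_lsn = {rec["lsn"]: rec for rec in log}
    to_undo = dict(losers)
    undone = []
    while to_undo:
        txn = max(to_undo, key=to_undo.get)      # largest LSN first
        rec = by_lsn[to_undo[txn]]
        if rec["kind"] == "clr":
            nxt = rec["undo_next"]               # skip work already compensated
        elif rec["kind"] == "update":
            pages[rec["page"]] = rec["undo"]     # restore the before-image
            log.append({"lsn": len(log) + 1, "txn": txn, "kind": "clr",
                        "page": rec["page"], "redo": rec["undo"],
                        "undo_next": rec["prev"]})
            undone.append(rec["lsn"])
            nxt = rec["prev"]
        else:
            nxt = rec["prev"]                    # e.g. an abort record
        if nxt is None:
            del to_undo[txn]                     # transaction fully rolled back
        else:
            to_undo[txn] = nxt
    return undone
```

Each CLR's undo_next is the prevLSN of the record it compensates, which is what lets a later restart resume from the right place.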

Fuzzy Checkpointing

ARIES uses fuzzy checkpoints — the checkpoint records the current Transaction Table and Dirty Page Table without requiring all dirty pages to be flushed. This avoids blocking transactions during checkpointing.

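A checkpoint is bracketed by two log records:

  • A begin_checkpoint record marking where the checkpoint started
  • An end_checkpoint record carrying the Active Transaction Table and Dirty Page Table snapshots

Analysis starts at the begin_checkpoint record of the most recent complete checkpoint, so updates logged while the snapshot was being taken are not missed.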
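
A minimal sketch of taking a fuzzy checkpoint, assuming the log is a Python list whose 1-based positions serve as LSNs. The function name and record layout are hypothetical. The key property is that the tables are copied and logged while no data page is flushed.

```python
import copy


def take_fuzzy_checkpoint(log, transaction_table, dirty_page_table):
    """Append begin/end checkpoint records; return the begin_checkpoint LSN."""
    log.append({"kind": "begin_checkpoint"})
    begin_lsn = len(log)
    # Snapshot the in-memory tables; dirty pages stay in the buffer pool.
    snapshot = {
        "kind": "end_checkpoint",
        "tt": copy.deepcopy(transaction_table),
        "dpt": copy.deepcopy(dirty_page_table),
    }
    log.append(snapshot)
    return begin_lsn
```

Because the snapshot is a deep copy, transactions can keep changing the live tables after the checkpoint without altering what was logged.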

Handling Nested Failures

If the system crashes during recovery, ARIES handles it gracefully:

  • CLRs written during the undo phase are replayed by redo after a restart, so partially completed undos are not repeated
  • The undoNextLSN pointer in CLRs allows the undo phase to skip already-compensated operations
  • Recovery is idempotent — running it multiple times produces the same result
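
To illustrate how undoNextLSN bounds the work after a repeated crash, the following sketch walks one transaction's chain and lists the updates that still need compensation. The dictionary-based record layout and the function name remaining_undo are assumptions for illustration.

```python
def remaining_undo(log, last_lsn):
    """Return the LSNs of updates still to be undone, newest first."""
    by_lsn = {rec["lsn"]: rec for rec in log}
    todo, lsn = [], last_lsn
    while lsn is not None:
        rec = by_lsn[lsn]
        if rec["kind"] == "clr":
            lsn = rec["undo_next"]   # jump over everything already compensated
        else:
            if rec["kind"] == "update":
                todo.append(lsn)
            lsn = rec["prev"]
    return todo
```

With three updates and no CLRs, all three remain; once CLRs for the two newest updates are in the log, only the oldest update is left, however many times recovery restarts.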

Alternative Recovery Techniques

ARIES is the dominant approach, but the source material describes three alternatives:

Immediate Update

Changes are written directly to the database before the transaction commits. This requires both undo logging (to roll back uncommitted changes) and redo logging (to replay committed changes not yet flushed). Recovery is more complex, but updates reach the database without waiting for commit. The source material cites online booking systems and stock trading platforms as typical uses.

Deferred Update (No-Undo/Redo)

Changes are recorded in a temporary area or log and only applied to the database at commit time. Only requires redo logging (no undo needed — uncommitted changes were never written to disk). Simpler recovery but delays visibility of changes. Used in batch processing and nightly banking updates.
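
A toy sketch of deferred update, assuming a key-value store held in a dictionary. The class name DeferredUpdateDB is hypothetical. Writes are held in a per-transaction list and reach the database only at commit, so an abort simply discards the list.

```python
class DeferredUpdateDB:
    def __init__(self):
        self.data = {}      # the "database"
        self.pending = {}   # txn_id -> list of (key, value) awaiting commit

    def write(self, txn, key, value):
        # Nothing touches self.data before commit.
        self.pending.setdefault(txn, []).append((key, value))

    def commit(self, txn):
        # Apply the deferred writes in order (the redo step).
        for key, value in self.pending.pop(txn, []):
            self.data[key] = value

    def abort(self, txn):
        # No undo needed: the database was never modified.
        self.pending.pop(txn, None)
```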

Shadow Paging

A copy-on-write mechanism: the database maintains two page sets — current and shadow. Changes are made to copies of pages. On commit, the system atomically switches from shadow to current pages. On failure, the system reverts to the shadow pages. Clean but resource-intensive (duplicate pages for every modified page). Less common in high-transaction environments — ARIES with WAL is preferred.
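
Shadow paging can be sketched with two page tables that map page names to slots in an append-only slot list. The class below is an illustrative in-memory model with hypothetical names; in a real system the commit step is a single atomic update of an on-disk root pointer.

```python
class ShadowPagedStore:
    def __init__(self, initial):
        self.slots = list(initial.values())               # physical page slots
        self.shadow_table = {name: i for i, name in enumerate(initial)}
        self.current_table = dict(self.shadow_table)

    def read(self, name):
        return self.slots[self.current_table[name]]

    def write(self, name, value):
        # Copy-on-write: the slot referenced by the shadow table is never overwritten.
        self.slots.append(value)
        self.current_table[name] = len(self.slots) - 1

    def commit(self):
        # Stand-in for the atomic switch: the current table becomes the new shadow.
        self.shadow_table = dict(self.current_table)

    def abort(self):
        # Revert by discarding the current table.
        self.current_table = dict(self.shadow_table)
```

The growing slot list makes the space cost visible: every modified page occupies a second slot until old slots are reclaimed, which this sketch does not attempt.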

Catastrophic (Media) Failure Recovery

For disk failures beyond what the log can handle:

  • Database Dump — Full backup to a separate storage medium
  • Fuzzy Dump — Backup taken without pausing transactions (requires log replay from dump start point)
  • Recovery: Restore dump → replay log from dump point → apply normal ARIES recovery
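
The first two recovery steps can be sketched as follows, assuming the dump is a dictionary of page contents tagged with the LSN at which it was taken. The function name and record layout are hypothetical, and the final step (normal ARIES recovery, including undo of uncommitted work) is deliberately left out.

```python
def restore_from_dump(dump, log, dump_lsn):
    """Rebuild page contents from a dump plus the log written after it."""
    db = dict(dump)                          # step 1: restore the dump
    for rec in log:                          # step 2: replay later log records
        if rec["lsn"] > dump_lsn and rec["kind"] == "update":
            db[rec["page"]] = rec["redo"]
    return db
```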

Backup Process Steps

  1. Suspend active transactions — Pause all running transactions
  2. Flush log — Write all in-memory log records to stable storage
  3. Flush buffers — Force-write all modified buffer pool pages to disk
  4. Copy database — Copy entire database to stable storage (tape, remote disk)
  5. Log the dump — Write a special dump record to the transaction log
  6. Resume transactions — Restart suspended transactions

Steps 1-3 and 5 mirror conventional (non-fuzzy) checkpointing, which also pauses updates and flushes the log and buffers. A fuzzy dump variant allows transactions to continue during the copy.
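
The six steps can be sketched against a toy database held in a dictionary. All key names (suspended, log_buffer, stable_log, dirty_pages, disk) are assumptions made for this illustration.

```python
def take_dump(db):
    """Run the non-fuzzy backup steps in order and return the dump."""
    db["suspended"] = True                        # 1. suspend transactions
    db["stable_log"].extend(db["log_buffer"])     # 2. flush the log
    db["log_buffer"].clear()
    db["disk"].update(db["dirty_pages"])          # 3. flush modified pages
    db["dirty_pages"].clear()
    dump = dict(db["disk"])                       # 4. copy the database
    db["stable_log"].append({"kind": "dump"})     # 5. log the dump
    db["suspended"] = False                       # 6. resume transactions
    return dump
```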

Environmental Disaster Recovery

For catastrophic events affecting the primary data center:

  • Maintain remote backup at a geographically distant site
  • Continuously synchronize via log record transfer (WAL shipping)
  • On primary site failure: the secondary site restores the most recent backup, replays the shipped log records, and then takes over transaction processing
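
A minimal sketch of the standby side of log shipping, assuming shipped records are dictionaries carrying an LSN, a page name and a redo image. The Standby class is hypothetical. Tracking the last applied LSN makes duplicate or re-sent batches harmless.

```python
class Standby:
    def __init__(self, backup, backup_lsn):
        self.pages = dict(backup)        # state restored from the latest backup
        self.applied_lsn = backup_lsn    # log position the backup corresponds to

    def receive(self, records):
        # Apply shipped records in LSN order, ignoring ones already applied.
        for rec in sorted(records, key=lambda r: r["lsn"]):
            if rec["lsn"] <= self.applied_lsn:
                continue
            self.pages[rec["page"]] = rec["redo"]
            self.applied_lsn = rec["lsn"]

    def promote(self):
        # On primary failure the standby's pages become the live database.
        return self.pages
```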

Sources