ARIES Recovery Algorithm
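ARIES (Algorithms for Recovery and Isolation Exploiting Semantics) is a widely adopted recovery algorithm whose design, in whole or in variant form, underlies the recovery subsystems of many DBMS implementations (IBM DB2, SQL Server, MySQL/InnoDB). PostgreSQL's recovery subsystem shares the write-ahead logging and redo ideas but has no ARIES-style undo phase, because it relies on multiversioning instead. ARIES provides the Atomicity and Durability parts of ACID across transaction failures and system crashes; media failures additionally require a backup (see Catastrophic (Media) Failure Recovery).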
Three Core Principles
- Write-Ahead Logging: the log record describing a change must reach stable storage before the modified data page is written to disk, and all of a transaction's log records must be on stable storage before it commits. See Write-Ahead Logging (WAL).
- Repeating History During Redo: on recovery, ARIES replays all logged actions (including those of uncommitted transactions) to reconstruct the exact pre-crash state.
- Logging Changes During Undo: undo operations are themselves logged (as Compensation Log Records / CLRs), so no undo is applied twice even if a crash occurs during recovery.
Data Structures
Log
Sequential, append-only file. Each record has a Log Sequence Number (LSN) and a prevLSN pointing to the previous record written by the same transaction. Records include:
- Update records: page ID, undo data, redo data
- Commit records: transaction committed
- Abort records: transaction aborted
- CLR (Compensation Log Record): generated during undo, points to the next record to undo (undoNextLSN)
- Checkpoint records: snapshot of active state (see below)
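As a rough illustration of the record types listed above, the following Python sketch models a log record and an append-only log. The class names (Kind, LogRecord, Log) and field names are hypothetical and chosen only for this example; real systems use compact binary formats and LSNs that are usually byte offsets rather than counters.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Kind(Enum):
    UPDATE = "update"
    COMMIT = "commit"
    ABORT = "abort"
    CLR = "clr"
    CHECKPOINT = "checkpoint"


@dataclass(frozen=True)
class LogRecord:
    lsn: int                              # strictly increasing position in the log
    kind: Kind
    txn_id: Optional[int] = None          # None for checkpoint records
    prev_lsn: Optional[int] = None        # previous record of the same transaction
    page_id: Optional[int] = None         # UPDATE and CLR only
    undo: Optional[bytes] = None          # data needed to reverse an UPDATE
    redo: Optional[bytes] = None          # data needed to re-apply the change
    undo_next_lsn: Optional[int] = None   # CLR only: next record to undo


class Log:
    """In-memory list standing in for the sequential log file (an assumption)."""

    def __init__(self) -> None:
        self.records: List[LogRecord] = []

    def append(self, kind: Kind, **fields) -> LogRecord:
        # Here the LSN is simply the 1-based position of the record.
        rec = LogRecord(lsn=len(self.records) + 1, kind=kind, **fields)
        self.records.append(rec)
        return rec


log = Log()
upd = log.append(Kind.UPDATE, txn_id=1, page_id=7, undo=b"old", redo=b"new")
# The CLR that compensates `upd` redoes the old value and points past `upd`.
clr = log.append(Kind.CLR, txn_id=1, prev_lsn=upd.lsn, page_id=7,
                 redo=b"old", undo_next_lsn=upd.prev_lsn)
```

Because `upd` is the first record of its transaction, the CLR's undo_next_lsn is None, which signals that nothing is left to undo for that transaction.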
Transaction Table
Maintained in memory. Tracks:
- Transaction ID
- Status (running, committed, aborted)
- lastLSN — Most recent log record written by this transaction
Dirty Page Table (DPT)
Tracks pages in the buffer pool that have been modified but not yet flushed to disk:
- Page ID
- recLSN: LSN of the first log record that dirtied this page since it was last flushed (the redo phase starts at the smallest recLSN in the table)
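A minimal sketch of how the two in-memory tables might be maintained during normal operation, assuming plain dictionaries. The names RecoveryTables, on_update, on_flush, and redo_start are hypothetical, not part of any real DBMS API.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class TxnEntry:
    status: str      # "running", "committed", or "aborted"
    last_lsn: int    # most recent log record written by the transaction


@dataclass
class RecoveryTables:
    txn_table: Dict[int, TxnEntry] = field(default_factory=dict)
    dirty_pages: Dict[int, int] = field(default_factory=dict)  # page_id -> recLSN

    def on_update(self, txn_id: int, page_id: int, lsn: int) -> None:
        self.txn_table[txn_id] = TxnEntry("running", lsn)
        # recLSN is recorded only the first time a clean page becomes dirty.
        self.dirty_pages.setdefault(page_id, lsn)

    def on_flush(self, page_id: int) -> None:
        # A flushed page is clean again and leaves the DPT.
        self.dirty_pages.pop(page_id, None)

    def redo_start(self) -> Optional[int]:
        # Redo would begin at the smallest recLSN, or nowhere if no page is dirty.
        return min(self.dirty_pages.values(), default=None)


tables = RecoveryTables()
tables.on_update(txn_id=1, page_id=7, lsn=10)
tables.on_update(txn_id=1, page_id=7, lsn=20)   # page 7 keeps recLSN 10
tables.on_update(txn_id=2, page_id=3, lsn=15)
```

After these three updates the smallest recLSN is 10; flushing page 7 would move the redo starting point to 15.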
Three Recovery Phases
Phase 1: Analysis
Scans the log forward from the last checkpoint to reconstruct the Transaction Table and Dirty Page Table as they were at the time of the crash. Determines:
- Which transactions were active (need undo)
- Which pages might be dirty (need redo)
- The starting point for the redo phase
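The analysis pass can be sketched as a single forward scan. This is a simplified illustration, not the full algorithm: log records are plain dicts with hypothetical keys (lsn, type, txn, page), a commit record is treated as ending the transaction, and the tables saved in the checkpoint are passed in directly.

```python
def analyze(log_after_checkpoint, checkpoint_txns, checkpoint_dpt):
    """Rebuild the Transaction Table and DPT from the checkpoint onward."""
    txns = dict(checkpoint_txns)   # txn_id -> lastLSN
    dpt = dict(checkpoint_dpt)     # page_id -> recLSN
    for rec in log_after_checkpoint:
        if rec["type"] in ("update", "clr"):
            txns[rec["txn"]] = rec["lsn"]
            dpt.setdefault(rec["page"], rec["lsn"])  # keep the earliest recLSN
        elif rec["type"] == "commit":
            txns.pop(rec["txn"], None)               # committed: nothing to undo
        elif rec["type"] == "abort":
            txns[rec["txn"]] = rec["lsn"]            # still needs undo
    redo_start = min(dpt.values(), default=None)
    return txns, dpt, redo_start


log = [
    {"lsn": 30, "type": "update", "txn": 1, "page": 5},
    {"lsn": 40, "type": "update", "txn": 2, "page": 6},
    {"lsn": 50, "type": "commit", "txn": 1},
    {"lsn": 60, "type": "update", "txn": 2, "page": 5},
]
losers, dpt, redo_start = analyze(log, checkpoint_txns={1: 20},
                                  checkpoint_dpt={9: 10})
```

In this example transaction 2 is the only one left to undo, pages 9, 5, and 6 may be dirty, and redo starts at LSN 10, which lies before the checkpoint because page 9 was already dirty when the checkpoint was taken.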
Phase 2: Redo
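Scans forward from the smallest recLSN in the Dirty Page Table. For each update log record (and each CLR):
- If the page is in the DPT, the record's LSN ≥ the page's recLSN, and the pageLSN stored on the page itself is less than the record's LSN → redo the operation and set the page's pageLSN to the record's LSN
- Otherwise the change is already on disk and the record is skipped
- This re-applies ALL changes (including uncommitted transactions) to bring the database to its exact pre-crash state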
The “repeating history” principle ensures that the state is fully reconstructed before any undo begins.
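The redo test can be sketched as follows. This is an illustrative simplification: pages are dicts holding a value and a page_lsn field (the LSN of the last change applied to that page, an assumption about page layout), and the dict keys are hypothetical.

```python
def redo(log, dpt, pages):
    """Re-apply logged changes that are missing from the on-disk pages.

    pages: page_id -> {"page_lsn": int, "value": ...}
    Returns the LSNs that were actually re-applied.
    """
    if not dpt:
        return []
    start = min(dpt.values())
    applied = []
    for rec in log:
        if rec["lsn"] < start or rec["type"] not in ("update", "clr"):
            continue
        pid = rec["page"]
        if pid not in dpt or rec["lsn"] < dpt[pid]:
            continue                      # page was flushed after this change
        page = pages[pid]
        if page["page_lsn"] >= rec["lsn"]:
            continue                      # change already present on the page
        page["value"] = rec["redo"]
        page["page_lsn"] = rec["lsn"]     # makes a second redo a no-op
        applied.append(rec["lsn"])
    return applied


pages = {5: {"page_lsn": 30, "value": "b"}, 6: {"page_lsn": 0, "value": "x"}}
dpt = {5: 30, 6: 40}
log = [
    {"lsn": 30, "type": "update", "page": 5, "redo": "b"},
    {"lsn": 40, "type": "update", "page": 6, "redo": "y"},
    {"lsn": 50, "type": "commit"},
    {"lsn": 60, "type": "update", "page": 5, "redo": "c"},
]
first_pass = redo(log, dpt, pages)
second_pass = redo(log, dpt, pages)   # simulates a crash and restart of redo
```

The first pass re-applies LSNs 40 and 60 and skips LSN 30, which is already on page 5. The second pass applies nothing, which is why repeating the redo phase after a crash is harmless.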
Phase 3: Undo
Scans backward through the log, undoing all changes made by transactions that were active at crash time (not committed). For each undone operation:
- Writes a CLR to the log (so the undo is itself recoverable)
- Follows the prevLSN chain to find the previous operation of the same transaction
- Uses undoNextLSN in CLRs to skip already-undone operations
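The undo steps above can be sketched as a loop that always processes the largest outstanding LSN. This is a simplified illustration under stated assumptions: records are dicts with hypothetical keys (prev, undo_next, undo, redo), pages map directly to values, new LSNs are assigned by counting up from the last one, and the end records that a full implementation writes are omitted.

```python
def undo(log, losers, pages):
    """Roll back loser transactions, appending one CLR per undone update.

    log: list of record dicts, extended in place.
    losers: txn_id -> lastLSN, as produced by the analysis phase.
    """
    by_lsn = {rec["lsn"]: rec for rec in log}
    to_undo = dict(losers)
    next_lsn = max(by_lsn) + 1
    while to_undo:
        # Scan backward: pick the transaction with the largest pending LSN.
        txn, lsn = max(to_undo.items(), key=lambda kv: kv[1])
        rec = by_lsn[lsn]
        if rec["type"] == "clr":
            nxt = rec["undo_next"]            # skip work already compensated
        elif rec["type"] == "update":
            pages[rec["page"]] = rec["undo"]  # restore the old value
            clr = {"lsn": next_lsn, "type": "clr", "txn": txn,
                   "page": rec["page"], "redo": rec["undo"],
                   "prev": lsn, "undo_next": rec["prev"]}
            log.append(clr)
            by_lsn[next_lsn] = clr
            next_lsn += 1
            nxt = rec["prev"]
        else:
            nxt = rec["prev"]                 # e.g. an abort record
        if nxt is None:
            del to_undo[txn]                  # transaction fully rolled back
        else:
            to_undo[txn] = nxt


# A previous recovery attempt already undid LSN 20 (CLR at LSN 30), then crashed.
log = [
    {"lsn": 10, "type": "update", "txn": 1, "page": "A",
     "undo": "a0", "redo": "a1", "prev": None},
    {"lsn": 20, "type": "update", "txn": 1, "page": "B",
     "undo": "b0", "redo": "b1", "prev": 10},
    {"lsn": 30, "type": "clr", "txn": 1, "page": "B",
     "redo": "b0", "prev": 20, "undo_next": 10},
]
pages = {"A": "a1", "B": "b0"}   # state after redo has repeated history
undo(log, losers={1: 30}, pages=pages)
```

In this run the CLR at LSN 30 sends the scan straight to LSN 10, so the update at LSN 20 is not undone a second time, and exactly one new CLR is appended.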
Fuzzy Checkpointing
ARIES uses fuzzy checkpoints — the checkpoint records the current Transaction Table and Dirty Page Table without requiring all dirty pages to be flushed. This avoids blocking transactions during checkpointing.
A checkpoint consists of:
- begin_checkpoint and end_checkpoint log records that bracket the checkpoint
- Active Transaction Table snapshot
- Dirty Page Table snapshot
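A small sketch of a fuzzy checkpoint, assuming the table snapshots are taken when the checkpoint begins and written out in the end record. The function names and dict layout are hypothetical. The point of the example is that other transactions keep appending log records between the two checkpoint records and that no page is flushed.

```python
def begin_checkpoint(log, txn_table, dirty_pages):
    """Write the begin record and copy the tables as they are right now."""
    lsn = len(log) + 1
    log.append({"lsn": lsn, "type": "begin_checkpoint"})
    snapshot = {"txns": dict(txn_table), "dpt": dict(dirty_pages)}
    return lsn, snapshot


def end_checkpoint(log, snapshot):
    """Write the end record carrying the table snapshots."""
    log.append({"lsn": len(log) + 1, "type": "end_checkpoint", **snapshot})


log = [{"lsn": 1, "type": "update", "txn": 1, "page": 5}]
txn_table = {1: 1}       # txn_id -> lastLSN
dirty_pages = {5: 1}     # page_id -> recLSN

begin_lsn, snap = begin_checkpoint(log, txn_table, dirty_pages)

# Transaction 2 keeps working while the checkpoint is in progress.
log.append({"lsn": len(log) + 1, "type": "update", "txn": 2, "page": 6})
txn_table[2] = 3
dirty_pages[6] = 3

end_checkpoint(log, snap)
```

The update at LSN 3 is not in the snapshot, which is why the analysis phase scans forward from the begin record rather than trusting the snapshot alone.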
Handling Nested Failures
If the system crashes during recovery, ARIES handles it gracefully:
- CLRs written during the undo phase ensure partially completed undos aren't repeated
- The undoNextLSN pointer in CLRs allows the undo phase to skip already-compensated operations
- Recovery is idempotent: running it multiple times produces the same result
Alternative Recovery Techniques
ARIES is the dominant approach, but the source material describes three alternatives:
Immediate Update
Changes may be written to the database before the transaction commits. Requires both undo logging (to roll back uncommitted changes) and redo logging (to replay committed changes not yet flushed). More complex, but the database reflects each change as soon as it is made. Used in online booking systems and stock trading platforms.
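A toy illustration of immediate update with an undo/redo log. The class ImmediateUpdateDB and its methods are hypothetical, and the sketch assumes that values are never None and that two uncommitted transactions never write the same key (as strict locking would guarantee).

```python
class ImmediateUpdateDB:
    def __init__(self):
        self.data = {}          # stands in for the database on disk
        self.log = []           # entries: (txn, key, before, after)
        self.committed = set()

    def write(self, txn, key, value):
        # Log first, then change the database right away.
        self.log.append((txn, key, self.data.get(key), value))
        self.data[key] = value

    def commit(self, txn):
        self.committed.add(txn)

    def recover(self):
        # Redo: replay committed changes in log order.
        for txn, key, _before, after in self.log:
            if txn in self.committed:
                self.data[key] = after
        # Undo: roll back uncommitted changes in reverse log order.
        for txn, key, before, _after in reversed(self.log):
            if txn not in self.committed:
                if before is None:
                    self.data.pop(key, None)   # key did not exist before
                else:
                    self.data[key] = before


db = ImmediateUpdateDB()
db.write(1, "seat 12A", "booked")
db.commit(1)
db.write(2, "seat 14C", "booked")   # transaction 2 never commits
visible_before_recovery = dict(db.data)
db.recover()
```

Before recovery both seats appear booked; after recovery only the committed booking remains.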
Deferred Update (No-Undo/Redo)
Changes are recorded in a temporary area or log and only applied to the database at commit time. Only requires redo logging (no undo needed — uncommitted changes were never written to disk). Simpler recovery but delays visibility of changes. Used in batch processing and nightly banking updates.
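For comparison, a toy sketch of deferred update, assuming writes are buffered per transaction in memory. The class DeferredUpdateDB is hypothetical. Because nothing reaches the database before commit, aborting only discards the buffer and the log needs redo information alone.

```python
class DeferredUpdateDB:
    def __init__(self):
        self.data = {}      # stands in for the database on disk
        self.log = []       # redo-only log
        self.pending = {}   # txn -> {key: value}, not yet in the database

    def write(self, txn, key, value):
        self.pending.setdefault(txn, {})[key] = value

    def commit(self, txn):
        writes = self.pending.pop(txn, {})
        for key, value in writes.items():
            self.log.append(("redo", txn, key, value))
        self.log.append(("commit", txn))
        # Only now are the changes applied to the database.
        self.data.update(writes)

    def abort(self, txn):
        # No undo needed: the database was never touched.
        self.pending.pop(txn, None)


db = DeferredUpdateDB()
db.write(1, "balance", 100)
before_commit = dict(db.data)
db.commit(1)
db.write(2, "balance", 0)
db.abort(2)
```

The database stays empty until transaction 1 commits, and the aborted write of transaction 2 leaves no trace in either the database or the log.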
Shadow Paging
A copy-on-write mechanism: the database maintains two page sets — current and shadow. Changes are made to copies of pages. On commit, the system atomically switches from shadow to current pages. On failure, the system reverts to the shadow pages. Clean but resource-intensive (duplicate pages for every modified page). Less common in high-transaction environments — ARIES with WAL is preferred.
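A minimal sketch of the copy-on-write idea, assuming a page table that maps logical page numbers to physical slots. The class ShadowPagedStore is hypothetical, and the assignment in commit stands in for what would be an atomic on-disk pointer swap in a real system.

```python
class ShadowPagedStore:
    def __init__(self, pages):
        self.storage = dict(enumerate(pages))        # physical slot -> contents
        self.shadow = {i: i for i in self.storage}   # committed page table
        self.current = dict(self.shadow)             # working page table

    def read(self, page_no):
        return self.storage[self.current[page_no]]

    def write(self, page_no, contents):
        # Copy-on-write: put the new version in a fresh slot, never overwrite.
        slot = len(self.storage)
        self.storage[slot] = contents
        self.current[page_no] = slot

    def commit(self):
        # The current table becomes the new shadow table.
        self.shadow = dict(self.current)

    def crash(self):
        # Recovery simply falls back to the last committed table.
        self.current = dict(self.shadow)


store = ShadowPagedStore(["a", "b"])
store.write(0, "A")
uncommitted_view = store.read(0)
store.crash()
after_crash = store.read(0)

store.write(1, "B")
store.commit()
store.crash()
after_commit_and_crash = store.read(1)
```

The uncommitted write to page 0 disappears after the crash, while the committed write to page 1 survives. The old slots are never reclaimed here, which hints at the space overhead mentioned above.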
Catastrophic (Media) Failure Recovery
For disk failures beyond what the log can handle:
- Database Dump — Full backup to a separate storage medium
- Fuzzy Dump — Backup taken without pausing transactions (requires log replay from dump start point)
- Recovery: Restore dump → replay log from dump point → apply normal ARIES recovery
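The restore sequence can be sketched roughly as follows. This is a deliberate simplification and not full ARIES: it assumes the dump was taken with transactions suspended (so it holds only committed data), that the log survived on a separate device, and it collapses the final recovery pass into re-applying committed updates only. The function name and record keys are hypothetical.

```python
def restore_after_media_failure(dump, dump_lsn, log):
    """Rebuild the database from a dump plus the log written after it."""
    db = dict(dump)                                   # step 1: restore the dump
    committed = {r["txn"] for r in log if r["type"] == "commit"}
    for r in log:                                     # step 2: replay the log
        if (r["lsn"] > dump_lsn and r["type"] == "update"
                and r["txn"] in committed):
            db[r["key"]] = r["redo"]
    return db


dump = {"x": 1}
log = [
    {"lsn": 5, "type": "update", "txn": 1, "key": "x", "redo": 1},
    {"lsn": 6, "type": "commit", "txn": 1},
    # ---- dump taken at LSN 10 ----
    {"lsn": 11, "type": "update", "txn": 2, "key": "x", "redo": 2},
    {"lsn": 12, "type": "commit", "txn": 2},
    {"lsn": 13, "type": "update", "txn": 3, "key": "y", "redo": 9},
]
restored = restore_after_media_failure(dump, dump_lsn=10, log=log)
```

Transaction 2 committed after the dump, so its change is replayed. Transaction 3 never committed, so its change is left out. A fuzzy dump would need the full undo machinery instead of this shortcut.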
Backup Process Steps
1. Suspend active transactions: pause all running transactions
2. Flush log: write all in-memory log records to stable storage
3. Flush buffers: force-write all modified buffer pool pages to disk
4. Copy database: copy the entire database to stable storage (tape, remote disk)
5. Log the dump: write a special dump record to the transaction log
6. Resume transactions: restart suspended transactions
Steps 1-3 and 5 mirror a conventional (non-fuzzy) checkpoint. A fuzzy dump variant allows transactions to continue during the copy.
Environmental Disaster Recovery
For catastrophic events affecting the primary data center:
- Maintain remote backup at a geographically distant site
- Continuously synchronize via log record transfer (WAL shipping)
- On primary site failure: the secondary site restores its most recent backup, replays the shipped log records, then takes over transaction processing
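A rough sketch of the standby side of log shipping, assuming records arrive in batches that may be reordered or delivered twice. The Standby class, its methods, and the record keys are hypothetical; gap detection and transaction boundaries are left out.

```python
class Standby:
    def __init__(self, backup, backup_lsn):
        self.db = dict(backup)          # restored from the most recent backup
        self.applied_lsn = backup_lsn   # highest LSN reflected in self.db
        self.active = False

    def receive(self, records):
        """Apply a shipped batch in LSN order, ignoring duplicates."""
        for rec in sorted(records, key=lambda r: r["lsn"]):
            if rec["lsn"] <= self.applied_lsn:
                continue                # already applied
            self.db[rec["key"]] = rec["value"]
            self.applied_lsn = rec["lsn"]

    def promote(self):
        """Take over transaction processing after the primary fails."""
        self.active = True
        return self.applied_lsn


standby = Standby(backup={"x": 1}, backup_lsn=10)
standby.receive([
    {"lsn": 12, "key": "y", "value": 5},
    {"lsn": 11, "key": "x", "value": 2},
])
standby.receive([{"lsn": 12, "key": "y", "value": 5}])   # duplicate delivery
takeover_lsn = standby.promote()
```

Tracking the highest applied LSN is what makes re-delivery harmless; anything the primary logged after that LSN but did not ship before failing is lost to the standby.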