MVCC

Multiversion Concurrency Control (MVCC) is a concurrency control method that maintains multiple versions of each data item, allowing transactions to read a consistent snapshot without acquiring locks. This enables readers to never block writers and writers to never block readers — a significant advantage over lock-based protocols like Two-Phase Locking (2PL).

Core Idea

Instead of overwriting data in place, each write creates a new version of the row. Each version is tagged with:

  • Creation timestamp (or transaction ID) — When this version was created
  • Expiration timestamp — When this version was superseded (or ∞ if current)

A reading transaction sees the version that was current at its snapshot timestamp — the time the transaction (or statement) began.

How Reads Work

  1. Transaction T starts and gets a snapshot timestamp (e.g., ts = 100)
  2. When T reads a row, the DBMS finds the version where creation_ts ≤ 100 < expiration_ts
  3. T sees a consistent view even if other transactions are modifying the same row concurrently

No read locks needed. No blocking.

How Writes Work

  1. Transaction T wants to update a row
  2. T creates a new version with creation_ts = T's ID
  3. The old version’s expiration_ts is set to T’s ID
  4. If another concurrent transaction already modified the same row (write-write conflict), one must abort (first-writer-wins or first-committer-wins)

Garbage Collection

Old versions that are no longer visible to any active transaction must be cleaned up. Strategies:

  • Background vacuumPostgreSQL’s VACUUM process removes dead tuples
  • Cooperative cleanup — Transactions clean up old versions as they access pages
  • Epoch-based — Versions older than the oldest active transaction’s snapshot are reclaimable

MVCC in Practice

PostgreSQL

Uses tuple versioning — each row has xmin (creating transaction) and xmax (deleting/updating transaction). Dead tuples accumulate until VACUUM runs. Supports Read Committed and Serializable Snapshot Isolation (SSI).

MySQL / InnoDB

Uses an undo log to reconstruct old versions on demand. The current version lives in the main tablespace; readers follow the undo chain to find their snapshot version. More space-efficient than PostgreSQL’s approach but undo log can grow under long-running transactions.

SQLite (WAL mode)

Readers see the database state at the start of their read transaction. Writers append to the WAL; readers ignore WAL entries newer than their snapshot.

MVCC vs Lock-Based Concurrency

AspectMVCCTwo-Phase Locking (2PL)
Readers block writersNoYes (shared locks)
Writers block readersNoYes (exclusive locks)
Write-write conflictsAbort one writerBlock then proceed
Deadlock riskLower (only write-write)Higher
Storage overheadMultiple versionsNone
Garbage collectionRequiredNot needed

Optimistic Concurrency Control (OCC)

OCC is an alternative to both locking and MVCC. It assumes conflicts are rare and lets transactions execute without any locks, validating only at commit time:

Three Phases

graph LR
    R[Read Phase] --> V[Validation Phase] --> W[Write Phase]
  1. Read Phase — Transaction executes all reads and writes against a private copy (local workspace). No locks acquired. Records both the read set (data items read) and write set (data items to modify).

  2. Validation Phase — At commit time, checks whether any transaction that committed since the read phase started modified items in this transaction’s read set. If conflict detected → abort and retry.

  3. Write Phase — If validation passes, apply the private changes to the actual database atomically.

AspectOCCMVCCTwo-Phase Locking (2PL)
Locks during executionNoneNone (for reads)S-locks and X-locks
Conflict detectionAt commitAt commit (write-write)At lock time
Deadlock riskNoneLowHigh
Best forRead-heavy, low contentionMixed workloadsWrite-heavy, high contention
Abort costHigh (redo entire transaction)ModerateLow (just wait)

Snapshot Isolation vs Serializability

MVCC naturally provides Snapshot Isolation (SI) — each transaction sees a consistent snapshot. However, SI is not the same as serializability — it permits write skew anomalies. PostgreSQL’s Serializable Snapshot Isolation (SSI) adds conflict detection on top of MVCC to achieve full serializability.

Sources