MVCC
Multiversion Concurrency Control (MVCC) is a concurrency control method that maintains multiple versions of each data item, allowing transactions to read a consistent snapshot without acquiring locks. This enables readers to never block writers and writers to never block readers — a significant advantage over lock-based protocols like Two-Phase Locking (2PL).
Core Idea
Instead of overwriting data in place, each write creates a new version of the row. Each version is tagged with:
- Creation timestamp (or transaction ID) — When this version was created
- Expiration timestamp — When this version was superseded (or ∞ if current)
A reading transaction sees the version that was current at its snapshot timestamp — the time the transaction (or statement) began.
How Reads Work
- Transaction T starts and gets a snapshot timestamp (e.g.,
ts = 100) - When T reads a row, the DBMS finds the version where
creation_ts ≤ 100 < expiration_ts - T sees a consistent view even if other transactions are modifying the same row concurrently
No read locks needed. No blocking.
How Writes Work
- Transaction T wants to update a row
- T creates a new version with
creation_ts = T's ID - The old version’s
expiration_tsis set to T’s ID - If another concurrent transaction already modified the same row (write-write conflict), one must abort (first-writer-wins or first-committer-wins)
Garbage Collection
Old versions that are no longer visible to any active transaction must be cleaned up. Strategies:
- Background vacuum — PostgreSQL’s VACUUM process removes dead tuples
- Cooperative cleanup — Transactions clean up old versions as they access pages
- Epoch-based — Versions older than the oldest active transaction’s snapshot are reclaimable
MVCC in Practice
PostgreSQL
Uses tuple versioning — each row has xmin (creating transaction) and xmax (deleting/updating transaction). Dead tuples accumulate until VACUUM runs. Supports Read Committed and Serializable Snapshot Isolation (SSI).
MySQL / InnoDB
Uses an undo log to reconstruct old versions on demand. The current version lives in the main tablespace; readers follow the undo chain to find their snapshot version. More space-efficient than PostgreSQL’s approach but undo log can grow under long-running transactions.
SQLite (WAL mode)
Readers see the database state at the start of their read transaction. Writers append to the WAL; readers ignore WAL entries newer than their snapshot.
MVCC vs Lock-Based Concurrency
| Aspect | MVCC | Two-Phase Locking (2PL) |
|---|---|---|
| Readers block writers | No | Yes (shared locks) |
| Writers block readers | No | Yes (exclusive locks) |
| Write-write conflicts | Abort one writer | Block then proceed |
| Deadlock risk | Lower (only write-write) | Higher |
| Storage overhead | Multiple versions | None |
| Garbage collection | Required | Not needed |
Optimistic Concurrency Control (OCC)
OCC is an alternative to both locking and MVCC. It assumes conflicts are rare and lets transactions execute without any locks, validating only at commit time:
Three Phases
graph LR R[Read Phase] --> V[Validation Phase] --> W[Write Phase]
-
Read Phase — Transaction executes all reads and writes against a private copy (local workspace). No locks acquired. Records both the read set (data items read) and write set (data items to modify).
-
Validation Phase — At commit time, checks whether any transaction that committed since the read phase started modified items in this transaction’s read set. If conflict detected → abort and retry.
-
Write Phase — If validation passes, apply the private changes to the actual database atomically.
| Aspect | OCC | MVCC | Two-Phase Locking (2PL) |
|---|---|---|---|
| Locks during execution | None | None (for reads) | S-locks and X-locks |
| Conflict detection | At commit | At commit (write-write) | At lock time |
| Deadlock risk | None | Low | High |
| Best for | Read-heavy, low contention | Mixed workloads | Write-heavy, high contention |
| Abort cost | High (redo entire transaction) | Moderate | Low (just wait) |
Snapshot Isolation vs Serializability
MVCC naturally provides Snapshot Isolation (SI) — each transaction sees a consistent snapshot. However, SI is not the same as serializability — it permits write skew anomalies. PostgreSQL’s Serializable Snapshot Isolation (SSI) adds conflict detection on top of MVCC to achieve full serializability.