Serializability
Serializability is the correctness criterion for concurrent transaction execution. A schedule (interleaving of operations from multiple transactions) is serializable if its effect is equivalent to some serial execution — as if the transactions ran one at a time, in some order.
Types
Conflict Serializability
Two operations conflict when they come from different transactions, target the same data item, and at least one is a write. The three conflict types:
| Conflict | Pattern | Example |
|---|---|---|
| Write-Write (WW) | w₁(X), w₂(X) | Lost update |
| Write-Read (WR) | w₁(X), r₂(X) | Dirty read |
| Read-Write (RW) | r₁(X), w₂(X) | Non-repeatable read |
Two schedules are conflict equivalent if one can be transformed into the other by swapping only non-conflicting adjacent operations.
Precedence Graph Test
Build a directed graph G = (N, E) where each transaction is a node:
- For each Write-Read conflict w_i(Q) before r_j(Q): add edge T_i → T_j
- For each Read-Write conflict r_i(Q) before w_j(Q): add edge T_i → T_j
- For each Write-Write conflict w_i(Q) before w_j(Q): add edge T_i → T_j
- If the graph is acyclic → conflict serializable. A topological sort gives a valid serial order.
- If a cycle exists → NOT conflict serializable.
View Serializability
A broader criterion than conflict serializability. Schedule S is view equivalent to S’ if:
- Same set of transactions with identical operations
- Consistent initial reads — If T_i reads the initial value of Q in S, it does so in S’ too
- Read-after-write order — If T_i reads Q after T_j wrote Q in S, the same holds in S’
- Matching final writes — If T_i performs the last write on Q in S, it does so in S’
Under the constrained write assumption (every write is derived from a prior read), every conflict serializable schedule is also view serializable, but not vice versa. Testing view serializability is NP-complete, so conflict serializability is used in practice.
Schedule Notation
Schedules are written in shorthand: r (read), w (write), c (commit), a (abort), with transaction ID as subscript:
S1: r₁(A); w₁(A); r₁(B); w₁(B); c₁; r₂(B); w₂(B); r₂(C); w₂(C); c₂;
A serial schedule executes one transaction at a time (n! possible orderings for n transactions). Serial schedules are always correct but inefficient (CPU idle during I/O).
Recoverability
Not all serializable schedules are safe from cascading failures:
| Schedule Type | Rule | Why It Matters |
|---|---|---|
| Recoverable | If T_j reads data written by T_i, T_i must commit before T_j commits | Prevents committing based on uncommitted data |
| Cascadeless | T_j only reads data already committed by T_i | Eliminates cascading rollbacks entirely |
| Strict | T_j cannot read OR write data written by T_i until T_i commits | Simplest recovery — matches Strict 2PL behavior |
Cascadeless ⊂ Strict ⊂ Recoverable. Most practical systems enforce cascadeless or strict schedules.
Ensuring Serializability
| Mechanism | How it ensures serializability |
|---|---|
| Two-Phase Locking (2PL) | Lock protocol prevents conflicting interleavings |
| MVCC + SSI | Snapshot Isolation + conflict detection (PostgreSQL Serializable) |
| Timestamp ordering | Transactions execute in timestamp order |
| Optimistic concurrency | Validate at commit time; abort if conflicts detected |
SQL Isolation Levels and Serializability
Only the Serializable isolation level guarantees serializability. Lower levels trade correctness for performance:
- Read Committed — Not serializable (allows non-repeatable reads)
- Repeatable Read — Not serializable (allows write skew in standard SQL)
- Serializable — Full serializability guaranteed