Database Concurrency

Concurrency Control in Database Management Systems (DBMS) is very important to facilitate multi user interactions and enable multiple processes access and modify the database simultaneously without causing data inconsistencies, integrity violations and other issues.

Concurrency Control in Databases

  • Simultaneous Access - In a multi user database, multiple users or processes can attempt to access and modify the database simultaneously, without proper control this can result in data corruption and inconsistency issues.
  • Concurrency Problems: Several issues can arise when multiple transactions (sequences of database operations) run concurrently
    • Lost Updates: When two transactions try to update the data, one transaction can overwrite the other, resulting in data loss.
    • Dirty Reads: One transaction can read the uncommitted changes by a different transaction, and can result in inconsistent data if the transaction was rolled back.
    • Uncommitted Data: A transaction can read the data that is being modified by a different transaction but has not committed yet, resulting in incorrect data retrieval
    • Inconsistent Retrieval: Inconsistent data can be read if one transaction reads some data before another transaction updates it, leading to a view of the database that doesn’t reflect any single consistent state.
  • Concurrency Control Techniques
    • Locking Transactions lock the data they want to access or modify. Locks prevent other transactions from accessing the same data until the lock is released. This ensures that only one transaction can modify a piece of data at a time.
    • Isolation Levels DBMS offers different isolation levels (e.g., Read Uncommitted, Read Committed, Repeatable Read, Serializable) that define the degree of isolation between concurrent transactions. Each level balances data consistency with performance and concurrency.
    • Two-Phase Locking (2PL)
    • Timestamp Ordering : Each Transactions is assigned a unique timestamp to preserve the the order of transactions, the older transactions get the higher priority to avoid conflicts with the newer transactions
    • Conflict Resolution When conflicts between transactions occur, the Database Management Systems (DBMS) employs conflict resolution techniques such as waiting, aborting or rolling back a transaction to ensure data integrity.
  • Transaction Isolation The choice of isolation level determines how transactions interact with each other, the higher levels of isolation is provide stronger guarantees but can affect system performance due to increase in locking and blocking.
  • Commit and Rollback Transactions in the Database Management Systems (DBMS) can be committed (making their changes permanent) or rolled back (undoing the changes). Concurrency control ensures that transactions can be safely committed without conflicting with other transactions.

Conflict Types

Every concurrency anomaly stems from one of three fundamental conflict patterns between operations from different transactions on the same data item:

ConflictPatternAnomaly Caused
Write-Write (WW)T₁ writes X, T₂ writes XLost Update — one write overwrites the other
Write-Read (WR)T₁ writes X, T₂ reads XDirty Read — T₂ reads uncommitted data from T₁
Read-Write (RW)T₁ reads X, T₂ writes XNon-Repeatable Read — T₁ re-reads and gets a different value

Phantom reads are a special case: T₁ executes a range query, T₂ inserts a new row matching the range, T₁ re-executes and finds new rows. Prevented only at the Serializable isolation level.

Understanding these conflicts is the foundation for Serializability theory — see the precedence graph test for determining if an interleaved schedule is safe.

Concurrency Control Deep Dives

For detailed treatment of each technique, see:

Deadlocks

A Deadlock occurs when two or more transactions each hold a lock that the other needs. The DBMS detects cycles in the wait-for graph and aborts one transaction (the victim) to break the deadlock. Prevention strategies include:

  • Wait-Die — Older transactions wait; younger ones abort and retry
  • Wound-Wait — Older transactions force younger ones to abort; younger ones wait
  • Timeouts — Abort transactions that wait too long

In short, concurrency control in DBMS is essential for managing simultaneous access to a database to maintain data consistency and integrity. It involves techniques like locking, isolation levels, and conflict resolution to prevent issues like lost updates and dirty reads when multiple transactions operate concurrently. The choice of concurrency control method should consider the specific requirements of the application and balance data consistency with system performance.