Hello dear students! Welcome to this complete lesson on Concurrency Control Techniques. This is Chapter 4 of your Advanced Database course, and I promise to walk you through every single concept step by step. Ready? Let’s go!
1. What is Concurrency Control?
Imagine a bank. Two customers walk in at the exact same time. Both want to withdraw money from the same joint account. If the bank allows both to read the balance at the same time, both might see 10,000 Birr, and both might withdraw 8,000 Birr. The result? The account goes negative, and nobody notices until it’s too late.
This is exactly the kind of problem that concurrency control solves. In a database system, concurrency control techniques are methods used to ensure the isolation property of transactions that execute at the same time. When multiple transactions run concurrently, we need to make sure they don’t interfere with each other in ways that produce wrong results.
Most of these techniques ensure serializability of schedules by using protocols. A serializable schedule is one that produces the same result as if the transactions had run one after another, in some order. The main techniques we study are based on the concept of locking data items.
Q1. What is the main purpose of concurrency control techniques in a DBMS?
Q2. True or False: A serializable schedule means transactions literally execute one at a time.
Q3. Why is the isolation property important for concurrent transactions?
A1. Concurrency control techniques ensure that multiple transactions executing simultaneously do not interfere with each other, maintaining the isolation property and guaranteeing serializability of the resulting schedule.
A2. False. A serializable schedule does NOT mean transactions execute one at a time. It means the result is equivalent to some serial execution. The transactions can interleave, but the final outcome must be the same as if they ran sequentially.
A3. The isolation property ensures that each transaction executes as if it were the only transaction running. Without isolation, one transaction could read intermediate (uncommitted) data from another, leading to inconsistent results such as dirty reads, lost updates, or phantom reads.
2. Understanding Locks
Now, what exactly is a lock? Think of it like a bathroom door. When someone goes inside, they lock the door. Nobody else can enter until the person inside unlocks and comes out. In a database, a lock is a variable associated with a data item that describes the status of that item: whether someone is using it or not.
There is one lock for each data item in the database. Locks are used to synchronize access by concurrent transactions. When a transaction wants to use a data item, it must first acquire the lock. The lock guarantees exclusive use of the data item to the current transaction. No other transaction can access that item while it is locked.
The transaction acquires the lock before data access and releases it (unlocks) when the transaction is completed. This approach assumes that conflicts between transactions are likely to occur, so we lock things proactively. This is why it is called pessimistic locking: we assume the worst case and prepare for it.
3. Lock Granularity: At What Level Do We Lock?
Lock granularity refers to the level or size of the data item that we lock. We can lock at different levels. Let me walk you through each level from the largest to the smallest.
3.1 Database Level Locking
The entire database is locked. While transaction T1 is running, no other transaction can access any table in the entire database. This is extremely restrictive.
Even if T2 wants to use a completely different table, it must wait. This makes data access very slow and is completely unsuitable for multiuser DBMSs.
3.2 Table Level Locking
Here, only the entire table is locked. T1 locks Table A. T2 can still access Table B (in the same database) but cannot touch Table A. This is better than database-level locking but still causes traffic jams when many transactions want the same table.
3.3 Page Level Locking
A disk page (also called a disk block) is a directly addressable section of a disk. A table can span several pages, and one page can contain several rows. The DBMS locks an entire page. Two transactions can access the same table as long as they work on different pages.
Page-level locks are currently the most frequently used locking method in multiuser DBMSs.
3.4 Row Level Locking
Only a single row is locked. Multiple transactions can access different rows of the same table, even if those rows are on the same page. This is much less restrictive and improves data availability.
However, the management overhead is very high because a separate lock exists for each row. Modern DBMSs use a smart trick: if a transaction requests many locks on the same page, the system automatically upgrades from row-level locking to page-level locking to reduce overhead. This automatic upgrade is known as lock escalation.
3.5 Field (Attribute) Level Locking
The most granular level. Two transactions can access the same row as long as they use different fields (columns). For example, T1 updates the “Name” field while T2 updates the “Salary” field of the same row.
This sounds great in theory, but it is rarely implemented because it requires extremely high computer overhead. In practice, row-level locking is much more useful.
Database level – Most restrictive, worst concurrency, lowest overhead
Table level – Very restrictive, causes traffic jams
Page level – Most commonly used in practice, good balance
Row level – Less restrictive, high overhead, good availability
Field level – Least restrictive, highest overhead, rarely used
Trade-off: As granularity gets finer (smaller locks), concurrency improves but overhead increases.
Q1 (MCQ). Which lock granularity level is the most frequently used in multiuser DBMSs?
a) Database level
b) Table level
c) Page level
d) Field level
Q2 (MCQ). Why is field-level locking rarely implemented in practice?
a) It does not improve concurrency
b) It requires extremely high computer overhead
c) It causes more deadlocks than row-level locking
d) It cannot prevent the lost update problem
Q3. A student says: “Database-level locking is the safest because it locks everything.” Explain why this is wrong for a multiuser system.
A1. c) Page level. Page-level locking is currently the most frequently used locking method in multiuser DBMSs because it offers a good balance between concurrency and overhead.
A2. b) It requires extremely high computer overhead. While field-level locking provides the most flexible access, managing a separate lock for every single attribute in every row creates enormous overhead that outweighs the benefits.
A3. While database-level locking does prevent all conflicts, it defeats the purpose of a multiuser system. If the entire database is locked, only ONE transaction can run at a time. This makes the system effectively single-user, causing very slow data access. The whole point of a multiuser DBMS is to allow concurrent access β so database-level locking is counterproductive.
4. Types of Locks: Binary Locks
Now let’s look at what types of locks the DBMS can use. The first type is the simplest one.
4.1 Binary Locks
A binary lock has only two states: locked (1) or unlocked (0). That’s it. Simple.
Here’s how it works: Every database operation requires the affected object to be locked. If the object is locked (LOCK(X) = 1), the requesting transaction must wait. If the object is unlocked (LOCK(X) = 0), the lock is set to 1 and the transaction proceeds. After the transaction finishes, it issues an unlock_item(X) operation to set LOCK(X) back to 0.
Two operations are defined:
- lock_item(X): if LOCK(X) = 1, the requesting transaction waits; otherwise LOCK(X) is set to 1 and the transaction proceeds
- unlock_item(X): LOCK(X) is set back to 0, and one waiting transaction (if any) is woken up
Worked Example: Transaction T1 wants to read and write data item X.
- T1 issues lock_item(X)
- System checks: LOCK(X) = 0 (unlocked) → sets LOCK(X) = 1
- T1 reads X, performs computation, writes new value to X
- T1 issues unlock_item(X) → LOCK(X) = 0, next waiting transaction (if any) is woken up
Binary locks eliminate the lost update problem because the lock is not released until the WRITE is complete. However, binary locks are now considered too restrictive: they don’t even allow multiple transactions to simply READ the same data. So they are not used much in practice.
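To make the binary-lock rules concrete, here is a minimal sketch in Python. It is an illustrative single-process model, not a real DBMS API; the class and method names (`BinaryLockManager`, `lock_item`, `unlock_item`) simply mirror the operations described above.

```python
import threading

class BinaryLockManager:
    """One binary lock per data item: LOCK(X) is either 0 (unlocked) or 1 (locked).
    Illustrative sketch only, using threads as stand-in transactions."""
    def __init__(self):
        self._cond = threading.Condition()
        self._locked = {}  # item -> 0 or 1

    def lock_item(self, x):
        with self._cond:
            # If LOCK(X) = 1, the requesting transaction must wait.
            while self._locked.get(x, 0) == 1:
                self._cond.wait()
            self._locked[x] = 1  # set LOCK(X) = 1 and proceed

    def unlock_item(self, x):
        with self._cond:
            self._locked[x] = 0   # LOCK(X) = 0
            self._cond.notify()   # wake one waiting transaction, if any
```

Because a transaction holds the lock across its entire read-compute-write cycle, no other transaction can slip in between the read and the write, which is exactly why the lost update problem disappears.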
5. Shared/Exclusive Locks (Read/Write Locks)
This is a much more practical approach. Instead of just two states, we have three states: read-locked (shared), write-locked (exclusive), or unlocked.
5.1 Shared Lock (S-lock / Read Lock)
A shared lock is issued when a transaction wants to read data. Multiple transactions can hold a shared lock on the same data item simultaneously. Why? Because reading doesn’t change data, so there is no conflict. Think of it like multiple people reading the same notice board at the same time: no problem at all.
5.2 Exclusive Lock (X-lock / Write Lock)
An exclusive lock is issued when a transaction wants to write (update) data. Only ONE transaction can hold an exclusive lock on a data item. No other transaction, not even a reader, can access the item while an exclusive lock exists. Think of it like someone erasing and rewriting the notice board: nobody else should be reading or writing at that moment.
5.3 Lock Operations for Shared/Exclusive Locks
Notice something important: we keep a counter called no_of_reads(X). This tracks how many transactions are currently holding a shared lock on X. The item stays read-locked as long as at least one transaction is reading it. It only becomes unlocked when the counter drops to zero.
Worked Example: Three transactions T1, T2, T3 all want to read item X, then T4 wants to write X.
- T1 issues read_lock(X) → LOCK(X) = “read-locked”, no_of_reads = 1
- T2 issues read_lock(X) → already read-locked, so no_of_reads = 2
- T3 issues read_lock(X) → no_of_reads = 3
- All three read X simultaneously → no conflict
- T4 issues write_lock(X) → LOCK(X) is “read-locked”, so T4 waits
- T1, T2, T3 each issue unlock(X). After the third unlock, no_of_reads = 0, LOCK(X) becomes “unlocked”
- T4 is woken up, gets write lock, proceeds to write X
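The shared/exclusive rules, including the no_of_reads(X) counter, can be sketched the same way. This is an illustrative single-process model (lock conversion is omitted); the class name `RWLockManager` is my own, not from the chapter.

```python
import threading

class RWLockManager:
    """Shared/exclusive locks per item, tracking no_of_reads(X) as in the text.
    Illustrative sketch only."""
    UNLOCKED, READ_LOCKED, WRITE_LOCKED = "unlocked", "read-locked", "write-locked"

    def __init__(self):
        self._cond = threading.Condition()
        self._state = {}        # item -> lock state
        self._no_of_reads = {}  # item -> number of current readers

    def read_lock(self, x):
        with self._cond:
            # Readers wait only while a writer holds X.
            while self._state.get(x, self.UNLOCKED) == self.WRITE_LOCKED:
                self._cond.wait()
            self._state[x] = self.READ_LOCKED
            self._no_of_reads[x] = self._no_of_reads.get(x, 0) + 1

    def write_lock(self, x):
        with self._cond:
            # Writers need exclusive access: wait until X is fully unlocked.
            while self._state.get(x, self.UNLOCKED) != self.UNLOCKED:
                self._cond.wait()
            self._state[x] = self.WRITE_LOCKED

    def unlock(self, x):
        with self._cond:
            if self._state.get(x) == self.READ_LOCKED:
                self._no_of_reads[x] -= 1
                if self._no_of_reads[x] == 0:   # last reader leaves
                    self._state[x] = self.UNLOCKED
                    self._cond.notify()          # wake one waiter
            else:
                self._state[x] = self.UNLOCKED
                self._cond.notify()
```

Replaying the worked example with this class: three read_lock(X) calls drive no_of_reads to 3, the item stays “read-locked” through the first two unlocks, and only the third unlock returns it to “unlocked” so a writer can get in.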
5.4 Lock Conversion
Sometimes a transaction first reads data and then decides to update it. This means it needs to convert its shared lock to an exclusive lock. This is called upgrading.
Conversely, if a transaction has finished writing but still needs to keep reading, it can convert from exclusive to shared. This is called downgrading.
Upgrading: Shared → Exclusive. The transaction must be the only one holding the read lock, or it must wait until all others release their shared locks.
Downgrading: Exclusive → Shared. This is always safe and can be done immediately.
Q1 (MCQ). Under which condition can a deadlock NEVER occur?
a) When two transactions hold shared locks on the same item
b) When one transaction holds a shared lock and another requests an exclusive lock on the same item
c) When two transactions each hold an exclusive lock on different items and each requests the other’s item
d) When a transaction tries to upgrade its shared lock while another transaction holds a shared lock on the same item
Q2 (MCQ). In shared/exclusive locking, what happens when no_of_reads(X) reaches 0 during unlock?
a) The item is permanently deleted
b) LOCK(X) is set to “write-locked” automatically
c) LOCK(X) is set to “unlocked” and one waiting transaction is woken up
d) The lock manager aborts all waiting transactions
Q3. Explain the difference between upgrading and downgrading a lock. Why is upgrading more restrictive?
Q4 (Fill in the blank). A shared lock allows ________ transactions to read the same data item simultaneously, while an exclusive lock allows only ________ transaction to access the item.
A1. a) When two transactions hold shared locks on the same item. Shared locks are compatible with each other. Since no transaction is requesting an exclusive lock, there is no circular wait condition, and deadlock cannot occur. Options b, c, and d all involve exclusive lock requests, which can lead to deadlocks.
A2. c) LOCK(X) is set to “unlocked” and one waiting transaction is woken up. When the read counter drops to zero, it means no transaction is currently reading the item. The lock is released, and the lock manager wakes up one transaction that was waiting for the lock.
A3. Upgrading converts a shared lock to an exclusive lock (read → write). It is restrictive because the transaction must be the only holder of the shared lock. If other transactions are also reading, the upgrade request must wait, which can cause deadlock. Downgrading converts an exclusive lock to a shared lock (write → read). This is safe because adding readers doesn’t conflict: other transactions can immediately start reading.
A4. multiple; one
6. Two-Phase Locking Protocol (2PL)
Here comes one of the most important topics in this chapter. The Two-Phase Locking Protocol (2PL) is a fundamental protocol that guarantees serializability.
The rule is simple but powerful: All locking operations must precede the first unlock operation in a transaction. That’s the entire rule. If every transaction follows this, the schedule is guaranteed to be serializable.
The Two Phases
In the growing phase, the transaction keeps acquiring locks. It cannot release any lock during this phase. At some point, it has acquired all the locks it needs β this point is called the lock point. After the lock point, the shrinking phase begins. In this phase, the transaction releases locks but cannot acquire any new lock.
Worked Example: T1 executes lock(X); lock(Y); read(X); write(Y); unlock(X); unlock(Y). The growing phase is lock(X), lock(Y); the lock point is reached immediately after lock(Y); the shrinking phase is unlock(X), unlock(Y). Every lock precedes the first unlock, so T1 obeys 2PL.
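The 2PL rule is mechanical enough to check with a few lines of code. Here is a small helper of my own (not from the text) that verifies whether a sequence of lock/unlock operations obeys the protocol:

```python
def follows_2pl(ops):
    """Return True if a sequence of ("lock", item) / ("unlock", item)
    operations obeys Two-Phase Locking: no lock may appear after the
    first unlock, and no item may be unlocked before it was locked.
    Other operations (read/write) are simply ignored."""
    held = set()
    shrinking = False  # becomes True at the first unlock
    for op, item in ops:
        if op == "lock":
            if shrinking:
                return False  # acquiring a lock in the shrinking phase
            held.add(item)
        elif op == "unlock":
            if item not in held:
                return False  # unlocking an item that was never locked
            shrinking = True
            held.discard(item)
    return True
```

Running the quiz options through it: option b passes, option a fails (lock(Y) after unlock(X)), and option d fails (unlock before lock).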
Q1 (MCQ). Which of the following transactions follows the Two-Phase Locking Protocol?
a) lock(X); read(X); unlock(X); lock(Y); write(Y); unlock(Y); commit
b) lock(X); lock(Y); read(X); write(Y); unlock(X); unlock(Y); commit
c) lock(X); read(X); lock(Y); unlock(X); write(Y); unlock(Y); commit
d) unlock(X); lock(X); read(X); lock(Y); write(Y); unlock(Y); commit
Q2 (True/False). If all transactions in a schedule follow 2PL, the schedule is guaranteed to be serializable and deadlock-free.
Q3. What is the “lock point” in 2PL?
A1. b) lock(X); lock(Y); read(X); write(Y); unlock(X); unlock(Y); commit. All locks [lock(X), lock(Y)] come before the first unlock [unlock(X)], so 2PL is satisfied. Option a violates 2PL because lock(Y) comes after unlock(X). Option d is invalid because it unlocks X before ever locking it. Note that option c also technically satisfies 2PL (both lock(X) and lock(Y) precede the first unlock, and no lock follows it), so the question has two defensible answers; b is the cleanest textbook example.
A2. False. 2PL guarantees serializability but does NOT guarantee deadlock freedom. Deadlocks can absolutely occur in 2PL when two transactions each hold a lock and wait for the other’s lock.
A3. The lock point is the point in the transaction timeline where the growing phase ends and the shrinking phase begins. It occurs right after the transaction has acquired its last lock. After this point, no new locks can be acquired β only releases happen.
7. Variations of Two-Phase Locking
The basic 2PL we just learned guarantees serializability but has problems: it can produce non-recoverable schedules and non-strict schedules. To fix this, we have three important variations.
7.1 Conservative 2PL (Static 2PL)
This variation requires a transaction to lock ALL items it will ever need BEFORE it begins execution. The transaction must pre-declare its read-set (all items it will read) and write-set (all items it will write). If even one item cannot be locked, the transaction locks nothing and waits until all items become available.
• Pre-locks ALL items before execution
• Deadlock-free, because a transaction either gets ALL locks or NONE (no partial waiting, so no circular wait)
• Not practical: it’s usually impossible to know all items a transaction will need before it starts running
7.2 Strict 2PL
This is the most popular variation used in practice. The rule: a transaction does not release any of its exclusive (write) locks until it commits or aborts.
What about shared (read) locks? Strict 2PL allows releasing shared locks before commit. But exclusive locks must be held until the very end.
This guarantees a strict schedule β meaning no other transaction can read or write uncommitted data. This makes the schedule recoverable and prevents cascading rollbacks.
However, Strict 2PL is NOT deadlock-free.
7.3 Rigorous 2PL
This is even more restrictive than Strict 2PL. The rule: a transaction does not release any locks at all, neither shared nor exclusive, until it commits or aborts.
All locks are held until commit/abort, then all are released together. This is actually easier to implement than Strict 2PL because the system doesn’t need to track which locks are shared vs exclusive for release timing β it just releases everything at the end.
• Conservative 2PL = deadlock-free but impractical
• Strict 2PL = most popular in practice, NOT deadlock-free, holds exclusive locks till commit
• Rigorous 2PL = holds ALL locks till commit, easiest to implement, NOT deadlock-free
• All three guarantee serializability; Strict and Rigorous 2PL additionally guarantee strict schedules
Q1 (MCQ). Which 2PL variation is the most widely used in practice?
a) Conservative 2PL
b) Basic 2PL
c) Strict 2PL
d) Rigorous 2PL
Q2 (MCQ). Why is Conservative 2PL considered impractical?
a) It does not guarantee serializability
b) It is not possible in most cases to determine the read-set and write-set before execution
c) It allows deadlocks to occur
d) It releases all locks too early, causing dirty reads
Q3 (MCQ). Which of the following is TRUE about Rigorous 2PL?
a) It releases shared locks before commit but holds exclusive locks until commit
b) It is deadlock-free
c) It holds all locks (shared and exclusive) until commit or abort
d) It requires pre-declaration of read and write sets
Q4. Differentiate between Strict 2PL and Rigorous 2PL. Which one is easier to implement and why?
A1. c) Strict 2PL. Strict 2PL is the most popular variation used in real database systems because it guarantees strict schedules (preventing dirty reads and cascading rollbacks) while being practical to implement.
A2. b) It is not possible in most cases to determine the read-set and write-set before execution. Conservative 2PL requires the transaction to know every data item it will ever access before it starts. In real applications, this is usually impossible because which data items are needed often depends on the results of intermediate computations and queries.
A3. c) It holds all locks (shared and exclusive) until commit or abort. This is the defining characteristic of Rigorous 2PL: it’s more restrictive than Strict 2PL, which only requires holding exclusive locks until commit. Rigorous 2PL holds EVERYTHING until commit/abort.
A4. Strict 2PL holds exclusive locks until commit but can release shared locks earlier. Rigorous 2PL holds ALL locks (both shared and exclusive) until commit or abort. Rigorous 2PL is easier to implement because the system doesn’t need to differentiate between lock types when deciding when to release. It simply releases all locks at commit/abort time in one batch, which simplifies the lock manager’s logic.
8. Deadlocks
Now we come to one of the most feared problems in database systems: deadlocks. Have you ever been in a situation where two people are standing face to face in a narrow corridor, and neither will step aside? Both are waiting for the other to move, and neither can move. That’s a deadlock!
In databases, a deadlock occurs when two or more transactions wait indefinitely for each other to unlock data.
Classic Deadlock Example
- T1 locks data item X, then requests a lock on Y
- T2 locks data item Y, then requests a lock on X
T1 now waits for T2 to release Y, while T2 waits for T1 to release X. Neither can ever proceed: this is a deadlock.
Important point: Deadlocks are possible ONLY when at least one transaction wants an exclusive lock. No deadlock condition can exist among shared locks alone. Multiple readers never block each other.
Conditions for Deadlock
For a deadlock to occur, four conditions must hold simultaneously (these are standard OS conditions applied to databases):
- Mutual Exclusion: At least one data item is held in a non-shareable mode (exclusive lock)
- Hold and Wait: A transaction holds at least one lock and is waiting for another
- No Preemption: Locks cannot be forcibly taken away from a transaction
- Circular Wait: A cycle exists in the wait-for relationship (T1 waits for T2, T2 waits for T1)
8.1 Deadlock Prevention Protocols
a) Using Conservative 2PL
We already learned this. Since a transaction either gets ALL locks or NONE, there is no partial waiting. This breaks the Hold and Wait condition. But it’s not practical.
b) Transaction Timestamps
Each transaction gets a unique timestamp TS(T) based on when it started. If T1 started before T2, then TS(T1) < TS(T2), so T1 is “older” and T2 is “younger.” When a conflict occurs, the timestamp is used to decide which transaction waits and which aborts. This breaks the Circular Wait condition by imposing a strict ordering.
c) No-Waiting (NW) Protocol
If a transaction cannot obtain a lock, it is immediately aborted and restarted after a delay. No waiting at all! This breaks the Hold and Wait condition. The problem? Transactions may be aborted needlessly even when no deadlock would have occurred.
d) Cautious Waiting Protocol
This is a smarter version. When transaction Ti tries to lock item X but X is locked by Tj with a conflicting lock:
- If Tj is NOT blocked (not waiting for anything else), then Ti is allowed to wait
- If Tj IS blocked (waiting for some other item), then Ti is aborted
This reduces unnecessary aborts compared to the No-Waiting protocol while still preventing deadlock.
8.2 Deadlock Detection Protocols
Instead of preventing deadlocks, we let them happen and then detect them. The DBMS periodically checks for deadlocks. When found, one transaction (the victim) is aborted and restarted.
The detection uses a Wait-For Graph (WFG): each active transaction is a node, and a directed edge Ti → Tj is drawn when Ti is waiting for a lock held by Tj. The system periodically builds this graph and searches it; a cycle in the graph means a deadlock exists.
Victim Selection is the process of choosing which transaction to abort. Common criteria include: the transaction that has done the least work, the transaction that is youngest, or the one that has been aborted the fewest times.
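Searching the Wait-For Graph for a cycle is ordinary depth-first search. A sketch (the function name and graph representation are my own):

```python
def has_deadlock(wait_for):
    """Detect a cycle in a Wait-For Graph given as a dict
    {txn: [txns it is waiting for]}. A cycle means deadlock."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited / on current path / done
    color = {t: WHITE for t in wait_for}

    def visit(t):
        color[t] = GRAY
        for u in wait_for.get(t, []):
            if color.get(u, WHITE) == GRAY:
                return True  # back edge to the current path: cycle found
            if color.get(u, WHITE) == WHITE and visit(u):
                return True
        color[t] = BLACK
        return False

    return any(color[t] == WHITE and visit(t) for t in wait_for)
```

Feeding it the scenario from Q4 below (T1→T2→T3→T4→T2) reports a deadlock, since T2, T3, and T4 form a cycle.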
• If the probability of deadlocks is low → use deadlock detection (less overhead during normal operation)
• If the probability of deadlocks is high → use deadlock prevention (avoid the cost of frequent rollbacks)
8.3 Timeouts
The simplest approach: if a transaction waits longer than a system-defined timeout period, it is aborted. No graph building, no complex detection. Very low overhead.
The downside? A transaction might be aborted even if it was NOT deadlocked, just slow. This is a crude method but practical due to its simplicity.
8.4 Starvation
Starvation is different from deadlock. In starvation, a transaction cannot proceed for an indefinite period while other transactions continue normally. It’s like being at a counter where people keep cutting in front of you, and you never get served.
Causes:
- Unfair waiting scheme: some transactions always get priority
- Victim selection algorithm keeps choosing the same transaction as the victim repeatedly
Solutions:
- Use a first-come-first-served (FCFS) queue for lock requests
- Use priorities but increase a transaction’s priority the longer it waits; eventually it gets the highest priority
- Give higher priority to transactions that have been aborted multiple times
• Deadlock = transactions waiting for EACH OTHER (circular wait)
• Starvation = ONE transaction keeps waiting while others proceed
• Deadlock prevention: Conservative 2PL, timestamps, No-Waiting, Cautious Waiting
• Deadlock detection: Wait-For Graph, look for cycles
• Timeouts: simple but may abort non-deadlocked transactions
• Starvation solution: FCFS queue or aging (increasing priority over time)
Q1 (MCQ). In the Wait-For Graph, a deadlock is indicated by:
a) A node with no edges
b) A directed edge from Ti to Tj
c) A cycle in the graph
d) A node with multiple incoming edges
Q2 (MCQ). Under the Cautious Waiting Protocol, if Ti requests a lock held by Tj, and Tj is currently blocked (waiting for another lock), what happens to Ti?
a) Ti waits for Tj to finish
b) Ti is aborted
c) Ti preempts Tj’s lock
d) Ti is given higher priority than Tj
Q3 (MCQ). Which deadlock control method is recommended when the probability of deadlocks is low?
a) Conservative 2PL
b) No-Waiting Protocol
c) Deadlock Detection
d) Cautious Waiting Protocol
Q4. Draw a Wait-For Graph for the following scenario and determine if a deadlock exists: T1 waits for T2, T2 waits for T3, T3 waits for T4, T4 waits for T2.
Q5 (True/False). Starvation and deadlock are the same problem.
A1. c) A cycle in the graph. A cycle (e.g., T1→T2→T3→T1) means each transaction is waiting for the next one in the cycle, creating a circular wait, which is the condition for deadlock.
A2. b) Ti is aborted. Under Cautious Waiting, if the lock holder (Tj) is itself blocked, we abort Ti to prevent potential deadlock formation. If Tj were NOT blocked, Ti would be allowed to wait.
A3. c) Deadlock Detection. When deadlocks are rare, it’s more efficient to let transactions run freely and check for deadlocks occasionally, rather than imposing prevention overhead on every single lock request.
A4. The Wait-For Graph would be: T1→T2, T2→T3, T3→T4, T4→T2. The edges T2→T3→T4→T2 form a cycle. Therefore, a deadlock EXISTS. T1 is NOT part of the deadlock cycle (it’s just waiting for T2), but T2, T3, and T4 are deadlocked. The system needs to abort one of T2, T3, or T4 to break the cycle.
A5. False. Deadlock involves a circular wait where transactions block each other. Starvation involves a single transaction being indefinitely delayed while others proceed. In deadlock, NO transaction in the cycle can proceed. In starvation, the system as a whole is still making progress β just not for the starved transaction.
9. Timestamp Ordering (TO) Concurrency Control
Now let’s move to a completely different approach. Unlike locking, timestamp ordering doesn’t use locks at all. Instead, each transaction gets a timestamp, and the system ensures that operations are executed in timestamp order.
9.1 What is a Timestamp?
A timestamp TS(T) is a unique identifier assigned to transaction T. It is a monotonically increasing number that indicates the age of the transaction: older transactions have smaller timestamps. Timestamps can be generated in two ways:
- Counter timestamp: A system counter that increments by 1 for each new transaction (1, 2, 3, …). Must be reset when it reaches its maximum value.
- Date timestamp: The current system clock value, ensuring no two timestamps are generated in the same clock tick.
9.2 The Timestamp Ordering Algorithm
The idea is beautifully simple: order transactions by their timestamps and ensure no conflicting operation violates this order. The only allowed serial schedule has transactions in timestamp order.
For each data item X, the system maintains two values:
- read_TS(X), or RTS(X): the largest timestamp among all transactions that have successfully read X
- write_TS(X), or WTS(X): the largest timestamp among all transactions that have successfully written X
Now when a transaction T with timestamp TS(T) tries to read or write X:
- read(X): if TS(T) < WTS(X), a younger transaction has already written X, so T is aborted and restarted with a new timestamp. Otherwise the read executes and RTS(X) is set to max(RTS(X), TS(T)).
- write(X): if TS(T) < RTS(X) or TS(T) < WTS(X), a younger transaction has already read or written X, so T is aborted and restarted. Otherwise the write executes and WTS(X) is set to TS(T).
Worked Example:
Three transactions: TS(T1)=5, TS(T2)=10, TS(T3)=15. Item X has RTS(X)=0, WTS(X)=0 initially.
- T1 reads X: 5 ≥ WTS(X)=0 → read executes, RTS(X)=5
- T2 writes X: 10 ≥ RTS(X)=5 and 10 ≥ WTS(X)=0 → write executes, WTS(X)=10
- T3 reads X: 15 ≥ WTS(X)=10 → read executes, RTS(X)=15
- T1 tries to write X: 5 < RTS(X)=15 → T1 is aborted and restarted with a new timestamp
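The basic TO checks translate directly into code. A sketch with hypothetical helper names (`to_read`, `to_write`), where `rts` and `wts` are dicts holding RTS(X) and WTS(X):

```python
def to_read(ts, item, rts, wts):
    """Basic timestamp-ordering check for read(X).
    Returns True if the read may execute (updating RTS),
    False if the transaction must be aborted and restarted."""
    if ts < wts[item]:
        return False  # a younger transaction already wrote X: abort
    rts[item] = max(rts[item], ts)
    return True

def to_write(ts, item, rts, wts):
    """Basic timestamp-ordering check for write(X)."""
    if ts < rts[item] or ts < wts[item]:
        return False  # a younger transaction already read or wrote X: abort
    wts[item] = ts
    return True
```

Running the Q3 scenario below through these functions: T4 (TS=25) is allowed to write Y and WTS(Y) becomes 25, after which T5 (TS=12) fails its read check.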
9.3 Advantages and Disadvantages
Advantages:
• Schedules are serializable (like 2PL)
• No waiting, no deadlocks! (transactions are aborted, not made to wait)
Disadvantages:
• May produce non-recoverable schedules (a transaction might read uncommitted data) → Solution: use the Strict TO algorithm
• Starvation is possible if the same transaction keeps getting aborted → Solution: assign a new (larger) timestamp when restarting an aborted transaction
Q1 (MCQ). In the Timestamp Ordering algorithm, if TS(T) < WTS(X) when T issues a read(X), what happens?
a) T reads X and RTS(X) is updated
b) T is forced to wait until WTS(X) decreases
c) T is aborted and restarted
d) The write that set WTS(X) is undone
Q2 (MCQ). What is the main reason timestamp ordering protocols do not produce deadlocks?
a) They use Conservative 2PL internally
b) Transactions are aborted rather than made to wait
c) They use a wait-for graph to detect cycles early
d) Timestamps prevent circular wait by ordering all transactions
Q3. Item Y has RTS(Y)=20 and WTS(Y)=15. Transaction T4 with TS(T4)=25 wants to write Y. Transaction T5 with TS(T5)=12 wants to read Y. What happens in each case?
A1. c) T is aborted and restarted. If TS(T) < WTS(X), it means a younger transaction (with a larger timestamp) has already written X. Allowing T to read X would produce a result inconsistent with the timestamp order. So T must be aborted.
A2. b) Transactions are aborted rather than made to wait. In timestamp ordering, there is no waiting mechanism. If a conflict violates timestamp order, the offending transaction is immediately aborted. Since no transaction ever waits for another, the circular wait condition for deadlock can never arise.
A3.
T4 write(Y): TS(T4)=25. Check: 25 ≥ RTS(Y)=20? YES. 25 ≥ WTS(Y)=15? YES. → Execute the write. Update WTS(Y)=25.
T5 read(Y): TS(T5)=12. Check: 12 ≥ WTS(Y)=25? NO (12 < 25). → Abort T5! A younger transaction (T4, TS=25) has already written Y. T5 (TS=12) is older and cannot read this value without violating timestamp order.
10. Multi-Version Concurrency Control (MVCC)
Have you ever used a document sharing tool like Google Docs where multiple people can edit simultaneously, and each person sees their own “version” of the document? MVCC works on a similar principle.
In MVCC, each user sees a snapshot of the database at a particular point in time. When a transaction writes a data item, the system does NOT overwrite the old data. Instead, it:
- Marks the old version as obsolete
- Creates a new version with the updated value
- Readers continue to see the version that was current when they started reading
The key advantage: A read operation is never rejected in MVCC. Readers never block writers, and writers never block readers. This is a huge improvement in concurrency for read-heavy workloads.
The key disadvantage: The system needs significantly more storage (both RAM and disk) to maintain multiple versions of data items. To prevent unlimited growth, a garbage collection process runs periodically to remove old versions that are no longer needed by any active transaction.
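A toy multi-version store illustrates the mechanism. This is a deliberately simplified sketch of my own (the class name `MVStore` and its methods are illustrative; real systems such as PostgreSQL are far more involved):

```python
class MVStore:
    """Minimal MVCC sketch: a write appends a new (commit_ts, value)
    version instead of overwriting, and a reader sees the newest
    version committed at or before its snapshot timestamp."""
    def __init__(self):
        self._versions = {}  # item -> list of (commit_ts, value), ts ascending

    def write(self, item, value, commit_ts):
        # The old version is never overwritten; a new one is appended.
        self._versions.setdefault(item, []).append((commit_ts, value))
        self._versions[item].sort(key=lambda v: v[0])

    def read(self, item, snapshot_ts):
        # A read is never rejected: scan for the latest visible version.
        result = None
        for ts, value in self._versions.get(item, []):
            if ts <= snapshot_ts:
                result = value
            else:
                break
        return result

    def vacuum(self, oldest_active_ts):
        # Garbage collection: drop versions no active snapshot can still
        # see, keeping the newest version visible to the oldest reader.
        for item, versions in self._versions.items():
            visible = [i for i, (ts, _) in enumerate(versions) if ts <= oldest_active_ts]
            start = visible[-1] if visible else 0
            self._versions[item] = versions[start:]
```

Note how a reader whose snapshot predates an update still sees the old value, while later readers see the new one, and `vacuum` bounds the storage growth.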
• MVCC = multiple versions of each data item
• Readers are never blocked/rejected
• Old versions are kept for readers who started earlier
• Disadvantage: High storage overhead, requires garbage collection
• Used widely in practice (PostgreSQL, MySQL InnoDB, Oracle)
Q1 (MCQ). Which statement about MVCC is TRUE?
a) Read operations can be rejected when a writer holds an exclusive lock
b) Only one version of each data item is maintained at any time
c) Read operations are never rejected
d) MVCC uses locks to synchronize reader and writer transactions
Q2. What problem does garbage collection solve in MVCC?
Q3 (True/False). In MVCC, when a transaction updates a data item, the old value is overwritten with the new value.
A1. c) Read operations are never rejected. This is the defining feature of MVCC β readers always find a suitable version to read, so they are never blocked or rejected. Option a describes locking behavior, not MVCC. Option b is wrong because MVCC maintains MULTIPLE versions. Option d is wrong because MVCC does NOT use locks for read-write synchronization.
A2. Garbage collection removes old, obsolete versions of data items that are no longer needed by any active transaction. Without garbage collection, the number of versions would grow indefinitely, consuming excessive storage (RAM and disk). The garbage collector identifies versions that are older than the oldest active transaction’s start time and safely removes them.
A3. False. In MVCC, the old data is NOT overwritten. Instead, the old version is marked as obsolete, and a new version is created with the updated value. This is the fundamental mechanism that allows readers to see consistent snapshots.
11. Optimistic Concurrency Control
Now let’s talk about a completely different philosophy. All the techniques we’ve discussed so far (locking, timestamp ordering) are pessimistic: they assume conflicts WILL happen and take precautions upfront. But what if most transactions actually DON’T conflict?
Think about a library. Most people browse different books. Only rarely do two people want the exact same book at the exact same time. Would it make sense to lock every bookshelf just in case? Of course not!
Optimistic Concurrency Control takes the opposite approach: it assumes conflicts are rare, so it does no checking during transaction execution. Instead, it checks for serializability only at the very end, at commit time.
The Three Phases
Let me explain each phase clearly:
Phase 1 – Read & Execution: The transaction reads values from the database (only committed data). But when it writes, it does NOT write to the actual database. Instead, all writes go to local copies (a workspace) kept within the transaction. No locks, no checks, no restrictions.
Phase 2 – Validation: Before committing, the system checks whether the transaction’s changes can be applied without violating serializability. It uses the transaction’s read-set (items it read) and write-set (items it wrote) to determine whether any conflicting transaction has committed since this transaction started.
Phase 3 – Write: If validation passes, the local changes are applied to the actual database. If validation fails, all changes are discarded and the transaction is restarted.
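The three phases can be sketched as a small Python class. This is a simplified model under an assumed validation rule (fail if any item in this transaction's read-set was committed by another transaction after this one started); the class, tick counter, and commit log are illustrative, not taken from a specific system.

```python
class OptimisticTxn:
    """Sketch of the three OCC phases with a simple validation rule.
    `db` is a shared dict, `commit_log` a shared list of (tick, item)."""

    def __init__(self, db, commit_log, start_tick):
        self.db, self.commit_log, self.start_tick = db, commit_log, start_tick
        self.read_set, self.workspace = set(), {}   # writes stay local

    # Phase 1: Read & Execution -- reads hit the DB, writes go to a workspace
    def read(self, item):
        self.read_set.add(item)
        return self.workspace.get(item, self.db.get(item))

    def write(self, item, value):
        self.workspace[item] = value                # NOT applied to db yet

    # Phase 2: Validation -- did a conflicting commit happen since we started?
    def validate(self, now_tick):
        return not any(item in self.read_set
                       for tick, item in self.commit_log
                       if self.start_tick < tick <= now_tick)

    # Phase 3: Write -- apply the workspace if valid, else discard (restart)
    def try_commit(self, now_tick):
        if not self.validate(now_tick):
            self.workspace.clear()                  # discard local changes
            return False                            # caller restarts the txn
        for item, value in self.workspace.items():
            self.db[item] = value
            self.commit_log.append((now_tick, item))
        return True
```

If another transaction commits an item we read after our start tick, `try_commit` returns `False` and the local changes are thrown away, exactly as Phase 3 prescribes.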
When is it Best?
• When conflicts between transactions are rare (most transactions validate successfully)
• When the system is read-heavy with few updates
When is it Worst?
• When conflicts are frequent: many transactions execute fully only to be discarded at validation, wasting resources
• Optimistic = “no checking during execution, check only at commit”
• Pessimistic = “check before every operation (locks or timestamps)”
• Three phases: Read & Execution → Validation → Write
• If validation fails → discard changes, restart transaction
• Called “optimistic” because it optimistically assumes conflicts won’t happen
Q1 (MCQ). In Optimistic Concurrency Control, when is serializability checked?
a) Before every read operation
b) Before every write operation
c) At the time of commit (validation phase)
d) Throughout the entire transaction execution
Q2 (MCQ). What happens to a transaction’s changes if the validation phase fails in Optimistic Concurrency Control?
a) Changes are applied anyway but marked as tentative
b) Changes are discarded and the transaction is restarted
c) Changes are applied to a backup copy of the database
d) The transaction enters a waiting state until conflicts resolve
Q3. Why is locking called “pessimistic” and this approach called “optimistic”? Explain with a real-world analogy.
Q4 (True/False). In the Read & Execution phase of Optimistic Concurrency Control, write operations are applied directly to the database.
A1. c) At the time of commit (validation phase). This is the defining characteristic of optimistic concurrency control: all checking is deferred to the very end, right before commit. During execution, no checks are performed at all.
A2. b) Changes are discarded and the transaction is restarted. When validation fails, it means serializability would be violated if the changes were applied. So the system throws away all the local changes and restarts the transaction from the beginning.
A3. Pessimistic locking is like a security guard who checks everyone’s ID at every door, assuming someone might try to enter without authorization. It’s cautious and adds overhead for every operation. Optimistic control is like letting everyone walk freely through the building and only checking IDs at the exit door. It assumes most people are fine and only catches problems at the end. The optimistic approach is faster when most people are legitimate (few conflicts) but wasteful when many people need to be turned back at the exit.
A4. False. In the Read & Execution phase, write operations are applied ONLY to local copies (workspace) within the transaction. They are NOT written to the actual database. The actual database is only updated in Phase 3 (Write phase), and only if the validation phase succeeds.
Quick Revision Notes β Exam Focus
1. Concurrency Control
• Goal: Guarantee serializability of schedules
• Main approach: Locking-based protocols
2. Lock
• Pessimistic locking: Assumes conflicts are likely, locks proactively
• Lock manager subsystem handles all locks automatically
3. Lock Granularity
| Level | What is Locked | Concurrency | Overhead | Used in Practice? |
|---|---|---|---|---|
| Database | Entire DB | Worst | Lowest | No |
| Table | Entire table | Very poor | Low | No |
| Page | Disk page/block | Good | Moderate | YES (most common) |
| Row | Single row | Very good | High | Yes |
| Field | Single attribute | Best | Highest | No (rarely) |
4. Binary Locks
• Enforces mutual exclusion: only one transaction at a time
• Too restrictive; not used in practice
• Eliminates the lost update problem
5. Shared/Exclusive Locks
• Shared (S-lock): multiple transactions may hold it simultaneously for reading
• Exclusive (X-lock): only ONE transaction, no other locks allowed
• Uses a no_of_reads counter for shared locks
• Upgrading: Shared → Exclusive (must be the only reader)
• Downgrading: Exclusive → Shared (always safe)
6. Two-Phase Locking (2PL)
Growing Phase: Acquire locks only
Lock Point: All locks acquired
Shrinking Phase: Release locks only
Guarantees: Serializability ✓
Does NOT guarantee: Deadlock freedom ✗, Strict schedule ✗, Recoverability ✗
7. 2PL Variations
| Variation | Lock Rule | Deadlock-free? | Practical? |
|---|---|---|---|
| Conservative | Pre-lock ALL items before execution | YES | No |
| Strict | Hold exclusive locks until commit/abort | No | YES (most used) |
| Rigorous | Hold ALL locks until commit/abort | No | Yes (easier) |
8. Deadlocks
Only possible when: At least one exclusive lock is involved
Four conditions: Mutual Exclusion, Hold & Wait, No Preemption, Circular Wait
Prevention:
• Conservative 2PL (impractical)
• Timestamps (older/younger decision)
• No-Waiting: abort immediately if lock unavailable
• Cautious Waiting: abort only if lock-holder is blocked
Detection: Wait-For Graph → cycle = deadlock → abort a victim
Timeouts: Abort if wait exceeds time limit (simple but imprecise)
9. Starvation
Solutions: FCFS queue, aging (increase priority over time), priority boost for frequently-aborted transactions
10. Timestamp Ordering
• May not be recoverable → use Strict TO
• Starvation possible → assign a new timestamp on restart
11. MVCC
• Readers see a snapshot from their start time
• Reads never rejected
• Old versions not overwritten; new versions created
• Disadvantage: high storage overhead, needs garbage collection
12. Optimistic Concurrency Control
1. Read & Execution: read committed data, write to local copies only
2. Validation: check serializability at commit time
3. Write: apply if valid, discard and restart if invalid
• Best when conflicts are rare
• Worst when conflicts are frequent (wasted work)
Common Mistakes in Exams
❌ Mistake 2: Confusing deadlock with starvation. Deadlock = circular wait; starvation = one transaction indefinitely delayed.
❌ Mistake 3: Saying “binary locks allow concurrent reads”. NO: they allow only one transaction at a time.
❌ Mistake 4: Forgetting that Strict 2PL holds only EXCLUSIVE locks till commit (shared locks can be released earlier).
❌ Mistake 5: Saying MVCC uses locks for readers. NO: reads are never rejected in MVCC.
❌ Mistake 6: Confusing pessimistic with optimistic. Pessimistic checks before operations; optimistic checks only at commit.
❌ Mistake 7: Saying deadlocks can occur with only shared locks. NO: at least one exclusive lock is required.
Challenge Exam Questions
These questions are designed to test your deep understanding. Try them before looking at the answers!
Section A: Multiple Choice Questions
Q1. A schedule has three transactions T1, T2, T3 following basic 2PL. T1 holds lock(A) and requests lock(B). T2 holds lock(B) and requests lock(C). T3 holds lock(C) and requests lock(A). Which of the following is TRUE?
a) The schedule is serializable and deadlock-free
b) The schedule is serializable but a deadlock exists
c) The schedule is not serializable because 2PL is violated
d) The schedule is not serializable but no deadlock exists
b) The schedule is serializable but a deadlock exists. Since all three transactions follow 2PL, the schedule is guaranteed to be serializable. However, T1→T2→T3→T1 forms a circular wait, creating a deadlock. This demonstrates that 2PL guarantees serializability but NOT deadlock freedom.
Q2. In a database system using page-level locking with lock promotion, a transaction has acquired row-level locks on 15 rows that happen to reside on the same disk page. What is the DBMS most likely to do?
a) Continue with row-level locks to maximize concurrency
b) Promote all row locks to a single page-level lock to reduce overhead
c) Abort the transaction for requesting too many locks
d) Request a table-level lock instead
b) Promote all row-level locks to a single page-level lock to reduce overhead. Modern DBMSs automatically upgrade from row-level to page-level locking when multiple locks are requested on the same page. This reduces the management overhead of maintaining many individual row locks while still providing reasonable concurrency.
Q3. Transaction T7 (TS=100) wants to write item X. RTS(X)=80 and WTS(X)=120. What does the basic Timestamp Ordering algorithm do?
a) Execute the write and set WTS(X)=100
b) Execute the write and set WTS(X)=max(100,120)=120
c) Abort T7 because TS(T7) < WTS(X)
d) Make T7 wait until WTS(X) changes
c) Abort T7 because TS(T7) < WTS(X). For a write operation, the condition is: if TS(T) < RTS(X) OR TS(T) < WTS(X), abort T. Here, TS(T7)=100 < WTS(X)=120, so T7 is aborted. A younger transaction (with TS=120) has already written X, so allowing T7 (older, TS=100) to write would violate the timestamp order.
Q4. Which of the following statements about Strict 2PL and Rigorous 2PL is CORRECT?
a) Both are deadlock-free protocols
b) Strict 2PL holds all locks until commit; Rigorous 2PL holds only exclusive locks until commit
c) Strict 2PL holds exclusive locks until commit; Rigorous 2PL holds all locks until commit
d) Rigorous 2PL is less restrictive than Strict 2PL
c) Strict 2PL holds exclusive locks until commit; Rigorous 2PL holds all locks until commit. This is the fundamental difference. Strict 2PL allows releasing shared locks before commit. Rigorous 2PL holds BOTH shared and exclusive locks until commit or abort, making it more restrictive but easier to implement.
Q5. A Wait-For Graph has the following edges: T1→T2, T2→T3, T3→T4, T4→T2. Which transactions are deadlocked?
a) All four transactions
b) Only T2, T3, and T4
c) Only T1 and T2
d) No deadlock exists
b) Only T2, T3, and T4. The cycle is T2→T3→T4→T2. These three transactions are in a circular wait and are deadlocked. T1 is NOT part of the cycle: T1 waits for T2, but nobody waits for T1. T1 is merely blocked (not deadlocked). If the deadlock is resolved by aborting one of T2, T3, or T4, T1 may eventually proceed.
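The detection rule (a cycle in the Wait-For Graph) is easy to automate. Below is a minimal sketch, assuming edges are given as (waiter, holder) pairs; a transaction is deadlocked exactly when the graph lets it reach itself.

```python
def deadlocked_transactions(edges):
    """Return the set of transactions lying on some Wait-For Graph cycle.
    Simple DFS reachability per node; fine for small exercise-sized graphs."""
    graph = {}
    for waiter, holder in edges:
        graph.setdefault(waiter, set()).add(holder)

    def reaches(start, target):
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for nxt in graph.get(node, ()):
                if nxt == target:
                    return True
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return False

    # A node is on a cycle iff it can reach itself.
    return {t for t in graph if reaches(t, t)}
```

Running it on the edges from Q5 reports T2, T3, and T4 as deadlocked, while T1 (blocked but outside the cycle) is not included.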
Q6. Under the Cautious Waiting Protocol, transaction Ti requests a write lock on item Q currently held in shared mode by Tj. Tj is NOT blocked. What happens?
a) Ti is aborted immediately
b) Ti is allowed to wait for Tj to release the lock
c) Ti preempts Tj’s shared lock
d) Ti’s request is ignored until Tj commits
b) Ti is allowed to wait for Tj to release the lock. The Cautious Waiting rule states: if the lock holder (Tj) is NOT blocked, then Ti is allowed to wait. Ti would only be aborted if Tj were itself blocked (waiting for some other lock), because that could lead to a deadlock chain.
Q7. In MVCC, why can a read operation never be rejected?
a) Because all reads acquire shared locks that are always compatible
b) Because the system always has a suitable version of the data item available for the reader
c) Because reads are applied to local copies only
d) Because MVCC uses optimistic validation before reads
b) Because the system always has a suitable version of the data item available for the reader. MVCC maintains multiple versions of each data item. When a transaction reads, the system provides the version that was current when that transaction started. Since old versions are preserved (not overwritten), there is always a version available for every active reader, so reads are never rejected.
Q8. Which concurrency control technique is most suitable for an environment with very few conflicts between transactions?
a) Strict 2PL with exclusive locking
b) Conservative 2PL
c) Optimistic Concurrency Control
d) No-Waiting Protocol
c) Optimistic Concurrency Control. Optimistic control assumes conflicts are rare and defers all checking to commit time. When conflicts are indeed few, most transactions pass validation and commit successfully, making this approach very efficient. Pessimistic techniques like 2PL would add unnecessary overhead (locking and waiting) when conflicts rarely occur.
Section B: True/False
Q9. In binary locking, a transaction can read a data item that is locked by another transaction as long as it only reads (not writes).
False. In binary locking, there is no distinction between read and write operations. A binary lock has only two states: locked or unlocked. If LOCK(X)=1, NO other transaction can access X for any purpose: neither reading nor writing. Only shared/exclusive locks allow concurrent reads.
Q10. The Rigorous 2PL protocol is easier to implement than Strict 2PL.
True. In Rigorous 2PL, ALL locks are held until commit/abort and then released together. The system doesn’t need to track which locks are shared vs exclusive for release timing; it just releases everything at once. In Strict 2PL, the system must differentiate: shared locks can be released early, but exclusive locks must be held until commit. This additional tracking makes Strict 2PL slightly more complex to implement.
Q11. Deadlock prevention protocols are recommended when the probability of deadlocks is low.
False. When the probability of deadlocks is low, deadlock detection is recommended. Prevention protocols add overhead to every lock request (checking timestamps, aborting transactions, etc.), which is wasteful if deadlocks rarely occur. Detection allows transactions to run freely and only checks periodically, which is more efficient when conflicts are rare.
Q12. In timestamp ordering, if a transaction is aborted, it should be restarted with the same timestamp to maintain fairness.
False. If an aborted transaction is restarted with the same (old) timestamp, it will likely be aborted again because younger transactions have since updated the data items. The correct approach is to assign a new, larger timestamp to the restarted transaction so it is treated as a “new” younger transaction. This also helps prevent starvation.
Section C: Fill in the Blanks
Q13. The ________ level of lock granularity is currently the most frequently used in multiuser DBMSs.
Page level. Page-level locking locks an entire disk block, allowing different transactions to access different pages of the same table. It offers the best balance between concurrency and management overhead in practice.
Q14. In shared/exclusive locking, the ________ variable tracks the number of transactions currently holding a shared lock on a data item.
no_of_reads (or number of reads counter). This counter increments each time a new transaction acquires a shared lock and decrements when a shared lock is released. The item transitions to “unlocked” only when this counter reaches zero.
Q15. Converting a lock from shared to exclusive is called ________, and the transaction must be the ________ holder of the shared lock to do this.
Upgrading; only. Upgrading converts a shared (read) lock to an exclusive (write) lock. This is only possible if the requesting transaction is the sole holder of the shared lock. If other transactions are also reading the item, the upgrade request must wait until they release their shared locks.
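The no_of_reads counter and the upgrade rule from Q14 and Q15 can be sketched as a tiny lock-table entry. This is a simplified model (no wait queue: a request that cannot be granted simply returns False, and only the paths needed for upgrade/downgrade are shown); the class and method names are illustrative.

```python
class LockEntry:
    """Sketch of one shared/exclusive lock-table entry."""

    def __init__(self):
        self.state = "unlocked"        # "unlocked" | "shared" | "exclusive"
        self.no_of_reads = 0           # number of current shared-lock holders
        self.holders = set()

    def lock_shared(self, txn):
        if self.state == "exclusive":
            return False               # incompatible: would have to wait
        self.state = "shared"
        self.no_of_reads += 1
        self.holders.add(txn)
        return True

    def upgrade(self, txn):
        # Shared -> exclusive only if txn is the SOLE shared-lock holder.
        if self.state == "shared" and self.holders == {txn}:
            self.state = "exclusive"
            self.no_of_reads = 0
            return True
        return False

    def downgrade(self, txn):
        # Exclusive -> shared is always safe.
        if self.state == "exclusive" and txn in self.holders:
            self.state = "shared"
            self.no_of_reads = 1
            return True
        return False

    def unlock(self, txn):
        self.holders.discard(txn)
        if self.state == "shared":
            self.no_of_reads -= 1
            if self.no_of_reads == 0:  # counter reached zero
                self.state = "unlocked"
        else:
            self.state = "unlocked"
```

With two readers on the same item, an upgrade attempt fails until the other reader releases its shared lock, matching the rule in the answer above.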
Q16. A ________ in the Wait-For Graph indicates that a deadlock exists.
Cycle. A directed cycle (e.g., T1→T2→T3→T1) means each transaction in the cycle is waiting for the next one, creating a circular wait: the essential condition for deadlock.
Q17. In Optimistic Concurrency Control, during the Read & Execution phase, all write operations are applied to ________ copies only.
Local copies (also called workspace copies or private copies). The actual database is NOT modified during this phase. Changes are stored locally within the transaction and are only applied to the database if the validation phase succeeds.
Section D: Short Answer Questions
Q18. Explain why deadlocks cannot occur among transactions that only hold shared locks. Use a compatibility argument.
Shared locks are compatible with each other. If T1 holds a shared lock on X and T2 requests a shared lock on X, T2’s request is granted immediately; T2 does not need to wait. Since no transaction ever waits for a shared lock held by another, there can be no wait-for relationship, and therefore no circular wait. Deadlock requires at least one transaction to request an exclusive lock on an item already locked by another transaction, because only exclusive locks are incompatible with other locks (both shared and exclusive).
Q19. Compare the No-Waiting Protocol and the Cautious Waiting Protocol. Under what specific condition does Cautious Waiting abort a transaction that No-Waiting would also abort?
No-Waiting Protocol: If a transaction cannot obtain a lock, it is ALWAYS aborted immediately, regardless of the situation. No checking, no waiting.
Cautious Waiting Protocol: If Ti cannot lock X (held by Tj with a conflicting lock), Ti is allowed to wait ONLY if Tj is NOT blocked. If Tj IS blocked, Ti is aborted.
Common case: Both protocols abort Ti when Tj (the lock holder) is itself blocked. In this situation, No-Waiting aborts Ti without checking (blindly), while Cautious Waiting specifically identifies that Tj being blocked could lead to a deadlock chain and aborts Ti as a preventive measure. The key difference is that Cautious Waiting allows Ti to wait when Tj is NOT blocked (a safe situation), whereas No-Waiting never allows waiting at all.
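The comparison above reduces to a small decision table. Here is a sketch, simplified to a single lock holder Tj; the function name and return strings are illustrative.

```python
def lock_request_outcome(protocol, lock_available, holder_blocked=False):
    """Outcome when Ti requests a lock possibly held by Tj,
    under the two deadlock-prevention protocols compared above."""
    if lock_available:
        return "grant"
    if protocol == "no-waiting":
        return "abort"                  # never waits, no checking at all
    if protocol == "cautious-waiting":
        # Ti may wait only while the holder Tj is itself NOT blocked.
        return "abort" if holder_blocked else "wait"
    raise ValueError(f"unknown protocol: {protocol}")
```

The only case where the two protocols differ is an unavailable lock whose holder is not blocked: No-Waiting aborts, Cautious Waiting lets Ti wait.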
Q20. A database administrator observes that a particular transaction is repeatedly being chosen as the victim in the deadlock detection’s victim selection algorithm. What problem does this indicate, and what solution would you recommend?
This indicates starvation. The transaction cannot make progress because it keeps getting aborted and restarted, while other transactions continue normally. This happens because the victim selection algorithm consistently picks the same transaction (perhaps because it’s the youngest, or has done the least work).
Solutions:
1. Modify the victim selection: Give higher priority to transactions that have been aborted multiple times. Each time a transaction is chosen as a victim, increase its priority for future selections.
2. Use aging: Increase the transaction’s priority the longer it has been waiting or the more times it has been restarted, until it eventually becomes the highest priority and is allowed to complete.
3. Track abort history: The algorithm can explicitly avoid selecting a transaction that has been a victim within the last N attempts.
Q21. Given the following transaction schedule, determine whether each transaction follows 2PL. Explain your reasoning for each.
T1: Follows 2PL ✓. All locks [lock(A), lock(B)] come before the first unlock [unlock(A)]. After unlock(A), only unlock(B) follows (no new locks). Clear two-phase structure.
T2: Does NOT follow 2PL ✗. The first unlock is unlock(B). After unlock(B), a new lock [lock(C)] is acquired. This violates the 2PL rule that no new locks can be acquired after the first unlock.
T3: Follows 2PL ✓. All locks [lock(A), lock(B), lock(C)] come before the first unlock [unlock(C)]. After unlock(C), only unlock(B) and unlock(A) follow (no new locks). This is actually a Rigorous 2PL pattern since all locks are held until the end.
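The 2PL check itself is mechanical and can be expressed as code. A sketch where each transaction is a list of (action, item) operations; the example sequences below are reconstructed from the answers above (the original schedule is not reproduced in this chapter excerpt).

```python
def follows_2pl(ops):
    """True iff a sequence of ("lock"|"unlock", item) operations is two-phase:
    once the first unlock happens, no further lock may be acquired."""
    shrinking = False
    for action, _item in ops:
        if action == "unlock":
            shrinking = True           # shrinking phase has begun
        elif shrinking:                # a lock after the first unlock
            return False
    return True

# Sequences reconstructed from the answers above (illustrative).
t1_ops = [("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]
t2_ops = [("lock", "A"), ("lock", "B"), ("unlock", "B"), ("lock", "C"),
          ("unlock", "A"), ("unlock", "C")]
t3_ops = [("lock", "A"), ("lock", "B"), ("lock", "C"), ("unlock", "C"),
          ("unlock", "B"), ("unlock", "A")]
```

Applying the checker reproduces the verdicts: T1 and T3 pass, T2 fails because lock(C) comes after unlock(B).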
Q22. Explain how garbage collection works in MVCC and why it is necessary. What would happen without it?
In MVCC, every write operation creates a new version of a data item while the old version is retained for readers who started earlier. Over time, the number of versions grows continuously. Garbage collection is a background process that periodically identifies and removes old versions that are no longer needed.
A version can be safely removed when no active transaction needs it β specifically, when the version is older than the start time of the oldest active transaction. At that point, no transaction could possibly need to read that old version.
Without garbage collection:
• Storage (RAM and disk) usage would grow indefinitely
• Query performance would degrade because the system would need to scan through increasingly many versions
• Eventually, the system could run out of storage, causing crashes or forced transaction aborts
• Index structures would become bloated and inefficient
Q23. Consider a system using basic Timestamp Ordering. Initially, RTS(X)=0 and WTS(X)=0. The following operations occur in order: T1(TS=10) writes X, T2(TS=20) reads X, T3(TS=5) reads X, T4(TS=15) writes X. Show the state of RTS(X) and WTS(X) after each operation, and indicate which operations succeed and which fail.
Initial: RTS(X)=0, WTS(X)=0
T1 writes X (TS=10): Check: 10 ≥ RTS(X)=0 AND 10 ≥ WTS(X)=0 → Execute ✓. WTS(X)=10. State: RTS=0, WTS=10
T2 reads X (TS=20): Check: 20 ≥ WTS(X)=10 → Execute ✓. RTS(X)=max(0,20)=20. State: RTS=20, WTS=10
T3 reads X (TS=5): Check: 5 ≥ WTS(X)=10? NO (5<10) → Abort T3 ✗. A younger transaction (T1, TS=10) already wrote X. State unchanged: RTS=20, WTS=10
T4 writes X (TS=15): Check: 15 ≥ RTS(X)=20? NO (15<20) → Abort T4 ✗. A younger transaction (T2, TS=20) already read X. State unchanged: RTS=20, WTS=10
Final state: RTS(X)=20, WTS(X)=10. Only T1 and T2 succeeded. T3 and T4 were aborted because younger transactions had already accessed X.
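The trace above can be reproduced mechanically. Here is a sketch of the basic Timestamp Ordering rules for a single item X, using exactly the read/write conditions applied in the worked answer.

```python
def run_basic_to(operations):
    """Simulate basic TO on one item X. `operations` is a list of
    (txn, "read"|"write", ts). Read needs TS >= WTS(X); write needs
    TS >= RTS(X) and TS >= WTS(X); otherwise the transaction aborts."""
    rts, wts = 0, 0
    outcomes = []
    for txn, op, ts in operations:
        if op == "read":
            if ts >= wts:
                rts = max(rts, ts)             # record the youngest reader
                outcomes.append((txn, "execute"))
            else:
                outcomes.append((txn, "abort"))  # a younger txn already wrote X
        else:  # write
            if ts >= rts and ts >= wts:
                wts = ts
                outcomes.append((txn, "execute"))
            else:
                outcomes.append((txn, "abort"))  # a younger txn already read/wrote X
    return outcomes, rts, wts

# The Q23 trace: initially RTS(X) = WTS(X) = 0.
demo_ops = [("T1", "write", 10), ("T2", "read", 20),
            ("T3", "read", 5), ("T4", "write", 15)]
```

Running `run_basic_to(demo_ops)` yields execute/execute/abort/abort with final RTS(X)=20 and WTS(X)=10, matching the worked answer.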
Q24. A student argues: “Since Conservative 2PL is deadlock-free and guarantees serializability, it should be the preferred protocol in all database systems.” Refute this argument with at least two specific reasons.
The student’s argument is wrong for the following reasons:
Reason 1: Impracticality. Conservative 2PL requires pre-declaring the complete read-set and write-set of each transaction BEFORE it begins execution. In real database applications, this is usually impossible. Which rows a transaction needs to read often depends on the results of intermediate queries, search conditions evaluated at runtime, and user input received during execution. You cannot always predict what data you’ll need before you start.
Reason 2: Reduced concurrency despite being deadlock-free. Conservative 2PL requires locking ALL items upfront. If a transaction needs 100 items, it must lock all 100 before reading even the first one. This means items are locked for longer than necessary (they’re locked even before the transaction actually uses them), which reduces concurrency compared to protocols like Strict 2PL where items are locked only when needed.
Reason 3 (bonus): Deadlock prevention comes at the cost of potentially MORE aborts/restarts. If even one item in the pre-declared set is unavailable, the transaction locks NOTHING and waits. In a busy system, this could lead to many transactions repeatedly failing to acquire all their locks, causing delays. Other approaches like deadlock detection are more efficient when deadlocks are rare.