You are on page 1of 15

INTRODUCTION

Concurrency control is a very important issue in distributed database system


design. This is because concurrency allows many transactions to be executing
simultaneously such that collection of manipulated data item is left in a consistent
state. Database concurrency control permits users to access a database in a multi
programmed fashion while preserving the illusion that user is executing alone on a
dedicated system. Besides, it produces the same effect and has the same output on
the database as some serial execution of the same transaction.
Concurrency control in distributed system is achieved by a program which is called
scheduler. Scheduler help to order the operations of transaction in such a way that
the resulting logs is serializable. There have two type of the concurrency control
that are locking approach and non-locking approach. In term of locking approach,
two-phase lock is widely used and purpose for centralized or distributed database
system. Before distributed database systems accessing some part of database, it
must adopt a locking mechanism such as each transaction has to obtain a lock.
After that part is locked by other transaction, the access request will be block and
the transaction who making reauest has to wait. Two-phase locking will cause the
deadlock problem. This locking induces high communication cost because of the
deadlock problem.
Whereas in term of non-locking approach which is divided into two type which are
Timestamp ordering (TO) and Serialization Graph Testing (SGT). Timestamp
ordering utilizes unique transaction timestamp for determining the serialization
order. It divided into three different modes that are basic, conservative and
optimistic modes. For Serialization Graph Testing, it has been most attractive log
class until 1987 because it is known as the largest known class of serializable logs.
There are some limitations that can found in those two approaches. However, there
are some solution that can used to solve the problem and will be discuss more
detail in the below.
WHAT IS DATABASE CONCURRENCY
Database concurrency is the ability of a database to allow multiple users to affect
multiple transactions. This is one of the main properties that separates a database
from other forms of data storage, like spreadsheets. Other users can read the file,
but may not edit data.

DATABASE TRANSACTION AND THE ACID RULES


The concept of a database transaction (or atomic transaction) has evolved in order
to enable both a well understood database system behavior in a faulty environment
where crashes can happen any time, and recovery from a crash to a well
understood database state. A database transaction is a unit of work, typically
encapsulating a number of operations over a database (e.g., reading a database
object, writing, acquiring lock, etc.), an abstraction supported in database and also
other systems. Each transaction has well defined boundaries in terms of which
program/code executions are included in that transaction (determined by the
transaction's programmer via special transaction commands). Every database
transaction obeys the following rules (by support in the database system; i.e., a
database system is designed to guarantee them for the transactions it runs):

 Atomicity - Either the effects of all or none of its operations remain ("all or
nothing" semantics) when a transaction is completed
(committed or aborted respectively). In other words, to the outside world a
committed transaction appears (by its effects on the database) to be indivisible
(atomic), and an aborted transaction does not affect the database at all. Either
all the operations are done or none of them are.
 Consistency - Every transaction must leave the database in a consistent
(correct) state, i.e., maintain the predetermined integrity rules of the database
(constraints upon and among the database's objects). A transaction must
transform a database from one consistent state to another consistent state
(however, it is the responsibility of the transaction's programmer to make sure
that the transaction itself is correct, i.e., performs correctly what it intends to
perform (from the application's point of view) while the predefined integrity
rules are enforced by the DBMS). Thus since a database can be normally
changed only by transactions, all the database's states are consistent.
 Isolation - Transactions cannot interfere with each other (as an end result of
their executions). Moreover, usually (depending on concurrency control
method) the effects of an incomplete transaction are not even visible to another
transaction. Providing isolation is the main goal of concurrency control.
 Durability - Effects of successful (committed) transactions must persist
through crashes (typically by recording the transaction's effects and its commit
event in a non-volatile memory).
The concept of atomic transaction has been extended during the years to what has
become Business transactions which actually implement types of Workflow and
are not atomic. However also such enhanced transactions typically utilize atomic
transactions as components.

WHY IS CONCURRENCY CONTROL NEEDED


If transactions are executed serially, i.e., sequentially with no overlap in time, no
transaction concurrency exists. However, if concurrent transactions with
interleaving operations are allowed in an uncontrolled manner, some unexpected,
undesirable results may occur, such as:
1. The lost update problem: A second transaction writes a second value of a
data-item (datum) on top of a first value written by a first concurrent
transaction, and the first value is lost to other transactions running
concurrently which need, by their precedence, to read the first value. The
transactions that have read the wrong value end with incorrect results.
2. The dirty read problem: Transactions read a value written by a transaction
that has been later aborted. This value disappears from the database upon
abort, and should not have been read by any transaction ("dirty read"). The
reading transactions end with incorrect results.
3. The incorrect summary problem: While one transaction takes a summary
over the values of all the instances of a repeated data-item, a second
transaction updates some instances of that data-item. The resulting summary
does not reflect a correct result for any (usually needed for correctness)
precedence order between the two transactions (if one is executed before the
other), but rather some random result, depending on the timing of the
updates, and whether certain update results have been included in the
summary or not.
Most high-performance transactional systems need to run transactions concurrently
to meet their performance requirements. Thus, without concurrency control such
systems can neither provide correct results nor maintain their databases
consistently.

TYPE AND THE ISSUE OF CONCURRENCY CONTROL


 Two -phase locking
Two -phase locking is a widely used concurrency control technique to
synchronize accesses to shared item. Each data item has a lock associated
with it. The scheduler will first examine the associated lock before a
transaction (T1) may access a data item. If there is no transaction holds the
lock then the scheduler will obtains the lock on behalf of T1. When another
transaction T2 hold the lock, the T1 has to wait until T2 to gives up the lock.
The scheduler will give the T1 the lock after the T2 release the lock.
Scheduler must ensure that there is only one transaction can hold the lock at
a time and only one transaction can access data item at a time. They are two
types of locks associate with data item that is read locks and write locks.
Figure 1 below show that the transaction T1 and T2 follow the two-phase
locking protocol.
Figure 1: Transaction T1 and T2 with 2-phase locking protocol
However, there is an important and unfortunate property of two-phase
locking schedulers is that they are subject to deadlocks. For instances, the
classic deadlock situation in which neither of two processes can proceed that
is one must release a resource and the other one needs to proceed. Besides
that, deadlock also arises when the transaction try to strengthen read locks to
write locks. To ensure that there is no transaction is blocked forever, the
scheduler needs a strategy for detecting the deadlock.
In addition, two-phase locking is suffer from sensible delays due to a node
needs to send message to all nodes and must receive acknowledgement from
all nodes to discover a deadlock. The messages will exchange between
nodes to get decision of transaction’s commit. The whole system will stop
since not discover the deadlock immediately and a natural of distributed
system make deadlock treatment very difficult. Moreover, the message that
due to commit and deadlock treatment make high traffic on network which
is makes need of special kind of networking.
Figure 2: A spatial schedules of T1 and T2 in state of deadlock
T1 wait for T2 to release read lock on y2
T3 wait for T1 to release read lock on x1
T2 wait for T3 to release read lock on z3
Figure 3: A wait-for graph for the spatial schedule of Figure 2
There are two type of graph that show as above which is a spatial schedule
of T1 and T2 in the state of deadlock. Another graph show that the wait for
graph for the spatial schedule of the figure 2 above. Each node of the graph
represent the transaction, the edges represent the “waiting for” relationship.
Furthermore, concurrency control also deals with starvation. The starvation
occurs when a particular transaction consistently waits or restarted and never
gets a chance to proceed further. In a deadlock resolution it is possible the
same transaction may consistently be selected as victim and rolled-back.
This type of limitation is inherent in all priority based scheduling
mechanism. The wound-wait scheme a younger transaction may always be
aborted by a long running older transaction which may create starvation.
Besides that, the two phases in locking method also may cause to dirty read
and cascading abort. The problem of dirty read is arises when assume T1
and T2 are executed interleave. For this condition, the T1 will get the
exclusive locks for record. Supposedly, T1 release locks immediately after
doing updates while T2 acquires that locks and does its update. If the T1 fail
before it commits due to certain of reason, then it will return value to
original value. T2 will continue to its execution with uncommitted data and
this will cause an error on the system result. On the other hand, the problem
of cascading abort arises when if a transaction aborts. There may have some
other transaction already used data from an object that the aborted
transaction modified it and unlocked it. If this happen, any such transaction
will also have to be aborts.
 Timestamp ordering technique
Timestamp is a monotonically increasing variable indicating the age of an
operation or a transaction. The larger of the timestamp values indicate that
more recent event or operation. Timestamp ordering technique concurrency
mechanism were considered suitable for distributed database systems since
transaction to be rolled back can be determined locally at each site. It
involves a unique transaction timestamps in place of conventional locks.
This technique is based on the idea that an operation is allowed to proceed
only if all the conflicting operations of older transaction have already been
processed. For instance, when a transaction accesses an item the system will
check whether the transaction is older than the last one which is accesses the
item. If this is the case that transaction proceeds, otherwise ordering is
violated and the transaction is aborted. Besides that, the serializability of the
transaction is preserved and requires knowledge about which transaction are
younger than the others.
In the implementation of a distributed timing system, each site in the
distributed system contains a local clock or a counter. This clock assumed to
tick at least once between any two events. This event within the site is
totally ordered. If the total ordering of event at different sites, it need to
assigned a unique number to each site and the number is concatenated as
least significant bits to the current value of local block. Moreover, each
message contains the information about the local time of their site of origin
at which the message is sent. There are many concurrency control method on
timestamp ordering technique. One of the method is basic timestamp
ordering (BTO) which is for each data item x, the largest timestamp of any
write operation on data item x and the largest timestamp of any read
operation on data item x which are denoted by R_TS (x) and W_TS (x)
respectively.
The basic timestamp ordering technique is easy to distribute and the transaction to
be aborted will immediately be recognized when the operation are being
scheduled. This type of method will not deal with deadlock problem because the
locks are not used and operations are blocked. Hence, an atomic commitment
mechanism is necessary to provide the reliability.
IMPROVEMENT OF ISSUE CONCURRENCY CONTROL
1. Deadlock resolution
The preceding implementation of two-phase locking makes the transaction
to wait for unavailable locks. When the waiting is uncontrolled will cause a
deadlock. The situation of deadlock can be characterized by wait-for graph,
that graph indicates which transaction is waiting for which other transaction.
There are three type of general technique are available for deadlock
resolution that are deadlock detection, deadlock prevention and timeout
strategies.
 Deadlock Detection
Deadlock detection is a transaction wait for each other in an uncontrolled
manner and is only aborted if a deadlock occurs. It can be detected by
explicitly constructing a wait-for graph and searching it for cycles.
Victim which is one transaction on the cycle will abort when cycle is
found and thereby breaking the deadlock. There are a few victim
selection criteria will be consider. First is the Current Blocker, this
current blocker will pick the transaction that blocked the most recently.
Secondly, Random Blocker which is a process of picks a transaction at
random from the participants in the deadlock cycle. Third is the Min
Locks that is to pick a transaction that is holding the fewest locks. Fourth
is the Youngest which is picked the transaction with the most recent
initial startup time. Lastly is the Min Work and responsible to pick the
transaction that has consumed the least amount of physical resources
(CPU + I/O time) since it first began running.
In order to minimize the cost of restarting the victim, usually the victim
is based on the amount of resource that use by each of transaction on the
cycle. Each of the two-phase locking’s scheduler can be easily construct
the waits-for graph based on the waits-for relationships local to that
scheduler. But this is not efficient to characterize all deadlocks in the
distributed system. To increase the efficiency, more “global” wait-for
graphs have to combine with local waits-for graph. For the centralized
two-phase locking will not have this type of problem since there only
consist of one scheduler.
In order to construct global waits-for graph, there consist of two
techniques which are hierarchical and centralized deadlock detection.
For the hierarchical deadlock detection, the database sites are organized
into hierarchy with deadlock detector to each node of the hierarchy.
Deadlock divided into many sites. Deadlock involve a single site are
detected at that. Whereas deadlock involve more than two sites of the
same region detected by the regional deadlock detector.
On the other hand, one site is designated the deadlock detector in the
centralized deadlock detection. Every few minutes, each scheduler has to
send its local wait-for graph to the deadlock detector. Then the deadlock
detector combines the local graph into a system wide waits-for graph by
constructing the union of the local graph. Although both of the technique
mention above differ in detail but it involve periodic transmission of
local waits-for information to one or more deadlock detector sites.
 Deadlock Prevention
Deadlock prevention is a “cautious” scheme in which transaction is
restarted when the system is afraid that deadlock to be occurring. In the
process of deadlock prevention, the scheduler will test the requesting
transaction which is name (T1) and the transaction that currently owns
by the lock (T2) when a lock request is denied. When T1 and T2 pass the
test, T1 is permitted to wait for T2 as usual. Otherwise, one of the two is
aborted. There are a few prevention algorithms that are Wound-Wait,
Wait-Die, Immediate-Restart and Running Priority. In the Wait-Die
prevention algorithm, if a lock request from transaction T1 leads to a
conflict with another transaction T2, then T1 started time earlier than T2
and block the T1 otherwise will restarted T1. The deadlock prevention
will know as nonpreemptive if T1 is restarted. By this technique
deadlock are impossible since there is only one transaction can be
blocked by a younger transaction.
In the Wound-Wait algorithm, if T1 started running before T2 then
restarted T2 otherwise blocked T1. This type of algorithm is known as
preemptive which is older transaction run though the system by killing
any one that they conflict with and continues waiting onlt for older
conflicting transaction. The scheduler must ensure that T1 wait for T2 so
that deadlock cannot occur. There is a better approach can be assign
priorities to transaction and to test priorities to decide whether T1 can
wait for T2. If T1 has the lower priority than T2 then T1 will wait for T2.
When they have same priorities, T1 cannot wait for T2 or vice versa.
This test will prevents the deadlock to occur since every edge in the
waits-for graph T1 has a lower priority than T2. In addition, the
Immediate-Restart algorithm also will overcome the deadlock by simply
restart T1 since there is no transaction is ever blocked.
Besides that, preordering of resources is a type of deadlock avoidance
technique that used to avoid restarts altogether. It requires predeclaration
of locks which mean that each transaction obtains all its locks before
execution. The priority of transaction is the number of the highest
number lock on it and the data item are numbered and each transaction
request locks one at a time in numeric order. Although this techniques
can avoid the deadlock occur but it forces locks to be obtained
sequentially which is tends to increase response time.
 Timeout strategic
In the timeout strategic, a transaction whose lock request cannot be
granted is simply placed in the blocked queue. When the wait time
exceeds some threshold value then the transaction will restarted.
Timeout will restart transaction that involves in deadlocks in the
detection strategies whereas in prevention strategic it may also restarted
some transaction that are not involved in any cycle.
2. Strict 2 Phase Locking (S2 PL)
In order to prevent cascading problem and dirty read, Strict 2 Phase Locking
mechanism is apply. The executed transaction holds all its lock to the every
end until it is committed or aborts. For the dirty read, the executed
transaction does not release any of exclusive locks until it commits or aborts.
S2PL has important role in the two phase locking problems. In first time
transaction it need a lock on a data item, acquires it and all locks of a
transaction released together when the transaction terminates and has a few
resource wasting. Figure 4 below show the S2PL mechanism.
Figure 4: S2PL mechanism
3. Ordering by Serialization Number (OSN) method
There exists a new method for concurrency control in distributed system
which increases the level of concurrent execution of transaction and known
as ordering by serialization number (OSN). The deadlock is prevented by
this method since it works in the certifier mode and uses time interval
technique in conjunction. In order to provide serializability, it combines with
time interval technique with short term locks. Scheduler is distributed and
the standard transaction execution policy is assumed. The read and write
operations are issued continuously during transaction execution. The write
operations are performed on the private copies of the data that issue by the
transaction. The transaction is certified and its operation will applied to
database when a validation test is passed by the transaction at the end of
transaction. Otherwise, the transaction will restart if the test validation is not
passed.
To find the serialization number for a transaction, the largest serialization
number of certified transaction will read item x. The largest serialization
number of certified transaction which have written item x will record along
the data item x which is known as RSN(x) and WSN(x). There consist of
four types of short term locks in this OSN method that are R-lock, W-lock,
Certified Read lock, Certified Write lock. The R-lock and W-lock normally
used when read or write the data item. It responsible to protect the data item
forms two conflicting operations of concurrent transaction. Whereas
Certified Read lock (CR-lock) and Certified Write lock (CW-lock) are used
during the validation test while a transaction is searching for a valid time
interval. These lock are held until either the transaction’s operation are
applied to the database or the transaction is aborted.
The deadlock of short term locks is prevented by a few steps. Firstly, R-lock
or W-lock is obtained at any order. Since transaction do not require any
other lock before releasing a particular R-lock or W-lock deadlock does not
appear. Secondly, employing the preordering for deadlock avoidance can
prevent the deadlock of the CR-lock and CW-lock respectively. Although
this method requires the access set of transaction to known in advance but
this disadvantage eliminated in the OSN method. This is because CR-lock
and CW-lock requires by the time of certification. Besides that, if a
transaction requires both CR-lock and CW-lock on data item x the system
will grants them both when both of these locks are requires at once.

CONCLUSION
In this study, the performance of concurrency control protocol on top of a
comprehensively modeled high-speed network is evaluated. It found that there
exist of some issue within the concurrency protocol such as deadlock and
starvation in the two-phase locking. The weaknesses that occur in another
concurrency protocol that is timestamp ordering.
By the way, there are also discussing about the effective way to make
improvement of the concurrency control by using three types of deadlock
resolution to solving the deadlock problem. The deadlock resolution included
deadlock detection, deadlock prevention and timeout strategic. Furthermore, a
precise definition of deadlock in term of existence interval of an edge is given and
these help provide a better understanding of deadlock detection is distributed
system. Besides, this paper also discuss the complete specification of a secure
version of two phase locking protocol. The interaction between the protocol and
the standard failure recovery procedures were discuss and modification to the
commit and restart procedures were proposed.
REFERENCES
Abdou R. and Hany M. Harb, N (2004)” Two Phase Locking Concurrency
Control in Distributed Database with N-Tier architecture”, 2004.

Wu, S. and Pan, D. (2009) “FUTURE TREND ON CONCURRENCY CONTROL


IN DISTRIBUTED DATABASES SYSTEM DESIGN“.

Sang H. and Rasikan, D., (2014) “Design and Analysis of A Secure Two-Phase
Locking Protocol”.

Shapour, J, Fariborz M. and Mehdi A., (2007) “Improving Strict 2 Phase Locking
(S2PL) in Transactions Concurrency Control”.

Philip A. (2018) BERNSTEIN AND NATHAN GOODMAN, “Concurrency


Control in Distributed Database Systems”,

Rakesh A, MICHAEL J., MEMBER, I., AND LAWRENCE W., (2016) ” The
Performance of Alternative Strategies for Dealing with Deadlocks in
Database Management Systems”,

Ugur H. and Asuman D, (2010) “Concurrency Control in Distributed Databases


Through Time Intervals and Short-Term Locks”.

You might also like