Detecting and Resolving Distributed Locking Conflicts (Doc ID 118219.1)

PURPOSE
-------

Oracle's Distributed Database Option provides the ability to perform
transactions spanning multiple networked databases.

As in standalone databases, transactions may block each other as they
try to exclusively access database table rows, causing Locking Conflicts.
In standalone databases, tools and scripts exist that can help the DBA
detect and resolve such conflicts.

In distributed databases, the same tools do not always help in the detection
of such conflicts, as they operate on locking information obtained only from
the database on which they are run. This information cannot be used to detect
distributed locking conflicts unless it is combined with similar information
obtained from all databases in the distributed environment.
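
As an illustration of the limitation just described, the following query is
a minimal sketch of the kind of local check such tools perform. It is not
taken from Lock Manager or utllockt.sql; it assumes only the standard V$LOCK
columns (SID, TYPE, ID1, ID2, LMODE, REQUEST) and SELECT access to V$LOCK.
It pairs waiters with holders of the same lock, but only for sessions known
to the database on which it is run.

```sql
-- Sketch: waiter/blocker pairs visible to the LOCAL database only.
-- A waiter whose blocker (or whose own blocked session) lives in a
-- remote database does not appear in this result.
SELECT w.sid  AS waiting_sid,
       b.sid  AS blocking_sid,
       w.type AS lock_type
  FROM v$lock w,
       v$lock b
 WHERE w.request > 0          -- waiter has an outstanding lock request
   AND b.lmode   > 0          -- blocker currently holds the lock
   AND b.type    = w.type     -- same lock type ...
   AND b.id1     = w.id1      -- ... and same lock identifiers
   AND b.id2     = w.id2
   AND b.sid    <> w.sid;
```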

Such distributed locking conflicts are automatically resolved by the RDBMS
by rolling back any blocked distributed transactions that have been waiting
for a time longer than DISTRIBUTED_LOCK_TIMEOUT seconds (default 60 sec).
Unfortunately there is no facility for allowing the blocked session to
continue after arranging for the blocker to commit, roll back or have its
session killed.
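
To see the timeout in effect on a given database, the parameter can be read
from V$PARAMETER. This is a small sketch that assumes SELECT access to
V$PARAMETER. A distributed transaction that waits longer than this value
typically fails with ORA-02049 (timeout: distributed transaction waiting for
lock).

```sql
-- Sketch: show the current distributed lock timeout, in seconds.
SELECT name, value
  FROM v$parameter
 WHERE name = 'distributed_lock_timeout';
```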

There are situations where it would be useful for the DBA to be able to
detect whether a session is blocked waiting on a distributed transaction
and to find the session blocking it in the network of distributed databases.
The DBA can then resolve the conflict in the usual manner e.g. by arranging
for the blocker to commit or rollback or by killing the blocking session.
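
Once the blocking session has been located, killing it works as in a
standalone database. The sketch below must be run on the database where the
blocker is connected. The SID 12 and serial number 345 are hypothetical
placeholder values. The statement requires the ALTER SYSTEM privilege.

```sql
-- Sketch: look up the serial# of the blocking session.
-- 12 is a hypothetical blocker SID.
SELECT sid, serial#, username, machine, process
  FROM v$session
 WHERE sid = 12;

-- Kill the blocker using '<sid>,<serial#>' from the query above.
-- 345 is a hypothetical serial#; substitute the real value.
ALTER SYSTEM KILL SESSION '12,345';
```

Killing the session rolls back its transaction, so this option should be
used only after asking the blocker to commit or roll back has failed.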

This Bulletin describes a method and provides a script for performing the
above actions. When presented with a session that is assumed blocked by a
distributed locking conflict, the DBA can run the provided script and
determine the location and the identity of the session blocking the waiter.
With this information available it becomes simple for the DBA to resolve the
locking conflict by use of the same mechanisms used in standalone databases.

SCOPE & APPLICATION
-------------------

The method described in this Bulletin will be useful to DBAs managing
distributed databases as well as Support Analysts and Consultants aiding
such DBAs. It may also interest Developers of Distributed Applications
as a tool for debugging locking conflicts in their applications.

Unlike other lock detection scripts, tools and utilities, it is not ready
to run right away but requires a minimal configuration and installation
procedure, which prepares the distributed environment for running the script.

As with all scripts, the provided code must be reviewed and tested before
being put to use in a production environment.

The options for resolving the distributed locking conflicts detected by this
method are the same as those for locking conflict resolution in a standalone
database environment. The consequences and ramifications of these locking
conflict resolution options must be understood well by anyone performing
them and they must be carried out with due diligence.

EXAMPLES OF LOCKING CONFLICTS IN DISTRIBUTED TRANSACTIONS
---------------------------------------------------------

In this section we will look at a number of examples where distributed
locking conflicts arise. In each case we will see what would be displayed by
the tools for detecting locking conflicts in each of the standalone
databases involved in the distributed transaction and it will be shown how
the combination of this information allows us to detect the distributed
locking conflict.

The examples take place between 2 databases A and B. In each example WA is
the session that is blocked in the distributed locking conflict and is the
user who calls the DBA for help. In all 3 examples this session runs on
database A.

A) A simple case : A non-distributed transaction being blocked by a
   distributed transaction locking a row in its local database.

Session BB starts a distributed transaction on database B. Through a
database link from B to A it locks a row in a table on database A. This
means that a session BA is started for the distributed transaction on
database A and owns the lock. BA and BB belong to the same distributed
transaction DxB.

Session WA starts up on database A. It tries to lock the same row on the
table in database A as the distributed transaction DxB and is blocked
because session BA owns the lock on the row.

                   DxB
             BA<-----------BB
 _________    |         _________
(         ) __V__      (         )
(    A    )|_____|     (    B    )
(_________)  row       (_________)
              X
              |
             WA

The user running WA calls the DBA and reports that his session is blocked
waiting. The DBA looks at Lock Manager output for database A (or runs
utllockt.sql on database A) and can see that WA is being blocked by BA. So,
in this case, the tools (Lock Manager and utllockt.sql) do show the locking
conflict.

In this case, WA is blocked by a session in the same database, A. In this
case the DBA can free WA's session by e.g. killing session BA which results
in the distributed transaction DxB being rolled back.

In this case Lock Manager/utllockt.sql are adequate because the session WA
that started the blocked transaction is on the same database A as the row
held by the distributed transaction which has locked the row. In the
following 2 examples B) and C) this is not the case:

B) Next example: A distributed transaction tries to lock a row in a remote
   database and is blocked by a non-distributed transaction running on that
   database and holding a lock on the row.

Session BB starts a non-distributed transaction on database B and locks a
row in a table on B.

Session WA starts a distributed transaction DxA and through a database link
from A to B tries to lock the same row as session BB. This means that a
session WB is started on database B for DxA (i.e. WA and WB belong to the
blocked distributed transaction DxA). WB tries to lock the row held by BB
and is blocked.

                  BB
 _________         |    _________
(         )      __V__ (         )
(    A    )     |_____|(    B    )
(_________)       row  (_________)
                   X
                   |
    WA----------->WB
          DxA

The user running WA calls the DBA and reports that his session is blocked
waiting. The DBA looks at the Lock Manager/utllockt.sql output for database
A but cannot see WA being blocked by another session. So, in this case, the
tools (Lock Manager and utllockt.sql) do not show WA involved in a locking
conflict.

In this case, WA is blocked by a session in the remote database, B. If the
DBA were to look at Lock Manager/utllockt.sql output for database B he would
be able to see session WB being blocked by BB. However he would not be able
to see from this output that WA and WB belong to the same distributed
transaction DxA and that to solve WA's blockage he would have to solve WB's
blockage by e.g. killing session BB.

Summary: WA is waiting because WB is blocked by BB. Killing BB will unblock
WB and allow distributed transaction DxA started by WA to continue.

In this case Lock Manager/utllockt.sql are not adequate because the session
WA that started the blocked transaction is not on the same database as the
row that it wants to lock.

C) Final example: A distributed transaction tries to lock a row in a remote
   database and is blocked by another distributed transaction holding a lock
   on that row.

Session BA starts a distributed transaction DxA1 on database A and locks a
row in a table on remote database B. In order to do this a session BB is
started on database B and this session owns the lock on the row. Both BA and
BB belong to DxA1.

Session WA starts a distributed transaction DxA2 on database A and tries to
lock the same row on database B as transaction DxA1. This means that a
session WB is started on database B for DxA2. Both WA and WB belong to DxA2.
WB tries to lock the row held by BB and is blocked.

          DxA1
    BA----------->BB
 _________         |    _________
(         )      __V__ (         )
(    A    )     |_____|(    B    )
(_________)       row  (_________)
                   X
                   |
    WA----------->WB
          DxA2

The user running session WA calls the DBA and reports the blocking. The DBA
looks at Lock Manager/utllockt.sql output on database A and sees no locking
conflict. So, in this case, the tools (Lock Manager and utllockt.sql) do not
show WA involved in a locking conflict, although both WA and BA are running
on the same database !

As in case B), WA is blocked by a session in the remote database, B. If the
DBA were to look at Lock Manager/utllockt.sql output for database B he would
be able to see session WB being blocked by BB. However he would not be able
to see from this output that WA and WB belong to the same distributed
transaction DxA2 and that BA and BB belong to the distributed transaction
DxA1 and that to solve WA's blockage he would have to solve WB's blockage by
e.g. killing session BB or by killing session BA.

Summary: WA is waiting because WB is blocked by BB which is controlled by
BA. WA is waiting for WB to get the row. WB is blocked by BB. BA controls
BB. This means that in effect WA is blocked by BA. Killing BB or BA will
rollback DxA1, unblock WB and allow distributed transaction DxA2 started by
WA to continue.

In this case, Lock Manager/utllockt.sql are not adequate because the session
WA that started the blocked transaction is not on the same database as the
row that it wants to lock.

DISTRIBUTED LOCKING CONFLICT DETECTION: SESSION TREES
-----------------------------------------------------

In the examples of the previous section we saw that a distributed
transaction can be blocked if any of the sessions participating in it are
blocked waiting for a lock on a row of a table in the database to which they
are connected.

The sessions that are created in participating databases for a distributed
transaction together make up the Session Tree of the transaction.

In order to detect the locking conflict for a distributed transaction we
would therefore need to check all of the sessions that are started for the
distributed transaction in all of the databases that could take part in
distributed transactions in our network. Then we can check each of these
sessions to see if it is blocked. If one or more of these sessions are
blocked then we know that the distributed transaction as a whole is also
blocked.

Since it is straightforward to detect whether a given session is currently
blocked waiting for a row lock (using the same V$ dynamic views as used by
Lock Manager and utllockt.sql) the only requirement is to determine the
sessions that make up a distributed transaction's session tree.

FINDING THE SESSIONS THAT MAKE UP THE SESSION TREE OF A DISTRIBUTED TRANSACTION
-------------------------------------------------------------------------------

Assume that a distributed transaction is started by a session A with Oracle
Session Identifier SID_A in one of the networked databases. The dynamic view
V$SESSION of the database on which session A runs will have
V$SESSION.SID = SID_A. It will also have two other values,
V$SESSION.MACHINE = MACHINE_A and V$SESSION.PROCESS = PROCESS_A.

V$SESSION.MACHINE contains the name of the client system on which the
program that started the distributed transaction is running.
V$SESSION.PROCESS contains the Process Identifier of the program that
started the distributed transaction on that client system. Together these
two values uniquely identify the program that started the distributed
transaction across the network and they have the useful property of
appearing in the V$SESSION entry of all the sessions in the distributed
transaction's Session Tree.

We look at the V$SESSION dynamic views of all the networked databases for
sessions X where V$SESSION.SID = SID_X and V$SESSION.MACHINE = MACHINE_A and
V$SESSION.PROCESS = PROCESS_A. All these sessions X that we find make up the
Session Tree of the distributed transaction started by session A.

A SCRIPT FOR DETECTING DISTRIBUTED LOCKING CONFLICTS
----------------------------------------------------

The following script contains a function DXBLOCKERS that implements the
distributed locking conflict detection method described above.

Assume that the networked databases are A, B, C and D. A is the Central
database on which the DBA will run the DXBLOCKERS function to check for
locking. On each database a user DXB is created and used for monitoring the
V$SESSION and V$LOCK dynamic views from the Central database.

On database A the user DXB owns the function DXBLOCKERS and a database link
for each of the databases:

  'A@dxb' is a loopback database link from A to A
  'B', 'C' and 'D' are database links to databases B, C and D

In order to create the database links, aliases 'aliasA', 'aliasB', 'aliasC'
and 'aliasD' will need to exist on system A.

On databases B, C and D, user DXB owns no objects of its own.

It uses dynamic SQL (DBMS_SQL) and works with any number of databases.

The script crdxb.sql is to be run on the Central database A. In the script
you will need to replace:

  passwordA-D : SYS passwords for each of the databases
  aliasA-D    : SQL*Net/Net8 aliases from system A to each of the databases
  A-D         : database links from database A to each of the databases

  Note: loopback database link from A to itself is named A@dxb

You may of course add more databases if you need them. Simply repeat the
commands at the beginning of crdxb.sql replacing names as for A, B, C and D.
The definition of function DXBLOCKERS itself requires no alterations.

----- beginning of script crdxb.sql -----

connect sys/passwordA@aliasA;
grant connect,resource,create database link to dxb identified by dxb;
grant select on sys.v_$session to dxb;
grant select on sys.v_$lock to dxb;

connect sys/passwordB@aliasB;
grant connect to dxb identified by dxb;
grant select on sys.v_$session to dxb;
grant select on sys.v_$lock to dxb;

connect sys/passwordC@aliasC;
grant connect to dxb identified by dxb;
grant select on sys.v_$session to dxb;
grant select on sys.v_$lock to dxb;

connect sys/passwordD@aliasD;
grant connect to dxb identified by dxb;
grant select on sys.v_$session to dxb;
grant select on sys.v_$lock to dxb;

connect dxb/dxb@aliasA;

drop database link A@dxb;
drop database link B;
drop database link C;
drop database link D;

create database link A@dxb connect to dxb identified by dxb using 'aliasA';
create database link B connect to dxb identified by dxb using 'aliasB';
create database link C connect to dxb identified by dxb using 'aliasC';
create database link D connect to dxb identified by dxb using 'aliasD';

drop table dblinks;
create table dblinks (dbid number, db varchar2(40));
insert into dblinks values(1, 'A@dxb');
insert into dblinks values(2, 'B');
insert into dblinks values(3, 'C');
insert into dblinks values(4, 'D');
commit;

drop table blockers;
create table blockers(dbid number, wsid varchar2(20), bsid varchar2(20));

----- no need to replace anything after this point -----

create or replace function dxblockers(bsid number, bdb varchar2)
return integer
as
  err       number;
  pos       number;
  dummy     integer;
  cid2      integer;
  cid3      integer;
  cid4      integer;
  cid5      integer;
  machid    varchar2(100);
  clipid    varchar2(20);
  dbname    varchar2(40);
  dbid      number;
  rsid      number;
  blsid     number;
  nblockers number;
  isdx      number := 0;
  cursor cid1 is select dbid, db from dblinks;
  cursor cid6 is select wsid, bsid, db
                   from blockers b, dblinks d
                  where b.dbid = d.dbid;
begin
  dbms_output.put_line('entered, bsid='||bsid||' bdb='||bdb);

  -- Step 1: is the session taking part in a distributed transaction?
  cid4 := dbms_sql.open_cursor;
  dbms_sql.parse(cid4,
    'SELECT sid FROM v$lock@' || bdb ||
    ' WHERE sid=:bsid AND type = ''DX''', dbms_sql.v7);
  dbms_sql.bind_variable(cid4, ':bsid', bsid);
  dbms_sql.define_column(cid4, 1, isdx);
  dummy := dbms_sql.execute(cid4);
  if dbms_sql.fetch_rows(cid4) > 0 then
    dbms_sql.column_value(cid4, 1, isdx);
  end if;
  dbms_sql.close_cursor(cid4);
  if isdx = 0 then
    dbms_output.put_line('not in a DX');
    return -1;
  end if;

  DELETE FROM blockers;

  -- Step 2: get MACHINE and PROCESS of the session that was reported
  cid5 := dbms_sql.open_cursor;
  dbms_sql.parse(cid5,
    'SELECT machine, process FROM v$session@' || bdb ||
    ' WHERE sid=:bsid', dbms_sql.v7);
  dbms_sql.bind_variable(cid5, ':bsid', bsid);
  -- column size matches the declared length of machid (the original
  -- listing used 20, which would truncate longer machine names)
  dbms_sql.define_column(cid5, 1, machid, 100);
  dbms_sql.define_column(cid5, 2, clipid, 20);
  dummy := dbms_sql.execute(cid5);
  if dbms_sql.fetch_rows(cid5) > 0 then
    dbms_sql.column_value(cid5, 1, machid);
    dbms_sql.column_value(cid5, 2, clipid);
  end if;
  dbms_sql.close_cursor(cid5);
  dbms_output.put_line('machine=' || machid);
  dbms_output.put_line('process=' || clipid);
  if machid is null then
    dbms_output.put_line('machid is null, impossible');
    return -1;
  end if;
  if clipid is null then
    dbms_output.put_line('clipid is null, impossible');
    return -1;
  end if;

  -- Step 3: on every database, find the sessions of the Session Tree
  -- and, for each of them, the sessions holding a lock it waits for
  for c1 in cid1 loop
    dbms_output.put_line('trying database no '||c1.dbid||' '||c1.db||
                         ' for machine.process '||machid||'.'||clipid);
    cid2 := dbms_sql.open_cursor;
    dbms_sql.parse(cid2,
      'SELECT sid FROM v$session@' || c1.db ||
      ' WHERE machine = :machid AND process = :clipid', dbms_sql.v7);
    dbms_sql.bind_variable_char(cid2, ':machid', machid);
    dbms_sql.bind_variable_char(cid2, ':clipid', clipid);
    dbms_sql.define_column(cid2, 1, rsid);
    dummy := dbms_sql.execute(cid2);
    loop
      if dbms_sql.fetch_rows(cid2) > 0 then
        dbms_sql.column_value(cid2, 1, rsid);
        dbms_output.put_line('local sid for DX = ' || rsid);
        cid3 := dbms_sql.open_cursor;
        dbms_sql.parse(cid3,
          'SELECT B.sid FROM v$lock@' || c1.db || ' B, ' ||
          ' v$lock@' || c1.db ||
          ' W WHERE B.lmode <> 0 AND W.sid = :rsid AND ' ||
          ' B.id1 = W.id1 ' ||
          ' AND B.id2 = W.id2 AND W.lmode = 0', dbms_sql.v7);
        dbms_sql.bind_variable(cid3, ':rsid', rsid);
        dbms_sql.define_column(cid3, 1, blsid);
        dummy := dbms_sql.execute(cid3);
        loop
          if dbms_sql.fetch_rows(cid3) > 0 then
            dbms_sql.column_value(cid3, 1, blsid);
            dbms_output.put_line('found blocker sid = '||blsid);
            INSERT INTO blockers VALUES (c1.dbid, rsid, blsid);
          else
            exit;
          end if;
        end loop;
        dbms_sql.close_cursor(cid3);
      else
        exit;
      end if;
    end loop;
    dbms_sql.close_cursor(cid2);
  end loop;

  -- Step 4: report the blockers that were found
  select count(*) into nblockers from blockers;
  dbms_output.put_line('total blockers = '||nblockers);
  if nblockers > 0 then
    dbms_output.put_line('Distributed Transaction of session with SID ' ||
                         bsid || ' in database ' || bdb);
    dbms_output.put_line('is blocked as follows :');
    for c6 in cid6 loop
      dbms_output.put_line('Session with SID ' || c6.wsid || ' on ' ||
                           c6.db || ' is blocked by ' || c6.bsid);
    end loop;
  end if;
  return nblockers;
exception
  when others then
    err := dbms_sql.last_sql_function_code;
    pos := dbms_sql.last_error_position;
    dbms_output.put_line('EXCEPTION: err='||err||' pos='||pos);
    -- added: a function must return a value on every path
    return -1;
end;
/

----- end of script crdxb.sql -----

EXAMPLE OF CALLING DXBLOCKERS
-----------------------------

A user calls the DBA and complains that his session is blocked. The DBA
finds from the user his name and database to which he is connected. He then
gets the user's SID from e.g. V$SESSION or Top Sessions. Finally he calls
DXBLOCKERS while connected as DXB on the Central database (where DXBLOCKERS
is installed). Parameters are the user's SID and the database link to the
user's database from Central, as recorded in table DBLINKS.

For example, if the DBA wants to see if the session with SID 15 on the
Central database A is involved in a distributed locking conflict he would
execute DXBLOCKERS as follows:

set serveroutput on;
variable nbl number;
begin
  :nbl:=dxblockers(15,'A@dxb');
end;
/

entered, bsid=15 bdb=A@dxb
process=429416553
trying database no 1 A@dxb for machine.process ALEXPC.429416553
local sid for DX = 15
trying database no 2 B for machine.process ALEXPC.429416553
local sid for DX = 18
found blocker sid = 12
total blockers = 1
Distributed Transaction of session with SID 15 in database A@dxb
is blocked as follows :
Session with SID 18 on B is blocked by 12

PL/SQL procedure successfully completed.

In this case the blocking session has SID 12 and is in database B. The DBA
can now decide whether to kill this session or ask the complaining user to
rollback and retry his transaction later.

If the user was not involved in a distributed locking conflict, the output
would be:

entered, bsid=15 bdb=A@dxb
not in a DX

PL/SQL procedure successfully completed.

RELATED DOCUMENTS
-----------------

Note 15476.1 : Detecting and Resolving Locking Conflicts
Note 33453.1 : Referential Integrity and Locking