----------------------------------------------------------------------------------
CREATE TRIGGER Tb
AFTER DELETE ON City
FOR EACH ROW
BEGIN
DELETE FROM PointOfInterest WHERE City=old.Name;
END;
----------------------------------------------------------------------------------
CREATE TRIGGER Tc
BEFORE INSERT ON DirectConnection
FOR EACH ROW
WHEN new.Distance > ANY ( SELECT 3 * (C1.Distance + C2.Distance)
                          FROM DirectConnection C1, DirectConnection C2
                          WHERE C1.FromPOI1 = new.FromPOI1 AND C2.ToPOI2 = new.ToPOI2
                            AND C1.ToPOI2 = C2.FromPOI1 )
RAISE_EXCEPTION(…);
----------------------------------------------------------------------------------
CREATE TRIGGER Td
AFTER DELETE ON PointOfInterest
FOR EACH ROW
WHEN NOT EXISTS (SELECT * FROM PointOfInterest WHERE City=old.City)
DELETE FROM City WHERE Name=old.City;
----------------------------------------------------------------------------------
Triggering graph (arrow layout lost in extraction): Ta, Tb, Tc, Td, with the cycle Tb ⇄ Td
(Tb's deletions on PointOfInterest fire Td; Td's DELETE on City fires Tb).
The triggering graph is cyclic but there is no risk of nontermination, because all the actions involved in the cycle are
deletions, and – at worst – the database is emptied in a finite number of iterations.
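The termination argument can be sketched as a toy simulation (table contents and names are invented for illustration): every trigger action in the cycle only deletes rows, so each firing strictly shrinks the database until the cascade stops.

```python
# Hypothetical mini-database: City(Name) and PointOfInterest(Name, City).
cities = {"Lisbon", "Porto"}
pois = {("Tower", "Lisbon"), ("Bridge", "Porto")}

def delete_city(name):
    """Mimics Tb: deleting a City cascades to its POIs (each firing Td)."""
    global pois
    doomed = {(p, c) for (p, c) in pois if c == name}
    pois -= doomed
    for _, c in doomed:
        td_after_poi_delete(c)

def td_after_poi_delete(city):
    """Mimics Td: if a city has no POIs left, delete the city (firing Tb)."""
    if city in cities and not any(c == city for _, c in pois):
        cities.discard(city)
        delete_city(city)  # Tb fires again, but on a strictly smaller database

delete_city("Lisbon")
print(cities, pois)  # {'Porto'} {('Bridge', 'Porto')}
```

Since both triggers can only remove tuples, the recursion depth is bounded by the number of rows: at worst the database is emptied, exactly as argued above.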
B.
The two executions return different outcomes (one detects a deadlock, the other does not). This sounds absurd and might make us doubt the correctness of the algorithm… but we must have faith in the Masters!
The reason for this oddity is that the initial conditions are inconsistent: node A invoked a sub-transaction on node C, but node C is unaware that t1 is a sub-transaction invoked by node A! In simpler words, an initial condition is missing on node C: EA → t1.
And of course, on corrupted data even the best algorithm may produce untrustworthy results.
C.
1. We just count the schedules having any read followed by a write on the same resource by the same transaction
count( for $s in //Schedule
       where some $o1 in $s/Operation[@type="r"],
                  $o2 in $s/Operation[@type="w"]
             satisfies ( $o1 << $o2 and $o1/Resource = $o2/Resource
                         and $o1/TransactionId = $o2/TransactionId )
       return <foo/> )
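The same count can be sketched in Python over an invented in-memory encoding of schedules, each operation being a (type, transaction, resource) tuple in execution order:

```python
# Toy data, invented for illustration.
schedules = [
    [("r", 1, "x"), ("w", 1, "x")],   # r1(x) before w1(x): counted
    [("w", 2, "y"), ("r", 2, "y")],   # write comes first: not counted
    [("r", 3, "x"), ("w", 4, "x")],   # different transactions: not counted
]

def has_read_before_write(schedule):
    """True if some read precedes a write on the same resource by the same txn."""
    return any(
        o1[0] == "r" and o2[0] == "w" and o1[1] == o2[1] and o1[2] == o2[2]
        for i, o1 in enumerate(schedule)
        for o2 in schedule[i + 1:]
    )

count = sum(1 for s in schedules if has_read_before_write(s))
print(count)  # 1
```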
2. Equivalence holds if the schedules are permutations of one another (same set of operations) and order every pair of conflicting operations in the same way (same conflicts)
N.B.: this formulation aims at maximizing readability and maintainability, not efficiency
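The conflict-equivalence test can be sketched in Python, again with an invented (type, transaction, resource) tuple encoding for operations:

```python
from collections import Counter

def conflicts(schedule):
    """Ordered pairs of conflicting operations: same resource, different
    transactions, at least one write, taken in schedule order."""
    return {
        (o1, o2)
        for i, o1 in enumerate(schedule)
        for o2 in schedule[i + 1:]
        if o1[2] == o2[2] and o1[1] != o2[1] and "w" in (o1[0], o2[0])
    }

def conflict_equivalent(s1, s2):
    """Same multiset of operations and same ordering of every conflict pair."""
    return Counter(s1) == Counter(s2) and conflicts(s1) == conflicts(s2)

# Swapping the two non-conflicting reads of t1 preserves every conflict pair:
s1 = [("r", 1, "x"), ("w", 2, "x"), ("r", 1, "y")]
s2 = [("r", 1, "x"), ("r", 1, "y"), ("w", 2, "x")]
print(conflict_equivalent(s1, s2))  # True
```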
D.
The peculiarity of this query is that any two POIs in Lisbon must be joined with one another (a sort of Lisbon self-join)
and checked against the connections to see whether they are close enough.
Two main strategies are possible:
str1 : scan the connections and, for the "short" ones (25% of them), check whether the connected POIs are both in Lisbon
str2 : build the "Cartesian product" of the Lisbon POIs and use the hash to look up whether each pair is connected and close
(a)
str1 is ineffective here, as there is no way to look up a POI without a full scan of the table:
12K + 25% ∙ 1.5M ∙ 2 ∙ 1.2K = BOOOOM !
A similar idea is to perform a sort of nested loop that, for every block of Connection, scans POI and joins all:
12K + 12K ∙ 1.2K = 14.4 M (still very costly – I won't even try to estimate the number of blocks of Connection with no arcs shorter than 1000, if any…)
str2 needs to scan POI many times… with a self-nested-loop… and, with 100 Lisbon POIs, 100 ∙ (100 − 1) = 9,900 lookups
1.2K + 1.2K ∙ 1.2K + (100 ∙ 99 ∙ 1 ) = 1.45 M
…unless we cache the 100 Lisbon POI ids: it only takes one or two extra pages in main memory!
1.2K + (100 ∙ 99 ∙ 1 ) = 11.1 K
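The back-of-the-envelope numbers above, spelled out (block counts taken from the text: 12K Connection blocks, 1.2K POI blocks, 100 Lisbon POIs, one I/O per hash lookup):

```python
CONN_BLOCKS = 12_000   # blocks of Connection
POI_BLOCKS = 1_200     # blocks of PointOfInterest
LISBON_POIS = 100      # POIs located in Lisbon

# str1 as a nested loop: for every Connection block, scan all of POI
nested_loop = CONN_BLOCKS + CONN_BLOCKS * POI_BLOCKS
assert nested_loop == 14_412_000            # ≈ 14.4 M

# str2: one POI scan per outer Lisbon POI, plus a hash lookup per ordered pair
lookups = LISBON_POIS * (LISBON_POIS - 1)   # 9,900
str2 = POI_BLOCKS + POI_BLOCKS * POI_BLOCKS + lookups
assert str2 == 1_451_100                    # ≈ 1.45 M

# str2 with the 100 Lisbon POI ids cached: a single POI scan suffices
str2_cached = POI_BLOCKS + lookups
assert str2_cached == 11_100                # = 11.1 K
```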
(b)
str1 : The B+-tree helps to quickly extract (and possibly cache) the 100 POI ids without scanning the table, at a cost of:
2 (interm. nodes) + 3 (leaf nodes) + 100 (pointers) = 105
We could then scan Connection and immediately identify the wanted arcs: 12K + 105 = 12.1 K
str2 : As the B+ makes it faster to cache the 100 POI ids, we then perform the lookups based on the cached POIs:
2 (interm. nodes) + 3 (leaf nodes) + 100 (pointers) + 9.9K (lookups) = 10 K
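And the corresponding arithmetic for part (b), with the index access figures stated above (2 intermediate nodes, 3 leaves, 100 record pointers):

```python
INTERMEDIATE = 2       # B+-tree intermediate nodes read
LEAVES = 3             # leaf nodes holding the 100 Lisbon POI entries
POINTERS = 100         # record pointers followed, one per Lisbon POI
CONN_BLOCKS = 12_000   # blocks of Connection
LOOKUPS = 100 * 99     # hash lookups over the ordered Lisbon-POI pairs

index_cost = INTERMEDIATE + LEAVES + POINTERS
assert index_cost == 105

str1 = index_cost + CONN_BLOCKS   # index access, then scan Connection
assert str1 == 12_105             # ≈ 12.1 K

str2 = index_cost + LOOKUPS       # index access, then pairwise lookups
assert str2 == 10_005             # ≈ 10 K, the cheapest plan of all
```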