Deductive Database
A deductive database is a combination of a conventional database containing facts, a knowledge base containing rules, and an inference engine which allows the derivation of information implied by the facts and rules. Commonly, the knowledge base is expressed in a subset of first-order logic, and either an SLDNF or a Datalog inference engine is used.
Rule
Head (involving output relation) :- Body (involving input relation [, output relation])
([ ] means optional; :- denotes logical implication)
Terminology:
Body: the right side of :- is called the body of the rule.
Head: the left side of :- is called the head of the rule.
Output relation: the relation defined by the rule.
Input relation: a relation that already exists before the rule is applied.
Understanding/Interpretation
If the tuples mentioned in the body exist in the database, then the tuples mentioned in the head of the rule must also be in the database. In short, if the body is true, then the head is true.
We can view the Datalog rules as a function f which maps an instance of the input relation to an instance of the output relation:
  f: input instance -> output instance
Example 2:
  Components(Part, Subpart) :- Assembly(Part, Subpart, Qty).
  Components(Part, Subpart) :- Assembly(Part, Part2, Qty), Components(Part2, Subpart).
These two rules in Datalog recursively define a relation named Components.
Understanding/Interpretation
Rule 1: For all values of Part, Subpart, Qty, if there is a tuple <Part, Subpart, Qty> in Assembly, then there is a tuple <Part, Subpart> in Components.
Rule 2: For all values of Part, Part2, Subpart, Qty, if there is a tuple <Part, Part2, Qty> in Assembly and a tuple <Part2, Subpart> in Components, then there is a tuple <Part, Subpart> in Components.
Here Components is the output relation and Assembly is the input relation.
Each rule can be used to infer, or deduce, new tuples for the output relation, so database systems that support Datalog rules are often called deductive database management systems.
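The view of the two Components rules as a function from instances to instances can be made concrete with a small sketch. This is only an illustration: the Assembly tuples below are a made-up sample, and the name apply_rules is hypothetical, not something the notes define.

```python
# Hypothetical sample instance of Assembly(Part, Subpart, Qty); not from the notes.
ASSEMBLY = {
    ("trike", "wheel", 3),
    ("trike", "frame", 1),
    ("wheel", "spoke", 2),
    ("wheel", "tire", 1),
}


def apply_rules(assembly, components):
    """Apply both Components rules once to the given instances."""
    # Rule 1: Components(Part, Subpart) :- Assembly(Part, Subpart, Qty).
    derived = {(part, sub) for (part, sub, _qty) in assembly}
    # Rule 2: Components(Part, Subpart) :-
    #             Assembly(Part, Part2, Qty), Components(Part2, Subpart).
    derived |= {
        (part, sub)
        for (part, part2, _qty) in assembly
        for (mid, sub) in components
        if mid == part2
    }
    return derived


first = apply_rules(ASSEMBLY, set())   # only Rule 1 can fire on an empty Components
second = apply_rules(ASSEMBLY, first)  # Rule 2 now adds the indirect subparts
```

On this sample, the first application yields the four direct part/subpart pairs, and the second adds the pairs that link trike to spoke and tire through wheel.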
The meaning of a Datalog program is usually defined in two different ways, both of which essentially describe the relation instances of the output relations.
Least model semantics: declarative; a way to understand the program without thinking about how the program is executed.
Model: a collection of relation instances, one instance for each relation in the program, which satisfies the rules of the program.
Least model: a model M of a program such that for any other model M2 of the same program, for each relation R in the program, the instance for R in M is contained in the instance for R in M2.
Example 3:
<contents of example>
Least fixpoint semantics
Fixpoint of a real-valued function f (R -> R): a value x such that f(x) = x.
  sin(0) = 0
  cos(0.9998477415310881129598107686798) = 0.9998477415310881129598107686798 (with the argument in degrees)
Function from sets to sets, f (set -> set): f maps A to B, where B = { f(x) : x in A }.
  double: multiply every element of the input set by 2
  double{1,2,3} = {2,4,6}
  double+: double union the input set
  double+{1,2,3} = {1,2,3,4,6}
  double+{1,2,4,6,8,...} = {1,2,4,6,8,...} = A
  double+{2,4,6,8,...} = {2,4,6,8,...} = B
  double+{4,6,8,...} = {4,6,8,...} = C
All of A, B and C are fixpoints of the function double+, so a fixpoint need not be unique; C is a subset of B, which in turn is a subset of A.
Least fixpoint of a function: a fixpoint that is a subset of every other fixpoint of the function.
Does a function which has a fixpoint always have a least fixpoint? What is the least fixpoint of the function double+ defined above?
A relation is just a set of tuples, so the same idea applies to Components. The recursive rule corresponds to
  Components = π(Assembly ⋈[Assembly.Subpart = Components.Part] Components)
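The set functions double and double+ discussed above can be tried out on finite sets with a short sketch. The Python names double and double_plus are just transliterations chosen for this illustration, and the infinite sets A, B and C cannot be represented this way, so only finite inputs are shown.

```python
def double(s):
    """Multiply every element of the input set by 2."""
    return {2 * x for x in s}


def double_plus(s):
    """double union the input set."""
    return double(s) | set(s)


def is_fixpoint(f, s):
    """A set s is a fixpoint of f when f(s) equals s."""
    return f(s) == s


example = double_plus({1, 2, 3})           # the example from the notes
empty_is_fixed = is_fixpoint(double_plus, set())
finite_not_fixed = is_fixpoint(double_plus, {1, 2, 3})
```

Since double+ maps the empty set to itself and the empty set is a subset of every set, the empty set is a fixpoint contained in every other fixpoint.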
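Or Components = f(Components): a recursive definition (the input relation Assembly is given, so f depends only on Components).
Compute the fixpoint by repeated application: f(f(f(...f(a)...))) approaches x with x = f(x).
For example, sin(sin(sin(...sin(1)...))) approaches 0 = sin(0).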
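Repeated application can be sketched numerically for the real-valued examples. The helper name iterate is hypothetical. The cosine here takes its argument in degrees, which is an assumption made because the value of roughly 0.99985 quoted earlier is the fixpoint in that setting.

```python
import math


def iterate(f, x, steps):
    """Apply f to x repeatedly, steps times, and return the final value."""
    for _ in range(steps):
        x = f(x)
    return x


def cos_deg(x):
    """Cosine with the argument interpreted in degrees (an assumption)."""
    return math.cos(math.radians(x))


# Iterating the degree-mode cosine settles quickly on its fixpoint.
fix_cos = iterate(cos_deg, 1.0, 50)

# Iterating sin from 1 creeps towards the fixpoint 0, but only slowly.
sin_val = iterate(math.sin, 1.0, 10000)
```

The sine iteration illustrates that repeated application may only approach a fixpoint in the limit, whereas the Datalog case over finite relations stops after finitely many steps.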
Example 4: Compute the fixpoint for Components = f(Components).
We use Components_n to denote the instance of Components after the n-th application of the function, that is, of the Datalog rules.
Step 1: Initialize Components to get Components_0.
Step 2: Apply the first rule; we get Components_1, with [how many] new tuples added.
Step 3: Apply the second rule; we get Components_2, with [how many] new tuples added.
Step 4: Apply the second rule; we get Components_3, with [how many] new tuples added.
Step 5: Apply the second rule; we get Components_4. Note: this time no new tuples are generated, so we stop.
Finally, Components_4 is a fixpoint; if we start from an empty Components_0, then Components_4 is the least fixpoint.
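The stepwise computation above can be sketched as a loop that stops when an application adds nothing new. The Assembly instance is a made-up sample, the function names are hypothetical, and for brevity each round applies both rules at once rather than Rule 1 first and Rule 2 afterwards, so the round numbering need not match the steps above.

```python
# Hypothetical sample instance of Assembly(Part, Subpart, Qty); not from the notes.
ASSEMBLY = {
    ("trike", "wheel", 3), ("trike", "frame", 1),
    ("frame", "seat", 1), ("frame", "pedal", 1),
    ("wheel", "spoke", 2), ("wheel", "tire", 1),
    ("tire", "rim", 1), ("tire", "tube", 1),
}


def apply_rules(assembly, components):
    """One application of both Components rules."""
    derived = {(part, sub) for (part, sub, _qty) in assembly}
    derived |= {
        (part, sub)
        for (part, part2, _qty) in assembly
        for (mid, sub) in components
        if mid == part2
    }
    return derived


def least_fixpoint(assembly):
    """Start from the empty relation and apply the rules until nothing changes."""
    components = set()
    sizes = [0]  # size of Components after each round that added tuples
    while True:
        nxt = components | apply_rules(assembly, components)
        if nxt == components:  # no new tuples: a fixpoint has been reached
            return components, sizes
        components = nxt
        sizes.append(len(components))


components, sizes = least_fixpoint(ASSEMBLY)
```

Because the loop starts from the empty relation and only ever adds tuples that the rules force, the result is contained in every other fixpoint for the same Assembly instance.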
Fixpoint 1: Big = {trike}; Small = {frame, wheel, tire}
Fixpoint 2: Big = {}; Small = {trike, frame, wheel, tire}
Neither is smaller than the other.
Tables that do not depend on any other tables are in stratum 0.
Tables that do not appear in lower strata, depend only on tables in stratum n or lower strata, and depend negatively only on tables in lower strata are in stratum n+1.
Stratified program: a program whose tables can be classified into strata according to the above algorithm.
A stratified program is evaluated stratum by stratum, starting with stratum 0.
The Big/Small program in Example [] is not stratified. How can it be changed into a stratified program? Just remove one not.
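A check for stratification can be sketched as follows. This uses a common iterative labelling in which a table's stratum only has to rise above the tables it depends on negatively, so the stratum numbers may differ from the numbering described above. The dependency lists for Big and Small are assumptions about the shape of the Big/Small program, which is not reproduced here, and the name stratify is hypothetical.

```python
def stratify(deps):
    """deps maps each derived table to a list of (table, negated) dependencies.

    Returns a dict from table name to stratum number, or None when the
    tables cannot be stratified (some table depends negatively on itself
    through a cycle).
    """
    tables = set(deps) | {t for body in deps.values() for (t, _neg) in body}
    stratum = {t: 0 for t in tables}
    changed = True
    while changed:
        changed = False
        for head, body in deps.items():
            for table, negated in body:
                # A negated dependency must sit in a strictly lower stratum.
                needed = stratum[table] + (1 if negated else 0)
                if needed > stratum[head]:
                    stratum[head] = needed
                    changed = True
                    if needed > len(tables):  # can only happen with a negative cycle
                        return None
    return stratum


# Assumed shape: Big and Small each depend negatively on the other.
big_small = {
    "Big": [("Assembly", False), ("Small", True)],
    "Small": [("Assembly", False), ("Big", True)],
}
# Same program with one negation removed from the rule for Big.
one_not_removed = {
    "Big": [("Assembly", False)],
    "Small": [("Assembly", False), ("Big", True)],
}

unstratified = stratify(big_small)
strata = stratify(one_not_removed)
```

With the mutual negation the labels keep growing and the function reports None; with one negation removed, Small ends up one stratum above Big.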
Result(y) :- R(x,y), x=c
Result(x) :- R(x,y)
Result(x,y,u,v) :- R(x,y), S(u,v)
Result(x,y) :- R(x,y)
Result(u,v) :- S(u,v)
Result(x,y) :- R(x,y), not S(x,y)
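Each of the rules above corresponds to a familiar operation on relations, which the following sketch mirrors with Python sets. The contents of R and S and the constant c are made-up sample values, and the variable names on the left are labels chosen for this illustration.

```python
# Hypothetical sample relations and constant; not from the notes.
R = {(1, 2), (3, 4)}
S = {(3, 4), (5, 6)}
c = 1

# Result(y) :- R(x,y), x=c            selection on x, then projection on y
selection = {y for (x, y) in R if x == c}

# Result(x) :- R(x,y)                 projection on the first column
projection = {x for (x, y) in R}

# Result(x,y,u,v) :- R(x,y), S(u,v)   cross product
product = {(x, y, u, v) for (x, y) in R for (u, v) in S}

# Result(x,y) :- R(x,y)  together with  Result(u,v) :- S(u,v)   union
union = R | S

# Result(x,y) :- R(x,y), not S(x,y)   set difference
difference = R - S
```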
8. Datalog with Aggregation
Datalog can be extended with SQL-style grouping and aggregation operations such as SUM and COUNT:
  NumParts(Part, SUM(<Qty>)) :- Assembly(Part, Subpart, Qty).
The equivalent SQL query:
  SELECT Part, SUM(Qty) FROM Assembly GROUP BY Part
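The NumParts rule groups the Assembly tuples by Part and sums Qty within each group. A minimal sketch of that computation, using a made-up Assembly instance and a hypothetical function name num_parts:

```python
from collections import defaultdict

# Hypothetical sample instance of Assembly(Part, Subpart, Qty); not from the notes.
ASSEMBLY = {
    ("trike", "wheel", 3),
    ("trike", "frame", 1),
    ("wheel", "spoke", 2),
    ("wheel", "tire", 1),
}


def num_parts(assembly):
    """Group by Part and sum Qty, as in NumParts(Part, SUM(<Qty>))."""
    totals = defaultdict(int)
    for part, _subpart, qty in assembly:
        totals[part] += qty
    return dict(totals)


totals = num_parts(ASSEMBLY)
```

Every Assembly tuple contributes its own Qty to the sum, so two subparts of the same part with equal quantities are both counted.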
Do we need to compute the parts on the same level as the subparts of spoke? Do we need to compute the parts on the same level as the nodes which are on the path from spoke to the root trike?
Rewrite the program:
  Magic_SameLevel(P) :- Magic_SameLevel(S), Assembly(P, S, Q).
  Magic_SameLevel(spoke) :- .   (the body is empty)
  SameLevel(S1, S2) :- Assembly(P1, S1, Q1), Assembly(P1, S2, Q2), Magic_SameLevel(S1).
  SameLevel(S1, S2) :- Assembly(P1, S1, Q1), Assembly(P2, S2, Q2), SameLevel(P1, P2), Magic_SameLevel(S1).
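The effect of the rewritten program can be sketched by evaluating it directly: first compute Magic_SameLevel from the seed spoke, then compute SameLevel only for first arguments that appear in it. The Assembly instance is a made-up sample and the function names are hypothetical; this is a naive evaluation written for illustration, not an efficient one.

```python
# Hypothetical sample instance of Assembly(Part, Subpart, Qty); not from the notes.
ASSEMBLY = {
    ("trike", "wheel", 3), ("trike", "frame", 1),
    ("frame", "seat", 1), ("frame", "pedal", 1),
    ("wheel", "spoke", 2), ("wheel", "tire", 1),
    ("tire", "rim", 1), ("tire", "tube", 1),
}


def magic_same_level(assembly, seed):
    """Magic_SameLevel: the seed plus every part on the path up to the root."""
    magic = {seed}
    while True:
        new = {p for (p, s, _q) in assembly if s in magic} - magic
        if not new:
            return magic
        magic |= new


def same_level(assembly, magic):
    """SameLevel restricted to first arguments that occur in the magic set."""
    result = set()
    while True:
        new = set()
        for (p1, s1, _q1) in assembly:
            if s1 not in magic:  # the Magic_SameLevel(S1) filter
                continue
            for (p2, s2, _q2) in assembly:
                # First rule: same parent. Second rule: parents on the same level.
                if p1 == p2 or (p1, p2) in result:
                    new.add((s1, s2))
        if new <= result:
            return result
        result |= new


magic = magic_same_level(ASSEMBLY, "spoke")
pairs = same_level(ASSEMBLY, magic)
spoke_level = {s2 for (s1, s2) in pairs if s1 == "spoke"}
```

On this sample, no SameLevel tuple is derived whose first argument lies off the path from spoke to trike, such as rim or seat, which is the saving the magic filter is meant to provide.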
Q&A:
1. Consider the schema: Parent(parent, child). Which of the following statements is not true?
A. We can find all grandsons and granddaughters of some person with SQL-92, although it doesn't support recursive queries.
B. We cannot find all descendants of a person with relational algebra, because it doesn't support recursive queries.
C. Define a Datalog program:
  Cousin(x, y) :- Parent(p, y), Parent(p, x).
  Cousin(y, x) :- Parent(p1, x), Parent(p2, y), Cousin(p1, p2).
To find all siblings and cousins (not only cousins-german) John has, we can avoid repeated inferences by computing only tuples of Cousin whose first field is John.
D. The Parent relation is not recursively defined, but the Cousin relation defined in (C) is.
E. Relational algebra and Datalog programs are not equivalent, although every relational algebra query can be expressed by a range-restricted and stratified Datalog program.
------------------------------------------ END--------------------------------------------------