Overview of AI Techniques

There are various techniques that have evolved that can be applied to a variety of AI tasks - these will be the focus of this course. These techniques are concerned with how we represent, manipulate and reason with knowledge in order to solve problems:

• Knowledge Representation
• Search

Knowledge Representation
Knowledge representation is crucial. One of the clearest results of artificial intelligence research so far is that solving even apparently simple problems requires lots of knowledge. Really understanding a single sentence requires extensive knowledge both of language and of the context. For example, the December 29th 2002 headline "It's President Kibaki'' can only be interpreted reasonably if you know it was the day after Kenya's general elections. Really understanding a visual scene similarly requires knowledge of the kinds of objects in the scene. Solving problems in a particular domain generally requires knowledge of the objects in the domain and knowledge of how to reason in that domain - both these types of knowledge must be represented.

Knowledge must be represented efficiently, and in a meaningful way. Efficiency is important, as it would be impossible (or at least impractical) to explicitly represent every fact that you might ever need. There are just so many potentially useful facts, most of which you would never even think of. You have to be able to infer new facts from your existing knowledge, as and when needed, and capture general abstractions which represent general features of sets of objects in the world.

Knowledge must be meaningfully represented so that we know how it relates back to the real world. A knowledge representation scheme provides a mapping from features of the world to a formal language. (The formal language will just capture certain aspects of the world, which we believe are important to our problem - we may of course miss out crucial aspects and so fail to really solve our problem, like ignoring friction in a mechanics problem.) When we manipulate that formal language using a computer we want to make sure that we still have meaningful expressions, which can be mapped back to the real world. This is what we mean when we talk about the semantics of representation languages.

Search
Another crucial general technique required when writing AI programs is search. Often there is no direct way to find a solution to some problem. However, you do know how to generate possibilities. For example, in solving a puzzle you might know all the possible moves, but not the sequence that would lead to a solution. When working out how to get somewhere you might know all the roads/buses/trains, just not the best route to get you to your destination quickly. Developing good ways to search through these possibilities for a good solution is therefore vital. Brute force techniques, where you generate and try out every possible solution, may work, but are often very inefficient, as there are just too many possibilities to try. Heuristic techniques are often better: you only try the options which you think (based on your current best guess) are most likely to lead to a good solution.
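To make the contrast concrete, here is a minimal sketch (purely illustrative - the road network and the distance estimates are made-up example data, and the notes do not prescribe any particular algorithm) of a heuristic, best-first search through generated possibilities, in Python:

import heapq

# Instead of trying every possibility blindly, always expand the option that our
# current best guess (a heuristic estimate of remaining distance) says is most promising.
roads = {                      # town -> list of (neighbouring town, road length)
    "A": [("B", 4), ("C", 2)],
    "B": [("D", 5)],
    "C": [("D", 8), ("B", 1)],
    "D": [],
}
estimate = {"A": 7, "B": 5, "C": 6, "D": 0}   # heuristic guess of distance to D

def best_first(start, goal):
    frontier = [(estimate[start], start, [start])]   # options ordered by heuristic guess
    visited = set()
    while frontier:
        _, town, path = heapq.heappop(frontier)      # try the most promising option first
        if town == goal:
            return path
        if town in visited:
            continue
        visited.add(town)
        for neighbour, _ in roads[town]:
            heapq.heappush(frontier, (estimate[neighbour], neighbour, path + [neighbour]))
    return None

print(best_first("A", "D"))   # ['A', 'B', 'D']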

These are edited and compiled notes. Source:

http://www.cee.hw.ac.uk/~alison/ai3notes/chapter2_4.html by Alison Cawsey


Intro to Knowledge Representation and Inference

As mentioned in our lectures, one of the assumptions underlying work in Artificial Intelligence is that intelligent behaviour can be achieved through the manipulation of symbol structures (representing bits of knowledge). These symbols can be represented on any medium - in principle, we could develop a (very slow) intelligent machine made out of empty soda cans (plus something to move the soda cans around). However, computers provide the representational and reasoning powers whereby we might realistically expect to make progress towards automating intelligent behaviour.

In AI, the main question now is how we can represent knowledge as symbol structures and use that knowledge to intelligently solve problems, using particular knowledge representation languages. The next few parts will concentrate on how we represent knowledge, and the remaining parts more on how we solve problems, using general knowledge of problem solving and domain knowledge.

A good knowledge representation language should have at least the following features:

• It should allow you to express the knowledge you wish to represent in the language. For example, suppose you want to represent the fact that "Richard knows how old he is''. This turns out to be difficult to express in some languages.

• It should allow new knowledge to be inferred from a basic set of facts, as discussed above. We can't represent explicitly everything that the system might ever need to know - some things should be left implicit, to be deduced by the system as and when needed in problem solving. For example, if we were representing facts about a particular MSc Info. Sys. student (say Mutua), we don't want to have to explicitly record the fact that Mutua is studying AI. For our MSc, all students are, almost by definition, so we should be able to deduce it; otherwise we'd have 70 statements representing the fact that each student studies AI. Similarly, you probably wouldn't explicitly represent the fact that on Sundays the University is closed but the Nairobi Hilton is open. You can deduce these things from your general knowledge about the world. Representing everything explicitly would be extremely wasteful of memory, and most of these facts would never be used. However, if we DO need to know if Mutua studies AI we want to be able to get at that information efficiently. We also would like to be able to make more complex inferences - maybe that Mutua should be attending a lecture at 5:30 pm on Tuesday Feb 18th, so he won't be able to have a lab session then. In general, there is a tradeoff between inferential power (what we can infer) and inferential efficiency (how quickly we can infer it), so we may choose to have a language where simple inferences can be made quickly, though complex ones are not possible.

• It should be clear. We want to know what the allowable expressions are in the language, and what they mean. For example, if we have a fact grey(elephant) we want to know whether it means all elephants are grey, some particular one is grey, or what. Otherwise we won't be sure if our inferences are correct, or what the results mean.

So, the crucial thing about knowledge representation languages is that they should support inference. Some of these features may be present in recent non-AI representation languages, such as deductive and object oriented databases, which aim to allow robust, multi-user knowledge/data bases with well defined semantics and flexible representation and inference capabilities. These systems have been influenced by early AI research on knowledge representation, and there is some promise of further cross-fertilization of ideas. However, at present the fields are still largely separate, and we will only be discussing basic AI approaches here.

Broadly speaking, there are three main approaches to knowledge representation in AI: logic, structured objects and production systems. These are high-level representation formalisms, and can in principle be implemented using a whole range of programming languages. The most important is arguably the use of logic. A logic has a well defined syntax and semantics, and is concerned with truth preserving inference. However, using logic to represent things has problems. On the one hand, representing some common-sense things in a logic can be very hard. For example, in first order predicate logic we can't conclude that something is true one minute and then later decide that it isn't true after all - if we did this it would lead to a contradiction, from which we could prove anything at all! We could decide to use more complex logics which allow this kind of reasoning - there are all sorts of logics out there. On the other hand, it may not be very efficient: if we just want a very restricted class of inferences, we may not want the full power of a logic-based theorem prover.

So, another approach is to abandon the constraints that the use of a logic imposes and use a less clean, but more flexible, knowledge representation language. Two such "languages'' are structured objects and production systems. The idea of structured objects is to represent knowledge as a collection of objects and relations, the most important relations being the subclass and instance relations. Production systems consist of a set of if-then rules and a working memory. The working memory represents the facts that are currently believed to hold, while the if-then rules typically state that if certain conditions hold (e.g. certain facts are in the working memory), then some action should be taken (e.g. other facts should be added or deleted). If the only action allowed is to add a fact to working memory then rules may be essentially logical implications, but generally greater flexibility is allowed. Production rules capture (relatively) procedural knowledge in a simple, modular manner. The discussion of production rules should naturally exploit what you have already learnt in the topic: problem solving using search.

In the next few parts we will describe these different knowledge representation languages in more detail. We'll start with structured objects, as these are fairly easy to understand. They are important both historically, and in introducing the basic ideas of class hierarchies and inheritance. Then we'll talk about logic, and then production rules.

Structured Objects

[Note: In this section the notation and terminology may not be the same as in your textbooks. Terms like "instance'', "subclass'' and the particular representation of frames (e.g. use of "*'') will vary across different texts, though the underlying ideas should be the same! Try to stick with the notation used here, but don't view it as THE correct one.]

We will discuss the following:

• Semantic Nets
• Frames

Semantic Nets

The simplest kind of structured object is the semantic net, originally developed in the early 1960s to represent the meaning of English words. A semantic net is really just a graph, where the nodes in the graph represent concepts, and the arcs (or links) represent binary relationships between concepts. The most important relations between concepts are subclass relations between classes and subclasses, and instance relations between particular objects and their parent class. However, any other relations are allowed, such as has-part, colour, etc. The subclass relation (as you might expect) says that one class is a subclass of another, while the instance relation says that some individual belongs to some class. (Some books/approaches use the relation is-a to refer to the subclass relation.) We'll use these relations so that "X subclass Y'' means that X is a subclass of Y, not that X has a subclass Y.

We can then define property inheritance, so that, by default, ICS611 students inherit typical attributes of 1st year MSc. students, and Mutua inherits all the typical attributes of ICS611 students. So Mutua is an instance of the class representing ICS611 students (not all 1st year MSc. students are studying/attending this course now), while the class of ICS611 students is a subclass of the class of 1st year MSc. students. We'll go into this in much more detail later.

So, to represent some knowledge about animals (as AI people so often do) we might have the following network:
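The original notes show this network as a diagram, which is not reproduced here. As a stand-in, a minimal sketch of the same information as subject-link-object triples, with a simple inheritance lookup (the Python encoding is purely illustrative, not part of the original notes):

network = [
    ("mammal",   "subclass", "animal"),
    ("reptile",  "subclass", "animal"),
    ("elephant", "subclass", "mammal"),
    ("mammal",   "has_part", "head"),
    ("elephant", "colour",   "grey"),
    ("elephant", "size",     "large"),
    ("Clyde",    "instance", "elephant"),
    ("Nellie",   "instance", "elephant"),
    ("Nellie",   "likes",    "apples"),
]

def value_of(node, link):
    """Look up a link on a node, inheriting along instance/subclass arcs if necessary."""
    for subj, rel, obj in network:
        if subj == node and rel == link:
            return obj
    for subj, rel, obj in network:
        if subj == node and rel in ("instance", "subclass"):
            return value_of(obj, link)     # inherit from the parent class
    return None

print(value_of("Clyde", "has_part"))   # head  - inherited via elephant and mammal
print(value_of("Nellie", "size"))      # large - inherited from elephant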

This network represents the fact that mammals and reptiles are animals, that mammals have heads, that an elephant is a large grey mammal, that Clyde and Nellie are both elephants, and that Nellie likes apples. The subclass relations define a class hierarchy (in this case a very simple one).

The subclass and instance relations may be used to derive new information which is not explicitly represented. Clyde and Nellie inherit information from their parent classes, so we should be able to conclude that Clyde and Nellie both have a head, and are large and grey. Subclass and instance relations allow us to use inheritance to infer new facts/relations from the explicitly represented ones, and semantic networks normally allow efficient inheritance-based inferences using special purpose algorithms.

When semantic networks became popular in the 1970s there was much discussion about what the nodes and relations really meant. People were using them in subtly different ways, which led to much confusion. For example, a node such as elephant might be used to represent the class of all elephants or just a typical elephant. Saying that an elephant has-part head could mean that every elephant has some particular head, that there exists some elephant that has a head, or (more reasonably in this case) that they all have some object belonging to the class head. Saying that elephants are grey means, on the "class'' reading, that every individual in the set of elephants is grey (so Clyde can't be pink); but if the node is just a typical elephant, then Clyde may have properties different from general elephant properties (such as being pink and not grey). Depending on what interpretation you choose for your nodes and links, different inferences are valid.

The simplest way to interpret the class nodes is as denoting sets of objects, so an elephant node denotes the set of all elephants. Nodes such as Clyde and Nellie denote individuals. The instance relationship can then be defined in terms of set membership (Nellie is a member of the set of all elephants), while the subclass relation can be defined in terms of a subset relation - the set of all elephants is a subset of the set of all mammals. If we interpret networks in this way we have the advantage of a clear, simple semantics, but the disadvantage of a certain lack of flexibility - maybe Clyde is pink!

Semantic nets are fine at representing relationships between two objects - but what if we want to represent a relation between three or more objects? Say we want to represent the fact that "John gives Mary the book''. This might be represented in logic as gives(John, Mary, book2), where book2 represents the particular book we are talking about. However, in semantic networks we have to view the fact as representing a set of binary relationships between a "giving'' event and some objects. Similarly, things that are easy to represent in logic (such as "every dog in town has bitten the constable'') are hard to represent in nets (at least, in a way that has a clear and well-defined interpretation). Techniques were developed to allow such things to be represented, for example by partitioning the net into sections and introducing a special ∀ (for all) relationship. These techniques didn't really catch on, so we won't go into them here, but they can be found in many AI textbooks.

To summarize, nets allow us to simply represent knowledge about an object that can be expressed as binary relations. In the debate about semantic nets, people were concerned about their representational adequacy (i.e., what sort of facts they were capable of representing), and early nets didn't have a very clear semantics - it wasn't clear what the nodes and links really meant. It was difficult to use nets in a fully consistent and meaningful manner, and still use them to represent what you wanted to represent. Techniques evolved to get round this, but they are quite complex, and seem to partly remove the attractive simplicity of the initial idea.
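As a concrete illustration of the reification trick mentioned above (the event and object names give1 and book2 are purely illustrative), the three-place gives fact can be broken into binary links:

# gives(John, Mary, book2) becomes a "giving" event node with three binary links.
giving_event = [
    ("give1", "instance",  "giving"),
    ("give1", "giver",     "John"),
    ("give1", "recipient", "Mary"),
    ("give1", "object",    "book2"),
]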

Frames

Frames are a variant of nets that are one of the most popular ways of representing non-procedural knowledge in an expert system. In a frame, all the information relevant to a particular concept is stored in a single complex entity, called a frame. Superficially, frames look pretty much like record data structures. However frames, at the very least, support inheritance. They are often used to capture knowledge about typical objects or events, such as a typical bird, or a typical restaurant meal.

We could represent some knowledge about elephants in frames as follows:

Mammal:
  subclass: Animal
  warm_blooded: yes

Elephant:
  subclass: Mammal
  * colour: grey
  * size: large

Clyde:
  instance: Elephant
  colour: pink
  owner: Fred

Nellie:
  instance: Elephant
  size: small

A particular frame (such as Elephant) has a number of attributes or slots, such as colour and size, where these slots may be filled with particular values. We have used a "*'' to indicate those attributes that are only true of a typical member of the class, and not necessarily every member. Most frame systems will let you distinguish between typical attribute values and definite values that must be true. [Rich & Knight in fact distinguish between attribute values that are true of the class itself, such as the number of members of that class, and typical attribute values of members.]

In the above frame system we would be able to infer that Nellie is small, grey and warm blooded, and that Clyde is large, pink, warm blooded and owned by Fred. Objects and classes inherit the properties of their parent classes UNLESS they have an individual property value that conflicts with the inherited one (as Clyde's pink colour conflicts with the default grey).

Inheritance is simple where each object/class has a single parent class, and where slots take single values. If objects/classes have several parent classes (e.g. Clyde is both an elephant and a circus-animal), then you may have to decide which parent to inherit from (maybe elephants are by default wild, but circus animals are by default tame). There are various mechanisms for making this choice, based on choosing the most specific parent class to inherit from. If slots may take more than one value it is less clear whether to block inheritance when you have more specific information: if you know that a mammal has_part head, and that an elephant has_part trunk, you may still want to infer that an elephant has a head. It is therefore useful to label slots according to whether they take single values or multiple values.
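A minimal sketch of how such a frame system might be realised (illustrative only - the notes themselves do not give an implementation): frames as dictionaries, with "*'' marking default slots of a typical member, and slot lookup climbing the instance/subclass links:

frames = {
    "Mammal":   {"subclass": "Animal", "warm_blooded": "yes"},
    "Elephant": {"subclass": "Mammal", "*colour": "grey", "*size": "large"},
    "Clyde":    {"instance": "Elephant", "colour": "pink", "owner": "Fred"},
    "Nellie":   {"instance": "Elephant", "size": "small"},
}

def get_slot(frame_name, slot):
    frame = frames[frame_name]
    if slot in frame:                      # the object's own (definite) value wins
        return frame[slot]
    if "*" + slot in frame:                # otherwise a default for a typical member
        return frame["*" + slot]
    parent = frame.get("instance") or frame.get("subclass")
    if parent in frames:
        return get_slot(parent, slot)      # otherwise inherit from the parent frame
    return None

print(get_slot("Clyde", "colour"))         # pink  - own value overrides the default
print(get_slot("Clyde", "size"))           # large - inherited default from Elephant
print(get_slot("Nellie", "warm_blooded"))  # yes   - inherited from Mammal via Elephant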

In general, both slots and slot values may themselves be frames. Allowing slots to be frames means that we can specify various attributes of a slot, giving the slot particular properties by writing a frame-based definition. We might want to say, for example, that the slot size must always take a single value of type size-set (where size-set is the set of all sizes), while the slot owner may take multiple values of type person (Clyde could have more than one owner). We could specify this in the frames:

Size:
  instance: Slot
  single_valued: yes
  range: Size-set

Owner:
  instance: Slot
  single_valued: no
  range: Person

The attribute value Fred (and even large and grey etc.) could be represented as a frame, e.g.:

Fred:
  instance: Person
  occupation: Elephant-breeder

One final useful feature of frame systems is the ability to attach procedures to slots. So, if we don't know the value of a slot, but know how it could be calculated, we can attach a procedure to be used if needed, to compute the value of that slot and put the result in place of the slot's value. Maybe we have slots representing the length and width of an object and sometimes need to know the object's area - we would write a (simple) procedure to calculate it. Such mechanisms of procedural attachment are useful, but perhaps should not be overused, or else our nice frame system would consist mainly of just lots of procedures interacting in an unpredictable fashion.

Frame systems, in all their full glory, are pretty complex and sophisticated things. The main idea to get clear is the notion of inheritance and default values. Most of the other features are developed to support inheritance reasoning in a flexible but principled manner. As we saw for nets, it is easy to get confused about what slots and objects really mean; in frame systems we partly get round this by distinguishing between default and definite values, and by allowing users to make slots `first class citizens' with their own frame-based definitions. More details are available in AI textbooks.

Predicate Logic

The most important knowledge representation language is arguably predicate logic (or strictly, first order predicate logic - there are lots of other logics out there to distinguish between). Predicate logic allows us to represent fairly complex facts about the world, and to derive new facts in a way that guarantees that, if the initial facts were true, then so are the conclusions. It is a well understood formal language, with well-defined syntax, semantics and rules of inference. Here we will discuss the following:

• Review of Propositional Logic
• Predicate Logic: Syntax
• Predicate Logic: Semantics
• Proving Things in Predicate Logic
• Representing Things in Predicate Logic
• Logic and Frames

Review of Propositional Logic

Predicate logic is a development of propositional logic, which should be familiar to you. In propositional logic a fact such as "Alison likes waffles'' would be represented as a simple atomic proposition. Let's call it P. We can build up more complex expressions (sentences) by combining atomic propositions with the logical connectives: if X and Y are sentences in propositional logic, then so are X ∧ Y, X ∨ Y, ¬X, X → Y, and X ↔ Y. So if we had the proposition Q representing the fact "Alison eats waffles'' we could have the facts:

P ∨ Q : "Alison likes waffles or Alison eats waffles''
P ∧ Q : "Alison likes waffles and Alison eats waffles''
¬Q : "Alison doesn't eat waffles''
P → Q : "If Alison likes waffles then Alison eats waffles''

So the following are valid sentences in the logic:

P ∨ ¬Q
P ∧ (P → Q)
(Q ∨ ¬R) → P

Propositions can be true or false in the world. An interpretation function assigns, to each proposition, a truth value (i.e. true or false). This interpretation function says what is true in the world. We can determine the truth value of arbitrary sentences using truth tables, which define the truth values of sentences with logical connectives in terms of the truth values of their component sentences. As sentences can only be true or false, truth tables are very simple, for example:

X   Y   X ∧ Y
T   T     T
T   F     F
F   T     F
F   F     F

The truth tables provide a simple semantics for expressions in propositional logic. In order to infer new facts in a logic we need to apply inference rules, and the semantics of the logic will define which inference rules are universally valid. One useful inference rule is the following (called modus ponens), but many others are possible:

a, a → b
--------
   b

This rule just says that if a → b is true, and a is true, then b is necessarily true. We could prove that this rule is valid using truth tables.
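As a small illustration (not from the notes themselves - the encoding of sentences as nested tuples is an assumption made for the example), the truth-table semantics and the validity check for modus ponens can be mechanised directly:

from itertools import product

# Sentences are nested tuples, e.g. ('->', 'a', 'b') for a -> b, evaluated under an
# interpretation (a dict assigning True/False to each proposition), exactly as a truth table does.
def evaluate(sentence, interpretation):
    if isinstance(sentence, str):                      # atomic proposition
        return interpretation[sentence]
    op, *args = sentence
    if op == 'not':
        return not evaluate(args[0], interpretation)
    x, y = (evaluate(a, interpretation) for a in args)
    if op == 'and':
        return x and y
    if op == 'or':
        return x or y
    if op == '->':
        return (not x) or y
    if op == '<->':
        return x == y
    raise ValueError("unknown connective " + str(op))

def valid(premises, conclusion, propositions):
    """True if every interpretation making all premises true also makes the conclusion true."""
    for values in product([True, False], repeat=len(propositions)):
        interp = dict(zip(propositions, values))
        if all(evaluate(p, interp) for p in premises) and not evaluate(conclusion, interp):
            return False
    return True

# Modus ponens: from a and a -> b, conclude b.
print(valid(['a', ('->', 'a', 'b')], 'b', ['a', 'b']))   # True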

Predicate Logic: Syntax

The trouble with propositional logic is that it is not possible to write general statements in it, such as "Alison eats everything that she likes''. We'd have to have lots of rules, one for every different thing that Alison liked. Predicate logic makes such general statements possible.

Sentences in predicate calculus are built up from atomic sentences (not to be confused with Prolog atoms). Atomic sentences consist of a predicate name followed by a number of arguments. These arguments may be any term. Terms may be:

• Constant symbols such as "Alison".
• Variable symbols such as "X". For consistency with Prolog we'll use capital letters to denote variables.
• Function expressions such as "father(Alison)". Function expressions consist of a functor followed by a number of arguments, which can be arbitrary terms.

So, atomic sentences in predicate logic include the following:

• friends(Alison, Richard)
• friends(father(Fred), father(Joe))
• likes(X, Richard)

Sentences in predicate logic are constructed (much as in propositional logic) by combining atomic sentences with logical connectives, so the following are all sentences in predicate calculus:

• friends(Alison, Richard) ∨ likes(Alison, Richard)
• likes(Alison, Richard) ∨ likes(Alison, Waffles)
• ((likes(Alison, Richard) ∨ likes(Alison, Waffles)) ∧ ¬likes(Alison, Waffles)) → likes(Alison, Richard)

Sentences can also be formed using quantifiers to indicate how any variables in the sentence are to be treated. The two quantifiers in predicate logic are ∀ and ∃. ∀X S means that for every object X in the domain, S is true. ∃X S means that for some object X in the domain, S is true. So the following are valid sentences:

• ∃X bird(X) ∧ ¬flies(X) i.e., there exists some bird that doesn't fly.
• ∀X (person(X) → ∃Y loves(X, Y)) i.e., every person has something that they love.

A sentence should have all its variables quantified. So strictly, an expression like "∀X loves(X, Y)'', though a well formed formula of predicate logic, is not a sentence. Formulae with all their variables quantified are also called closed formulae. This should all seem familiar from our description of Prolog syntax. However, although Prolog is based on predicate logic, the way we represent things is slightly different, so the two should not be confused.

Predicate Logic: Semantics

The semantics of predicate logic is defined (as in propositional logic) in terms of the truth values of sentences. Like in propositional logic, we can determine the truth value of any sentence in predicate calculus if we know the truth values of the basic components of that sentence. An interpretation function defines the basic meanings/truth values of the basic components, given some domain of objects that we are concerned with.

In propositional logic we saw that this interpretation function was very simple, just assigning truth values to propositions. However, in predicate calculus we have to deal with predicates, variables and quantifiers, so things get much more complex. Predicates are dealt with in the following way. If we have, say, a predicate P with 2 arguments, then the meaning of that predicate is defined in terms of a mapping from all possible pairs of objects in the domain to a truth value. So, suppose we have a domain with just three objects in it: Fred, Jim and Joe. We can define the meaning of the predicate father in terms of all the pairs of objects for which the father relationship is true - say just Fred and Jim.

The meaning of ∀ and ∃ is defined again in terms of the set of objects in the domain. So, given our world (domain) of three objects (Fred, Jim, Joe), ∀X father(Fred, X) would only be true if father(Fred, X) was true for each object. In our interpretation of the father relation this only holds for X=Jim, so the whole quantified expression will be false in this interpretation.

This only gives a flavour of how we can give a semantics to expressions in predicate logic. The important thing is that everything is very precisely defined, so if we use predicate logic we should know exactly where we are and what inferences are valid.
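A minimal sketch of this model-theoretic idea for the Fred/Jim/Joe domain (the Python encoding of the interpretation is illustrative only):

# An interpretation is a domain of objects plus a table saying which tuples each
# predicate holds for. Quantifiers are then just loops over the domain, as described above.
domain = ["Fred", "Jim", "Joe"]
interpretation = {
    "father": {("Fred", "Jim")},   # father(Fred, Jim) is the only true instance
}

def holds(predicate, *args):
    return tuple(args) in interpretation[predicate]

def forall(condition):
    return all(condition(x) for x in domain)

def exists(condition):
    return any(condition(x) for x in domain)

# "Fred is the father of everything in the domain" - false (fails for Fred and Joe).
print(forall(lambda x: holds("father", "Fred", x)))   # False

# "Fred is the father of something" - true (holds for Jim).
print(exists(lambda x: holds("father", "Fred", x)))   # True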

Proving Things in Predicate Logic

To prove things in predicate calculus we need two things. First we need to know what inference rules are valid - we can't keep going back to the formal semantics when trying to draw a simple inference! Given the semantics of the logic, it is possible to show that particular inference rules are sound: if the premises are true then the conclusions are guaranteed true. Second we need to know a good proof procedure that will allow us to prove things with the inference rules in an efficient manner.

When discussing propositional logic we noted that a much used inference rule was modus ponens:

A, A → B
--------
   B

This rule is a sound rule of inference for predicate logic too. In predicate logic we need to consider how to apply such rules when the expressions involved have variables. For example, we would like to be able to use the facts ∀X (man(X) → mortal(X)) and man(Socrates) and conclude mortal(Socrates). To do this we can use modus ponens, but allow universally quantified sentences to be matched with other sentences (like in Prolog). So, if we have a sentence ∀X A → B and a sentence C, then if A and C can be matched or unified we can apply modus ponens. Other sound inference rules include modus tollens (if A → B is true and B is false then conclude ¬A), and-elimination (if A ∧ B is true then conclude both that A is true and that B is true), and lots more.

The most well known general proof procedure for predicate calculus is resolution. Resolution is a sound proof procedure for proving things by refutation - if you can derive a contradiction from ¬P then P must be true. In resolution theorem proving, all statements in the logic are transformed into a normal form involving disjunctions of atomic expressions or negated atomic expressions (e.g. ¬dog(X) ∨ animal(X)). This allows new expressions to be deduced using a single inference rule: if we have an expression A1 ∨ A2 ∨ ... ∨ An ∨ ¬C and an expression B1 ∨ B2 ∨ ... ∨ Bm ∨ C, then we can deduce a new expression A1 ∨ A2 ∨ ... ∨ An ∨ B1 ∨ B2 ∨ ... ∨ Bm. This single inference rule can be applied in a systematic proof procedure. This is all described in tedious detail in [Rich & Knight, pgs 143-160].

Resolution is a sound proof procedure, so if we prove something using it we can be sure it is a valid conclusion. However, a proof procedure may not be complete (i.e., we may not always be able to prove something is true even if it is true) or decidable (the procedure may never halt when trying to prove something that is false). Variants of resolution may be complete, but no proof procedure based on predicate logic is decidable. And of course, there are many other things to worry about when looking at a proof procedure: it may eventually prove something, but take such a long time that it is just not usable, so computational efficiency matters too. The efficiency of a proof will often depend as much on how you formulate your problem as on the general proof procedure used. The details are best left to logicians, but it is still an important issue to bear in mind.
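A minimal sketch of the matching (unification) step used in the Socrates example above - the representation of rules and facts as tuples is an assumption made purely for illustration:

# Variables are strings starting with an upper-case letter, as in Prolog; facts and
# rule parts are (predicate, argument, ...) tuples.
def is_var(term):
    return isinstance(term, str) and term[0].isupper()

def unify(pattern, fact, bindings):
    """Try to match an atomic pattern such as ('man', 'X') against a fact such as ('man', 'socrates')."""
    if len(pattern) != len(fact) or pattern[0] != fact[0]:
        return None
    bindings = dict(bindings)
    for p, f in zip(pattern[1:], fact[1:]):
        if is_var(p):
            if p in bindings and bindings[p] != f:
                return None
            bindings[p] = f
        elif p != f:
            return None
    return bindings

def substitute(pattern, bindings):
    return (pattern[0],) + tuple(bindings.get(t, t) for t in pattern[1:])

# forall X: man(X) -> mortal(X)
rule = {"if": ("man", "X"), "then": ("mortal", "X")}
fact = ("man", "socrates")

bindings = unify(rule["if"], fact, {})
if bindings is not None:
    print(substitute(rule["then"], bindings))   # ('mortal', 'socrates')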
Representing Things in Predicate Logic

Your average AI programmer/researcher may not need to know the details of predicate logic semantics or proof theory, but they do need to know how to represent things in predicate logic. Formally we've already gone through what expressions mean, but it may make more sense to give a whole bunch of examples. This section gives a list of logical expressions paired with English descriptions, then some unpaired logical or English expressions - you should try and work out for yourself how to represent the English expressions in logic, and what the logic expressions mean in English.

• ∃X table(X) ∧ ¬numberoflegs(X, 4)
  "There is some table that doesn't have 4 legs''
• ∀X (macintosh(X) → ¬realcomputer(X))
  "No macintosh is a real computer'' or "If something is a macintosh then it's not a real computer''
• ∀X glaswegian(X) → (supports(X, rangers) ∨ supports(X, celtic))
  "All Glaswegians support either Celtic or Rangers''
• ∃X small(X) ∧ on(X, table)
  "There is something small on the table''

Try out the following:

• "All elephants are grey''
• "Every apple is either green or yellow''
• "There is some student who is intelligent''
• ∀X red(X) ∧ on(X, table) → small(X)
• ¬∃X grapes(X) ∧ tasty(X)

[Note: When asked to translate English statements into predicate logic you should NOT use set expressions. The following expression is wrong: ∀X : X ∈ carrots : orange(X).]

Logic and Frames

Representation languages such as frames often have their semantics defined in terms of predicate (or other) logics. For example, we could decide that if an object elephant has a definite slot colour with value grey then this means that:

∀X elephant(X) → ∃Y colour(X, Y) ∧ grey(Y)

(and similarly for any other definite slot). If we have slots that take default values then we will need a more powerful logic, such as a default logic, to represent their meaning. Once we have defined precisely what all the expressions and relations mean in terms of a well understood logic, we can make sure that any inferences that are drawn are sound according to that logic. Of course, we have to understand the semantics of the language to be able to represent things meaningfully in it.

Using a representation language with a logic-based semantics has the advantage that we can deal (on the surface) with a simple, natural representation language such as frames, while underneath we can be quietly confident that the inferences drawn by the system are all sound. Another possible "advantage'' of this approach is that something like a frame system typically has restricted representational power compared with full predicate (or default) logic. This may sound like a disadvantage, as it will mean there are some things we can't represent, but the gain in efficiency you get by reasoning with a restricted subset usually makes this tradeoff worthwhile.

So, you can choose between a logic with (fairly) great expressive power but rather inefficient (and undecidable) inference and proof procedures, or a logic with slightly weaker representational power which you can reason with efficiently. Or you can use something like a frame system, which may (or may not) have a well-defined semantics, and which uses special purpose inference procedures to perform class related deductions (such as inheritance of slot values from parent classes); this may be less awkward than dealing directly with the logic. In fact, new logics (called terminological logics) have been developed which have the expressive power needed to perform inheritance-type inferences on simple properties of classes of objects (as in frame systems), but which do not allow some of the things (deductions etc.) possible in predicate logic. Terminological logics have more restricted expressive power than predicate logic, but greater efficiency, and they allow you to reason directly in the logic, rather than using the special inferences of a frame system, which are only indirectly validated by a logical semantics.

Rule-Based Systems

Instead of representing knowledge in a relatively declarative, static way (as a bunch of things that are true), rule-based systems represent knowledge in terms of a bunch of rules that tell you what you should do, or what you could conclude, in different situations. A rule-based system consists of a bunch of IF-THEN rules, a bunch of facts, and some interpreter controlling the application of the rules, given the facts. [Note: Previously the term production system was used to refer to rule-based systems, and some books will use this term. However, it is a non-intuitive term so we will avoid it.]

There are two broad kinds of rule system: forward chaining systems, and backward chaining systems. In a forward chaining system you start with the initial facts, and keep using the rules to draw new conclusions (or take certain actions) given those facts. In a backward chaining system you start with some hypothesis (or goal) you are trying to prove, and keep looking for rules that would allow you to conclude that hypothesis, perhaps setting new subgoals to prove as you go. Forward chaining systems are primarily data-driven, while backward chaining systems are goal-driven. We'll look at both, and at when each might be useful.

Here we will look at:

• Forward Chaining Systems
• Backward Chaining Systems
• Forwards vs Backwards Reasoning
• Uncertainty in Rules

Forward Chaining Systems

In a forward chaining system the facts in the system are represented in a working memory which is continually updated. Rules in the system represent possible actions to take when specified conditions hold on items in the working memory - they are sometimes called condition-action rules. The conditions are usually patterns that must match items in the working memory, while the actions usually involve adding or deleting items from the working memory.

The interpreter controls the application of the rules, given the working memory, thus controlling the system's activity. It is based on a cycle of activity sometimes known as a recognise-act cycle. The system first checks to find all the rules whose conditions hold, given the current state of working memory. It then selects one and performs the actions in the action part of the rule. (The selection of a rule to fire is based on fixed strategies, known as conflict resolution strategies.) The actions will result in a new working memory, and the cycle begins again. This cycle will be repeated until either no rules fire, or some specified goal state is satisfied.

Rule-based systems vary greatly in their details and syntax, so the following examples are only illustrative. First we'll look at a very simple set of rules:

1. IF (lecturing X) AND (marking-practicals X) THEN ADD (overworked X)
2. IF (month february) THEN ADD (lecturing alison)
3. IF (month february) THEN ADD (marking-practicals alison)
4. IF (overworked X) OR (slept-badly X) THEN ADD (bad-mood X)
5. IF (bad-mood X) THEN DELETE (happy X)
6. IF (lecturing X) THEN DELETE (researching X)

Here we use capital letters to indicate variables. In other representations variables may be indicated in different ways, such as by a ? or a ^ (e.g. ?person, ^person).

Let us assume that initially we have a working memory with the following elements:

(month february)
(happy alison)
(researching alison)

Our system will first go through all the rules checking which ones apply given the current working memory. Rules 2 and 3 both apply, so the system has to choose between them, using its conflict resolution strategies. Let us say that rule 2 is chosen, so (lecturing alison) is added to the working memory, which is now:

(lecturing alison)
(month february)
(happy alison)
(researching alison)

Now the cycle begins again. This time rule 3 and rule 6 have their preconditions satisfied. Let's say rule 3 is chosen and fires, so (marking-practicals alison) is added to the working memory. On the third cycle rule 1 fires, with X bound to alison, so (overworked alison) is added to working memory, which is now:

(overworked alison)
(marking-practicals alison)
(lecturing alison)
(month february)
(happy alison)
(researching alison)

Now rules 4 and 6 can apply. Suppose rule 4 fires, and (bad-mood alison) is added to the working memory. In the next cycle rule 5 is chosen and fires, and (happy alison) is removed from the working memory. Finally, rule 6 will fire, and (researching alison) will be removed from working memory, to leave:

(bad-mood alison)
(overworked alison)
(marking-practicals alison)
(lecturing alison)
(month february)

The order that rules fire may be crucial, especially when rules may result in items being deleted from working memory. (Systems which allow items to be deleted are known as nonmonotonic.) For example, suppose we have the following further rule in the rule set:

7. IF (happy X) THEN (gives-high-marks X)

If this rule fires BEFORE (happy alison) is removed from working memory then the system will conclude that I'll give high marks. However, if rule 5 fires first then rule 7 will no longer apply. Of course, if we fire rule 7 and then later remove its preconditions, it would be nice if its conclusions could then be automatically removed from working memory. Special systems called truth maintenance systems have been developed to allow this.

A number of conflict resolution strategies are typically used to decide which rule to fire. These include:

• Don't fire a rule twice on the same data. (We don't want to keep on adding (lecturing alison) to working memory.)
• Fire rules on more recent working memory elements before older ones. This allows the system to follow through a single chain of reasoning, rather than keeping on drawing new conclusions from old data.
• Fire rules with more specific preconditions before ones with more general preconditions. This allows us to deal with non-standard cases. If, for example, we have a rule "IF (bird X) THEN ADD (flies X)'' and another rule "IF (bird X) AND (penguin X) THEN ADD (swims X)'' and a penguin called tweety, then we would fire the second rule first and start to draw conclusions from the fact that tweety swims.

These strategies may help in getting reasonable behaviour from a forward chaining system, but the most important thing is how we write the rules. They should be carefully constructed, with the preconditions specifying as precisely as possible when different rules should fire. Otherwise we will have little idea or control of what will happen. Sometimes special working memory elements are used to help to control the behaviour of the system. For example, we might decide that there are certain basic stages of processing in doing some task, and that certain rules should only be fired at a given stage - we could have a special working memory element (stage 1) and add (stage 1) to the preconditions of all the relevant rules, removing the working memory element when that stage was complete.
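A minimal sketch of the recognise-act cycle for the rules above (illustrative only; to keep it short, the variable X is pre-bound to alison, and conflict resolution is simply "don't fire a rule twice, then take the first applicable rule"):

rules = [
    # (name, condition on working memory, facts to ADD, facts to DELETE)
    ("R1", lambda wm: ("lecturing", "alison") in wm and ("marking-practicals", "alison") in wm,
     {("overworked", "alison")}, set()),
    ("R2", lambda wm: ("month", "february") in wm, {("lecturing", "alison")}, set()),
    ("R3", lambda wm: ("month", "february") in wm, {("marking-practicals", "alison")}, set()),
    ("R4", lambda wm: ("overworked", "alison") in wm or ("slept-badly", "alison") in wm,
     {("bad-mood", "alison")}, set()),
    ("R5", lambda wm: ("bad-mood", "alison") in wm, set(), {("happy", "alison")}),
    ("R6", lambda wm: ("lecturing", "alison") in wm, set(), {("researching", "alison")}),
]

working_memory = {("month", "february"), ("happy", "alison"), ("researching", "alison")}
already_fired = set()   # crude conflict resolution: never fire the same rule twice

while True:
    applicable = [r for r in rules if r[0] not in already_fired and r[1](working_memory)]
    if not applicable:
        break
    name, _, to_add, to_delete = applicable[0]   # pick the first applicable rule
    working_memory = (working_memory | to_add) - to_delete
    already_fired.add(name)
    print(name, "fired ->", sorted(working_memory))

The final working memory matches the one reached in the walkthrough above, although the rules fire in a slightly different order because of the simplistic rule-selection strategy.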

Backward Chaining Systems

[Rich & Knight, 6.3] So far we have looked at how rule-based systems can be used to draw new conclusions from existing data, adding these conclusions to a working memory. This approach is most useful when you know all the initial facts, but don't have much idea what the conclusion might be. If you DO know what the conclusion might be, or have some specific hypothesis to test, forward chaining systems may be inefficient. You COULD keep on forward chaining until no more rules apply or you have added your hypothesis to the working memory, but in the process the system is likely to do a lot of irrelevant work, adding uninteresting conclusions to working memory. For example, suppose we are interested in whether Alison is in a bad mood. We could repeatedly fire rules, updating the working memory, checking each time whether (bad-mood alison) is in the new working memory. But maybe we had a whole batch of rules for drawing conclusions about what happens when I'm lecturing, or what happens in February - we really don't care about this, so we would rather only have to draw the conclusions that are relevant to the goal.

This can be done by backward chaining from the goal state (or from some hypothesized state that we are interested in). This is essentially what Prolog does, so it should be fairly familiar to you by now. Given a goal state to try and prove (e.g. (bad-mood alison)), the system will first check to see if the goal matches the initial facts given. If it does, then that goal succeeds. If it doesn't, the system will look for rules whose conclusions (previously referred to as actions) match the goal. One such rule will be chosen, and the system will then try to prove any facts in the preconditions of the rule using the same procedure, setting these as new goals to prove. Note that a backward chaining system does NOT need to update a working memory. Instead it needs to keep track of what goals it needs to prove in order to prove its main hypothesis.

In principle we can use the same set of rules for both forward and backward chaining. However, in practice we may choose to write the rules slightly differently if we are going to be using them for backward chaining. In backward chaining we are concerned with matching the conclusion of a rule against some goal that we are trying to prove. So the 'then' part of the rule is usually not expressed as an action to take (e.g. add/delete), but as a state which will be true if the premises are true. So, suppose we have the following rules:

7. IF (lecturing X) AND (marking-practicals X) THEN (overworked X)
8. IF (month february) THEN (lecturing alison)
9. IF (month february) THEN (marking-practicals alison)
10. IF (overworked X) THEN (bad-mood X)
11. IF (slept-badly X) THEN (bad-mood X)
12. IF (month february) THEN (weather cold)
13. IF (year 1993) THEN (economy bad)

and initial facts:

(month february)
(year 1993)

and we're trying to prove:

(bad-mood alison)

First we check whether the goal state is in the initial facts. As it isn't there, we try matching it against the conclusions of the rules. It matches rules 10 and 11. Let us assume that rule 10 is chosen first - it will try to prove (overworked alison). Rule 7 can then be used, and the system will try to prove (lecturing alison) and (marking-practicals alison). Trying to prove the first goal, it will match rule 8 and try to prove (month february). This is in the set of initial facts. We still have to prove (marking-practicals alison); rule 9 can be used, again requiring (month february), which is in the initial facts. So (overworked alison) succeeds, and we have proved the original goal (bad-mood alison).

One way of implementing this basic mechanism is to use a stack of goals still to satisfy. You should repeatedly pop a goal off the stack, and try to prove it. If it's in the set of initial facts then it's proved. If it matches a rule which has a set of preconditions then the goals in the precondition are pushed onto the stack. Of course, this doesn't tell us what to do when there are several rules which may be used to prove a goal. If we were using Prolog to implement this kind of algorithm we might rely on its backtracking mechanism - it'll try one rule, and if that results in failure it will go back and try the other. However, if we use a programming language without a built-in search procedure we need to decide explicitly what to do. One good approach is to use an agenda, where each item on the agenda represents one alternative path in the search for a solution. The system should try "expanding'' each item on the agenda, systematically trying all possibilities until it finds a solution (or fails to). The particular method used for selecting items off the agenda determines the search strategy - in other words, it determines how you decide on which options to try, and in what order, when solving your problem. We'll go into this in much more detail in the section on search.
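A minimal sketch of this backward chaining procedure for rules 7-13 above (illustrative only; the variable X is again pre-bound to alison, and rule choice is simply "try the rules in order"):

# Rules map a conclusion to the list of subgoals that would establish it.
rules = [
    (("overworked", "alison"), [("lecturing", "alison"), ("marking-practicals", "alison")]),  # rule 7
    (("lecturing", "alison"), [("month", "february")]),                                       # rule 8
    (("marking-practicals", "alison"), [("month", "february")]),                              # rule 9
    (("bad-mood", "alison"), [("overworked", "alison")]),                                     # rule 10
    (("bad-mood", "alison"), [("slept-badly", "alison")]),                                    # rule 11
    (("weather", "cold"), [("month", "february")]),                                           # rule 12
    (("economy", "bad"), [("year", "1993")]),                                                 # rule 13
]

facts = {("month", "february"), ("year", "1993")}

def prove(goal):
    """Succeed if the goal is a known fact, otherwise try each rule whose conclusion
    matches the goal and recursively prove all of its preconditions."""
    if goal in facts:
        return True
    for conclusion, preconditions in rules:
        if conclusion == goal and all(prove(sub) for sub in preconditions):
            return True
    return False

print(prove(("bad-mood", "alison")))   # True - via rules 10, 7, 8 and 9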
Forwards vs Backwards Reasoning

Whether you use forward or backwards reasoning to solve a problem depends on the properties of your rule set and initial facts. Sometimes, if you have some particular goal (to test some hypothesis), then backward chaining will be much more efficient, as you avoid drawing conclusions from irrelevant facts. However, sometimes backward chaining can be very wasteful - there may be many possible ways of trying to prove something, and you may have to try almost all of them before you find one that works. Forward chaining may be better if you have lots of things you want to prove (or if you just want to find out in general what new facts are true), when you have a small set of initial facts, and when there tend to be lots of different rules which allow you to draw the same conclusion. Backward chaining may be better if you are trying to prove a single fact, given a large set of initial facts, where, if you used forward chaining, lots of rules would be eligible to fire in any cycle. The ideas of forward and backward reasoning also apply to logic-based approaches to knowledge representation and inference.

Uncertainty in Rules

So far we have assumed that if the preconditions of a rule hold, then the conclusion will certainly hold; most of our rules have looked pretty much like logical implications. However, in practice you rarely conclude things with absolute certainty. Usually we want to say things like "If Alison is tired then there's quite a good chance that she'll be in a bad mood''. To allow for this sort of reasoning in rule-based systems we often add certainty values to a rule, and attach certainties to any new conclusions. We might conclude that Alison is probably in a bad mood (maybe with certainty 0.6). The approaches used are generally loosely based on probability theory, but are much less rigorous, aiming just for a good guess rather than precise probabilities. We'll talk about this more in a later lecture.
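The notes do not commit to any particular scheme for combining certainties; as one illustration, a MYCIN-style certainty-factor calculation might look like this (all numbers are made up):

fact_cf = {("tired", "alison"): 0.9}          # how certain we are of each fact
rule = {"if": [("tired", "alison")],
        "then": ("bad-mood", "alison"),
        "cf": 0.6}                            # the rule itself is only fairly reliable

# Certainty of the conclusion = certainty of the weakest premise * certainty of the rule.
premise_cf = min(fact_cf[p] for p in rule["if"])
conclusion_cf = premise_cf * rule["cf"]

print(rule["then"], round(conclusion_cf, 2))   # ('bad-mood', 'alison') 0.54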

Advantages and Disadvantages of KR Languages

So far we have discussed three approaches to knowledge representation and inference: structured objects, logic, and rules. Whatever "language'' you represent the knowledge in, certain things are important. First, knowledge should be represented at the right level of abstraction - you want to be able to write a few general purpose facts/rules, not a whole lot of very specific ones. Next, it is helpful to write things in a way which allows new facts/rules to be added without radically changing the behaviour of the whole system. Preconditions of rules, for example, should be specified sufficiently precisely so that they won't inappropriately fire when new facts are added. In all the approaches it should be possible to add new facts (and rules) in a simple, incremental fashion. AI programs typically have a distinct knowledge base, capturing the rules and facts of the domain in question, to which new facts can be added as needed. Separate problem solving procedures may then access and possibly update that knowledge. This is an important advantage compared with just writing a Pascal program which implicitly captures the knowledge.

Structured objects are useful for representing declarative information about collections of related objects/concepts, in particular where there is a clear class hierarchy, and where you want to use inheritance to infer the attributes of objects in subclasses from the attributes of objects in the parent class. Structured objects are no good if you want to draw a wide range of different sorts of inferences. For this you could use a logic-based approach, or you could use a rule-based system, maybe using IF-THEN rules.

Logic-based approaches allow you to represent fairly complex things (involving quantification etc.), and have a well-defined syntax, semantics and proof theory. However, no hacking is allowed in logic! If you can't represent something in your logic then that's too bad. General purpose theorem provers may also be very inefficient, especially once you get to more powerful logics. While logic is primarily used in a declarative way, saying what's true in the world, rule-based systems (especially forward chaining systems) are concerned more with procedural knowledge - what to do when. Rule-based systems tend to allow only relatively simple representations of the underlying facts in the domain, but may be more flexible, and often allow certainty values to be associated with rules and facts. Early approaches tended to have poorly specified semantics, but there are now some practical systems with a clear underlying semantics.

Lots more issues are discussed in [Rich & Knight, 4.3].

References: [Rich & Knight] [Russell & Norvig]