Course V231
Department of Computing
Imperial College
© Simon Colton
General notions of AI
± Characterisations, autonomous agents, search,
± Representations, game playing to demonstrate these notions
Automated reasoning
± Predicate logic, automating deduction,
± Resolution theorem proving, constraint solving
Machine learning
± Overview (FIND-S), decision trees,
± Artificial Neural Networks, Inductive Logic Programming
Evolutionary approaches
± Genetic algorithms, genetic programming

Knowledge:
± Definitions of rational, autonomous, agents
± What we have to worry about internally (agent)
Environmental knowledge, utility functions, goals, etc.
Software engineering considerations
± What we have to worry about externally (environment)
Accessibility, determinism, episodes, etc.
Understanding
± Why we use agents as a concept in AI
± Why we worry about rationality and autonomy
Knowledge
± How to specify a problem as a search problem
Initial state, operators, goal test
± What the general problems are in search applications
What are you looking for (path or artefact)?
Completeness/soundness, time/space tradeoffs, background
± Types of uninformed search
Depth first, breadth first, iterative deepening, bidirectional
± Difference between path cost and heuristic functions
± Types of heuristic search
Uniform path cost, greedy, A*, IDA*
Admissible heuristics, comparing heuristics with effective branching
± Hill climbing searches
Local max/min problems, random restart, simulated annealing
Understanding
± Agenda analogy and graph analogy
± Why we have to search for an answer to problems (+coursework)
± Why we need different types of search
Uninformed searches, heuristic searches
± Why completeness & soundness are important notions
± Why heuristics are needed, what they are in general
Abilities
± Specify a problem as a search problem
± Simulate a specific kind of search
E.g., which nodes are expanded next (draw graph, etc.)
± Calculate effective branching rates, compare heuristics
Calculate search space sizes, calculate heuristic measures
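The agenda analogy can be sketched in a few lines. This is only an illustration under assumptions: the toy graph and the names `agenda_search` and `GRAPH` are hypothetical, not taken from the course. The point is that depth first and breadth first differ only in which end of the agenda the next node is taken from.

```python
from collections import deque

def agenda_search(start, is_goal, successors, depth_first=False):
    """Return the order in which nodes are expanded, stopping at the first goal."""
    agenda = deque([start])
    expanded = []
    seen = {start}
    while agenda:
        # Depth first takes the newest node; breadth first takes the oldest.
        node = agenda.pop() if depth_first else agenda.popleft()
        expanded.append(node)
        if is_goal(node):
            return expanded
        children = list(successors(node))
        if depth_first:
            children.reverse()  # so the leftmost child is expanded first
        for child in children:
            if child not in seen:
                seen.add(child)
                agenda.append(child)
    return expanded

# Hypothetical toy graph, used only for illustration.
GRAPH = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F"], "D": [], "E": [], "F": []}

bfs_order = agenda_search("A", lambda n: n == "F", lambda n: GRAPH[n])
dfs_order = agenda_search("A", lambda n: n == "F", lambda n: GRAPH[n], depth_first=True)
```

Drawing the graph and writing the agenda after each step by hand should give the same expansion orders.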
Knowledge
± General types of representation available
Logical, graphical, production rules, frames
± Logical representations available
Propositional, predicate, higher order, fuzzy, etc.
Understanding
± Why different representations are required
± Limitations of each representation
± Expressiveness in logical representations
Abilities
± Represent information
Logically, as semantic networks, in a frame based way, etc.
Knowledge
± What a zero-sum two-player game is
± What the minimax principle is in general, what cutoff search is
± Evaluation functions, weighted linear functions
± Alpha-beta pruning
± Expectimax, chance nodes
Understanding
± Why using minimax strategies is rational
Abilities
± Write down entire search trees for simple games
± Propagate scores from the bottom to top of the trees
± Work out the next move for a player, including expectimax
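Propagating scores from the bottom to the top of a tree can be sketched as below. The nested-list tree and the names `minimax` and `TREE` are assumptions made for illustration; leaf scores are taken to be from the maximising player's point of view.

```python
def minimax(node, maximising=True):
    """Propagate leaf scores up a game tree given as nested lists."""
    if isinstance(node, (int, float)):
        return node  # leaf: score for the maximising player
    scores = [minimax(child, not maximising) for child in node]
    return max(scores) if maximising else min(scores)

# Hypothetical two-ply tree: three moves for MAX, each with three replies for MIN.
TREE = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]

best_value = minimax(TREE)
# The next move for MAX is the child whose propagated score is highest.
best_move = max(range(len(TREE)), key=lambda i: minimax(TREE[i], maximising=False))
```

For expectimax, a chance node would take the probability-weighted average of its children instead of the max or min.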
Knowledge
± Syntax and semantics of first order predicate logic
Sentences, connectives, constants, predicates, variables,
functions, quantifiers, etc.
± What quantifiers mean, their scope, etc.
± Instantiation, ground variables, etc.
± Horn clauses, logic programs, body, head
± How search is undertaken in Prolog, LIPS, WAM
± OR-parallelism, AND-parallelism
Understanding
± Differences between propositional and predicate logic
Benefits of predicate logic (expressivity)
± What the terms syntax and semantics mean
± Prolog is a declarative, not procedural language
± How logic-based expert systems work in general
Abilities
± Translate English to first order predicate logic
± Translate first order predicate logic to English
(Without making mistakes either way!)
± Simulate a Prolog-style search
± Identify parts of logic sentences and logic databases
e.g., quantifiers, constants, head, body, literal, Horn clauses
Knowledge
± That and, or are commutative, distributive
± Some commonly used propositional equivalence rules
Double negation, rules [¦1], [¦], [¦], De Morgan's law
± Some commonly used implication rules
Unit resolution, and elimination, or introduction,
Existential elimination, etc.
± I tend to supply inference rules in exams
Not simple/common ones
± Different ways of chaining together inference steps
Forward/backward chaining, proof by contradiction
Understanding
± What it means for two sentences to be logically equivalent
± What it means for a sentence to be false
± What it means for one sentence to entail another
± How rewrite rules can be used for proving equivalences
Abilities
± Use truth tables to
Show equivalences, tautologies, that a statement is false
That one statement implies another
± Apply inference rules
Show what's above and below the line
± Translate sentences:
Be fluent in rewriting sentences
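Truth-table checks are mechanical, so a short sketch may help when practising by hand. The helper names (`truth_table`, `equivalent`, `is_tautology`, `implies`) are hypothetical, and sentences are written as Python functions of boolean arguments.

```python
from itertools import product

def truth_table(sentence, n_vars):
    """Truth value of `sentence` for every assignment to n_vars propositions."""
    return [sentence(*values) for values in product([True, False], repeat=n_vars)]

def equivalent(s1, s2, n_vars):
    # Two sentences are equivalent if their truth-table columns are identical.
    return truth_table(s1, n_vars) == truth_table(s2, n_vars)

def is_tautology(sentence, n_vars):
    return all(truth_table(sentence, n_vars))

def implies(p, q):
    return (not p) or q

# De Morgan's law: not (P and Q) is equivalent to (not P) or (not Q).
de_morgan = equivalent(lambda p, q: not (p and q),
                       lambda p, q: (not p) or (not q), 2)
# And-elimination: (P and Q) -> P is true in every row.
and_elim = is_tautology(lambda p, q: implies(p and q, p), 2)
# (P or Q) -> P fails in the row where P is false and Q is true.
or_to_p = is_tautology(lambda p, q: implies(p or q, p), 2)
```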
Knowledge
± What conjunctive normal form is
± What a substitution is, what unification does
± Overview of the unification algorithm
± The resolution rule
Unit resolution, full resolution, generalised resolution
Understanding
± That resolution is refutation-complete
± Why we need conjunctive normal form/unification
± Why the occurs-check is important in unification
Abilities
± Translate something into conjunctive normal form
By using equivalence rules
Organising quantifiers, standardising variables, etc.
Existential elimination
± Put a constant in place of an existential variable
± Not using full skolemisation (we skipped over that)
± Prepare a set of sentences for use in a resolution proof
Needs all sentences as single clauses (just split them)
± Find a unifying set of substitutions
Apply them to unify two sentences
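Finding a unifying set of substitutions can be sketched as follows. This is a minimal version under an assumed term representation, not necessarily the algorithm as presented in the lectures: variables are strings starting with "?", compound terms are tuples whose first element is the predicate or function name, and anything else is a constant.

```python
def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def substitute(t, theta):
    """Apply the substitution theta to the term t."""
    if is_var(t):
        return substitute(theta[t], theta) if t in theta else t
    if isinstance(t, tuple):
        return tuple(substitute(a, theta) for a in t)
    return t

def occurs(var, t, theta):
    """Occurs-check: does var appear inside t (after substitution)?"""
    t = substitute(t, theta)
    if t == var:
        return True
    return isinstance(t, tuple) and any(occurs(var, a, theta) for a in t)

def unify(x, y, theta=None):
    """Return a unifying substitution as a dict, or None if there is none."""
    theta = {} if theta is None else theta
    x, y = substitute(x, theta), substitute(y, theta)
    if x == y:
        return theta
    if is_var(x):
        return None if occurs(x, y, theta) else {**theta, x: y}
    if is_var(y):
        return unify(y, x, theta)
    if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
        for a, b in zip(x, y):
            theta = unify(a, b, theta)
            if theta is None:
                return None
        return theta
    return None

s1 = ("knows", "john", "?x")
s2 = ("knows", "?y", ("mother", "?y"))
theta = unify(s1, s2)
```

Without the occurs-check, `unify("?x", ("f", "?x"))` would bind a variable to a term containing itself; here it returns `None`.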
Knowledge
± Specifying a problem as axioms and theorem
± As a search problem: operators, initial states (CNF), the goal test
± Dealing with equality (demodulation)
± Heuristic strategies:
Unit preference, set of support, input resolution, subsumption
± Overview of some other topics
Higher order proving, interactive, etc.
Understanding
± Why deriving the empty clause means a contradiction
± Why we negate the theorem statement
± Why proof by contradiction is valid
± Know that resolution has been applied to mathematics
Abilities:
± Prove a theorem using the resolution method
Remember to negate theorem statement
Follow proof all the way
Draw the proof tree
Or organise the resolution steps in a way
± Which makes me think you know what you're doing
± Deduce something from a set of axioms
Not necessarily related to proving something
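A resolution proof by contradiction can be sketched for the propositional case. The representation is an assumption for illustration: a clause is a frozenset of literal strings, and "~" marks negation. The function names are hypothetical, and no heuristic strategy (unit preference, set of support, etc.) is applied.

```python
from itertools import combinations

def negate(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolve(c1, c2):
    """All resolvents of two clauses (frozensets of literals)."""
    return [(c1 - {lit}) | (c2 - {negate(lit)}) for lit in c1 if negate(lit) in c2]

def refutes(clauses):
    """True if the empty clause can be derived, i.e. the clauses are contradictory."""
    known = set(clauses)
    while True:
        new = set()
        for c1, c2 in combinations(known, 2):
            for resolvent in resolve(c1, c2):
                if not resolvent:
                    return True  # empty clause: contradiction found
                new.add(frozenset(resolvent))
        if new <= known:
            return False  # nothing new can be derived
        known |= new

# Axioms: P -> Q (as the clause ~P or Q) and P. Theorem: Q.
axioms = [frozenset({"~P", "Q"}), frozenset({"P"})]
negated_theorem = frozenset({"~Q"})  # remember to negate the theorem statement
proved = refutes(axioms + [negated_theorem])
# Dropping the axiom P leaves Q unprovable.
not_proved = refutes([frozenset({"~P", "Q"}), negated_theorem])
```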
Knowledge
± What ML problem constituents are:
Examples (pos & neg), background information
What kind of errors can occur in the data
± What ML method constituents are:
Representation of solutions (v. important)
How to search for solutions, how to choose between solutions
± Occam's (Ockham's) razor: choose simplest if all else equal
± The FIND-S method
Simple, guaranteed to find the most specific solutions
± How we assess hypotheses
False negatives, false positives
Predictive accuracy on training set, test set, comprehensibility
± How we assess learning methods: cross-validation, hold-back
Definition of overfitting
Understanding
± Learning from examples
± How induction differs from deduction
± That positives and negatives are:
Correct and incorrect classifications
± What robustness means
± That hypothesis accuracy doesn't necessarily mean that the
method is a good one
That methods can overfit data (memorise)
Abilities
± Specify machine learning problems
± Identify problem constituents/method constituents
± Simulate search in the FIND-S method
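As a rough aid for simulating FIND-S, here is a sketch of the common textbook formulation over attribute tuples. This is an assumption: the version presented in the lectures may differ in its details. The data and the name `find_s` are hypothetical.

```python
def find_s(examples):
    """Return the most specific conjunctive hypothesis covering all positives.

    Each example is (attribute_tuple, is_positive). A "?" in the hypothesis
    means that any value is allowed for that attribute.
    """
    hypothesis = None
    for attributes, is_positive in examples:
        if not is_positive:
            continue  # this formulation ignores negative examples
        if hypothesis is None:
            hypothesis = list(attributes)  # start from the first positive
        else:
            # Generalise only where the hypothesis disagrees with the example.
            hypothesis = [h if h == a else "?" for h, a in zip(hypothesis, attributes)]
    return hypothesis

# Hypothetical training data: (sky, temperature, humidity) -> positive?
EXAMPLES = [
    (("sunny", "warm", "normal"), True),
    (("sunny", "warm", "high"), True),
    (("rainy", "cold", "high"), False),
    (("sunny", "warm", "high"), True),
]
learned = find_s(EXAMPLES)
```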
Knowledge
± What entropy and information gain are
± How the ID3 algorithm works
± How we can try to avoid overfitting trees
± What are good problems for decision tree approaches
Attribute-value pairs, discrete values, etc.
Understanding
± What the tree representation is
Why it is both a graphical and a logical representation
How it can be thought of as a categorisation problem
± What entropy is estimating
Abilities
± To read decision trees
± To construct decision trees from English specifications
± Specify a learning problem for decision tree learning
± Calculate entropy and information gain for attributes
± Simulate the ID3 algorithm in action
Calculate information gain, choose attributes, restrict data
(Sv), how/when to end branches
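The entropy and information gain calculations can be checked with a short sketch. It assumes the usual definitions, Entropy(S) = -sum p_i log2(p_i) and Gain(S, A) = Entropy(S) - sum over values v of (|S_v|/|S|) Entropy(S_v); the data and function names are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Entropy of a list of category labels, in bits."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(examples, attribute):
    """examples is a list of (dict of attribute values, label) pairs."""
    labels = [label for _, label in examples]
    gain = entropy(labels)
    for value in {attrs[attribute] for attrs, _ in examples}:
        # S_v: the examples restricted to those with this attribute value.
        subset = [label for attrs, label in examples if attrs[attribute] == value]
        gain -= len(subset) / len(examples) * entropy(subset)
    return gain

# Hypothetical data: wind decides the category completely, sky tells us nothing.
EXAMPLES = [
    ({"wind": "weak", "sky": "sunny"}, "yes"),
    ({"wind": "weak", "sky": "rainy"}, "yes"),
    ({"wind": "strong", "sky": "sunny"}, "no"),
    ({"wind": "strong", "sky": "rainy"}, "no"),
]
gain_wind = information_gain(EXAMPLES, "wind")
gain_sky = information_gain(EXAMPLES, "sky")
```

ID3 would choose `wind` here, since it has the higher gain, and both branches would then end because each subset contains a single category.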
Knowledge
± How information is stored in an ANN (in the weights)
How weighted sums are calculated
± How ANNs are used to classify examples
± What are input/hidden/output units/layers
± What a perceptron is
Threshold functions: step, sigma, linear
± Perceptron training rule
Using a learning rate
How to calculate weight changes
± Target and observed output values for output units
Epochs over all the examples
± What linearly separable means, what boolean function means
Understanding
± Difference between symbolic and non-symbolic representations
± Motivation from biology (and the limitations of this)
± Why perceptrons are limited
± Why linear separability is required for perceptrons to learn something
± Why learning rate is usually set to a small value (large changes can undo previous learning)
Abilities
± Write down a perceptron to calculate a given function (e.g., boolean)
± Describe what an ANN would calculate
± Propagate values from left to right to make classifications
± Calculate weight changes in the perceptron learning rule
± Simulate the perceptron learning rule
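Simulating the perceptron learning rule can be sketched as below, assuming a step threshold function and the usual weight change of learning rate times (target minus observed output) times input. The function names and the choice of the boolean AND function are for illustration only.

```python
def step(weighted_sum):
    return 1 if weighted_sum > 0 else 0

def output(weights, inputs):
    # weights[0] is the bias weight, attached to a constant input of 1.
    return step(weights[0] + sum(w * x for w, x in zip(weights[1:], inputs)))

def train(examples, rate=0.25, epochs=50):
    """Run the perceptron training rule for a fixed number of epochs."""
    weights = [0.0] * (len(examples[0][0]) + 1)
    for _ in range(epochs):
        for inputs, target in examples:
            error = target - output(weights, inputs)
            # Weight change: rate * (target - observed) * input value.
            weights[0] += rate * error
            for i, x in enumerate(inputs):
                weights[i + 1] += rate * error * x
    return weights

# Boolean AND is linearly separable, so the rule can learn it.
AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
learned_weights = train(AND)
predictions = [output(learned_weights, inputs) for inputs, _ in AND]
```

Running the same code on XOR would never settle, because XOR is not linearly separable.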
Knowledge
± Perceptron units in multilayer ANNs
± How sigmoid units calculate outputs from weighted sums
Formula for the sigma function
± How feed forward networks calculate values
How these values are turned into classifications
± Backpropagation Learning Routine
Overview of how this works (propback), epochs, weight changes, initial
random assignment of weights, etc.
± How to avoid local minima
Calculating network error (in overview)
Adding momentum
± How to avoid overfitting
Validation set, weight decay
Understanding
± Why sigma being differentiable is important
± Which kinds of problems are suitable for ANNs
Long training, short execution, comprehension not a problem, etc.
± Why momentum works
Abilities
± Feed values forward to calculate outputs, use to classify examples
E.g., numerical functions, pixel data, etc.
± Calculate output values for the sigma function
± Calculate error terms for the output units (given formula)
± Calculate error terms for the hidden units (given formula)
± Calculate weight changes using the error terms
± Simulate the backpropagation algorithm
Using learning rate, momentum, etc.
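The feed-forward and error-term calculations can be sketched for a network with one hidden layer. The formulas used are the standard ones for sigmoid units and are an assumption here; in an exam, use the formulas as given. Bias weights are left out to keep the sketch short, and all names are hypothetical.

```python
from math import exp

def sigma(s):
    """Sigmoid function: 1 / (1 + e^(-s))."""
    return 1.0 / (1.0 + exp(-s))

def forward(inputs, hidden_weights, output_weights):
    """Feed values forward; each row of weights belongs to one unit."""
    hidden = [sigma(sum(w * x for w, x in zip(ws, inputs))) for ws in hidden_weights]
    outputs = [sigma(sum(w * h for w, h in zip(ws, hidden))) for ws in output_weights]
    return hidden, outputs

def error_terms(hidden, outputs, targets, output_weights):
    # Output unit k: delta_k = o_k * (1 - o_k) * (t_k - o_k)
    delta_out = [o * (1 - o) * (t - o) for o, t in zip(outputs, targets)]
    # Hidden unit j: delta_j = h_j * (1 - h_j) * sum over k of w_kj * delta_k
    delta_hid = [
        h * (1 - h) * sum(output_weights[k][j] * delta_out[k]
                          for k in range(len(delta_out)))
        for j, h in enumerate(hidden)
    ]
    return delta_out, delta_hid

def weight_change(rate, delta, value, previous_change=0.0, momentum=0.0):
    # Change for the weight carrying `value` into the unit with error term `delta`.
    return rate * delta * value + momentum * previous_change

hidden, outputs = forward([1.0, 0.0], [[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0]])
delta_out, delta_hid = error_terms(hidden, outputs, [1.0], [[0.0, 0.0]])
```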
Knowledge
± Problem context and specification
Logic programs (background, E+, E-, hypothesis)
Prior satisfiability and necessity
Posterior satisfiability and sufficiency
± How we can invert resolution
To use induction (rules given) to find new sentences
Absorption, identification, etc.
What V and W operators are
± How ILP systems search for hypotheses
Specific to general (using induction), general to specific (using deduction)
How pruning and sorting increase efficiency
How language restrictions increase efficiency
Understanding
± Why logic programs are a good representation
Easy to read, logical
± Why the problem specification is necessary
± Why we invert resolution for our operators
Prove that the observations follow from axioms + H
Abilities
± Apply rules of inference such as absorption
Rules would be given
± Use resolution to demonstrate how the hypothesis proves the
observations (other sentences)
± Determine which are more general/specific sentences using
entailment
Knowledge
± What a formal representation of a CSP is
In terms of variables, domains, constraints (fully written out)
± What a binary CSP is
± What arc consistency is
How to make a problem specification arc-consistent
± By removing values from domains of variables
± What happens in backtracking search
± What forward checking is
± What variable and value ordering heuristics do
± What the fail first heuristic is
Understanding
± Why we write out CSPs formally
± Why we write out constraints as tuples which are allowed
More understanding
± Why binary CSPs are important
± Why forward checking works
± Why fail first is so-called, and why it works
± Why value-ordering methods may be bad ideas
Abilities
± To specify a problem as a formal constraint satisfaction problem
(even if this is annoying for you!)
± Translate constraint formalisms to general constraints (e.g.,
using less than/greater than in linear arithmetic)
± Make problem specifications arc-consistent
± Simulate backtracking search (with forward checking)
Show understanding of fail-first etc. by doing hand calculations
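Making a binary CSP arc-consistent by removing values from domains can be sketched as below. Constraints are written out as sets of allowed tuples, as in the formal specification. The function names are hypothetical, and the sketch assumes each pair of variables appears in at most one constraint.

```python
def revise(domains, x, y, allowed):
    """Remove values of x with no supporting value in y. Return True if changed."""
    supported = {a for a in domains[x] if any((a, b) in allowed for b in domains[y])}
    changed = supported != domains[x]
    domains[x] = supported
    return changed

def make_arc_consistent(domains, constraints):
    """constraints maps (x, y) to the set of allowed (value_x, value_y) tuples."""
    arcs = dict(constraints)
    # Each constraint must be checked in both directions.
    for (x, y), allowed in constraints.items():
        arcs[(y, x)] = {(b, a) for a, b in allowed}
    queue = list(arcs)
    while queue:
        x, y = queue.pop()
        if revise(domains, x, y, arcs[(x, y)]):
            # x lost values, so arcs pointing at x must be rechecked.
            queue.extend((z, w) for (z, w) in arcs if w == x and z != y)
    return domains

# Hypothetical problem: X < Y and Y < Z, all with domain {1, 2, 3}.
less_than = {(a, b) for a in range(1, 4) for b in range(1, 4) if a < b}
domains = {"X": {1, 2, 3}, "Y": {1, 2, 3}, "Z": {1, 2, 3}}
result = make_arc_consistent(domains, {("X", "Y"): less_than, ("Y", "Z"): less_than})
```

Here arc consistency alone leaves one value per variable; in general, backtracking search is still needed afterwards.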

Knowledge
± Of the evolutionary approach in overview
Generate populations, fitness functions, recombination, etc.
± Canonical genetic algorithm
Describe this schematically
How individuals are selected to mate (fitness function, number
which are guaranteed entry into intermediate population)
How pairs are chosen to mate, and chosen to produce
offspring
How offspring are produced through recombination

Understanding
± The inspiration from natural evolution (species/genes)
And its limitations
± That it's difficult to specify solution representations
± That it's difficult to specify fitness functions
± Why mutation is important (local maxima avoidance)
Abilities
± Represent things (e.g., integers) as bit strings
± Perform crossover and mutation operators
One point and two point crossover
± Calculate evaluation functions
Use these in fitness functions for the intermediate population
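The bit-string operators can be sketched as below. In a real genetic algorithm the crossover points and mutation positions are chosen at random; here they are passed in as arguments (an assumption made so the results can be checked by hand). Function names are hypothetical.

```python
def to_bits(n, length):
    """Represent a non-negative integer as a fixed-length bit string."""
    return format(n, "0{}b".format(length))

def one_point_crossover(a, b, point):
    # Swap everything after the crossover point.
    return a[:point] + b[point:], b[:point] + a[point:]

def two_point_crossover(a, b, first, second):
    # Swap only the segment between the two points.
    return (a[:first] + b[first:second] + a[second:],
            b[:first] + a[first:second] + b[second:])

def mutate(bits, position):
    # Flip the bit at the given position.
    flipped = "1" if bits[position] == "0" else "0"
    return bits[:position] + flipped + bits[position + 1:]

parent_a, parent_b = "11111111", "00000000"
one_point = one_point_crossover(parent_a, parent_b, 3)
two_point = two_point_crossover(parent_a, parent_b, 2, 5)
```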
Knowledge
± Representation of programs as graphs
What result producing branches are
± Specifying a problem for a GP approach
Solution space of programs, fitness function (evaluation function)
± What the terminal set, function set, control parameters and
termination conditions are
± How individuals are chosen for mating
Using fitness for probabilities for intermediate population
± Or using tournament selection, or ranking
± What the reproduction, crossover and mutation operators do
± What the crossover fragment is
Understanding
± That it is automatic programming
That this is hard, other programs do this in a limited way
± Why graphical representation of programs is good
± That architecture altering operations are used
On more sophisticated program spaces
± That GP can achieve human-level performance
Abilities
± Translate from functions (with things like if-statements)
To graphs and back again
± Determine function and terminal sets from programs
± Perform crossover and mutation
± Simulate (describe) how a GP approach would proceed
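Translating between programs and graphs, and reading off the function and terminal sets, can be sketched with programs stored as nested tuples. The representation, the small function set and all names are assumptions made for illustration.

```python
import operator

# Hypothetical function set: each name maps to a two-argument function.
FUNCTIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(tree, env):
    """Evaluate a program tree, looking variables up in env."""
    if isinstance(tree, tuple):
        name, *args = tree
        return FUNCTIONS[name](*(evaluate(a, env) for a in args))
    return env.get(tree, tree)  # a variable if it is in env, else a constant

def sets_used(tree, functions=None, terminals=None):
    """Collect the function set and terminal set appearing in a program tree."""
    functions = set() if functions is None else functions
    terminals = set() if terminals is None else terminals
    if isinstance(tree, tuple):
        functions.add(tree[0])
        for arg in tree[1:]:
            sets_used(arg, functions, terminals)
    else:
        terminals.add(tree)
    return functions, terminals

# The function x * x + (y - 1) written as a tree.
PROGRAM = ("+", ("*", "x", "x"), ("-", "y", 1))
value = evaluate(PROGRAM, {"x": 3, "y": 2})
function_set, terminal_set = sets_used(PROGRAM)
```

Crossover in this representation would swap a subtree (the crossover fragment) of one program with a subtree of another.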