
Artificial Intelligence

Requirements
Knowledge representation, to store what it knows or hears.
Automated reasoning, to use the stored information to answer questions and to draw new conclusions.
Machine learning, to adapt to new circumstances and to detect and extrapolate patterns.

Agents
An agent is anything that can be viewed as perceiving its environment through sensors and acting upon that environment through actuators. We use the term percept to refer to the agent's perceptual inputs at any given instant. An agent's percept sequence is the complete history of everything the agent has ever perceived. The mathematical agent function maps any given percept sequence to an action, and is implemented by an agent program.

Example: email spam filter. Percepts: the textual content of individual email messages (a more sophisticated program might also take images or other attachments as percepts). Actions: send to the inbox, delete, or ask for advice. Goals: remove spam while allowing valid email to be read. Environment: an email program.

A rational agent is one that does the right thing; that is, every entry in the table for the agent function is filled out correctly. A performance measure embodies the criterion for success of an agent's behaviour. Autonomy is the extent to which an agent can act without relying on prior knowledge from its designer. Definition of a rational agent:

"For each possible percept sequence, a rational agent should select an action that is expected to maximize its performance measure, given the evidence provided by the percept sequence and whatever built-in knowledge the agent has"

In general, we will be interested in success over the long term.

For example, we might not want to favour a car-cleaner that's extremely fast in the first hour and then sits around reading, over one that works consistently.

We are generally interested in expected performance because agents are usually not omniscient: they don't infallibly know the outcome of their actions.

As a general rule it is better to design performance measures according to what one actually wants in the environment, rather than according to how one thinks the agent should behave.

In designing an agent, the first step is to specify the task environment, which is defined by PEAS (Performance measure, Environment, Actuators, Sensors).

Environments
An environment is fully observable if the agent's sensors give it access to the complete state of the environment at each point in time. If the next state of the environment is completely determined by the current state and the action executed by the agent, then we say the environment is deterministic. In an episodic task environment, the agent's experience is divided into atomic episodes. An environment is static if it does not change while the agent is deliberating. A discrete environment is one in which time and actions take discrete values, e.g. chess, as opposed to taxi driving.

A single-agent environment, e.g. a crossword, as opposed to chess (which is multi-agent).

Keeping track of the environment:

action agent(percept) {
    static state;
    static rules;
    state = update(state, percept);
    rule = match(state, rules);
    next_action = find_action(rule);
    state = update(state, next_action);
    return next_action;
}

Simple reflex agents are the simplest kind of agent. These agents act on the current percept only, as opposed to the rest of the percept history, e.g. if car-in-front-is-braking then initiate-braking. This kind of agent only works well if the environment is fully observable. Model-based reflex agents maintain internal state to track aspects of the world that are not evident in the current percept.
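
A minimal Python sketch of this model-based reflex loop, mirroring the pseudocode above; the class name, the (condition, action) rule format and the state-update function are illustrative assumptions, not something fixed by the notes.

    # Sketch of a model-based reflex agent (names and rule format are assumptions).
    class ModelBasedReflexAgent:
        def __init__(self, rules, update_state):
            self.state = {}            # internal model of the world
            self.rules = rules         # list of (condition, action) pairs
            self.update_state = update_state

        def __call__(self, percept):
            # Fold the new percept into the internal state.
            self.state = self.update_state(self.state, percept)
            # Pick the first rule whose condition matches the current state.
            for condition, action in self.rules:
                if condition(self.state):
                    return action
            return None  # no rule matched

    # Example: a toy braking rule.
    rules = [(lambda s: s.get("car_in_front_is_braking"), "initiate_braking")]
    agent = ModelBasedReflexAgent(rules, lambda state, percept: {**state, **percept})
    print(agent({"car_in_front_is_braking": True}))  # -> initiate_braking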

Goal-based agents act to achieve their goals. Utility-based agents try to maximize their expected utility ("happiness"). Learning agents - as Turing suggested, this is an easier way of creating complex AI. These agents have a learning element, a performance element (which selects actions), and a problem generator.

Solving Problems by Searching


There are methods that an agent can use to select actions in environments that are deterministic, observable, static and completely known. In such cases the agent can construct sequences of actions that achieve its goals; this process is called search. Before an agent can start searching for solutions, it must formulate a goal and then use the goal to formulate a problem. A problem consists of four parts: the initial state, a set of actions, a goal test function, and a path cost function. The environment of the problem is represented by a state space. A path through the state space from the initial state to a goal state is a solution. A single, general TREE-SEARCH algorithm can be used to solve any problem; specific variants of the algorithm embody different strategies. Search algorithms are judged on the basis of completeness, optimality, time complexity, and space complexity. Complexity depends on b, the branching factor in the state space, and d, the depth of the shallowest solution. Breadth-first search selects the shallowest unexpanded node in the search tree for expansion. It is complete, optimal for unit step costs, and has time and space complexity of O(b^d). The space complexity makes it impractical in most cases. Uniform-cost search is similar to breadth-first search but expands the node with lowest path cost, g(n). It is complete and optimal if the cost of each step exceeds some positive bound ε. Depth-first search selects the deepest unexpanded node in the search tree for expansion. It is neither complete nor optimal and has time complexity of O(b^m) and space complexity of O(bm), where m is the maximum depth of any path in the state space. Depth-limited search imposes a fixed depth limit on a depth-first search.
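
As an illustration of the uninformed strategies above, here is a minimal breadth-first search sketch in Python; the graph representation (a dict of successor lists) is an assumption made for the example.

    from collections import deque

    def breadth_first_search(start, goal_test, successors):
        """successors: dict mapping a state to its successor states (assumed representation)."""
        frontier = deque([(start, [start])])   # (state, path) pairs, FIFO queue
        explored = {start}
        while frontier:
            state, path = frontier.popleft()   # shallowest unexpanded node
            if goal_test(state):
                return path
            for child in successors.get(state, []):
                if child not in explored:
                    explored.add(child)
                    frontier.append((child, path + [child]))
        return None  # no solution

    # Example usage on a toy graph.
    graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
    print(breadth_first_search("A", lambda s: s == "D", graph))  # -> ['A', 'B', 'D']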

Iterative deepening search calls depth-limited search with increasing limits until a goal is found. It is complete, optimal for unit step costs, and has time complexity of O(b^d) and space complexity of O(bd). Bidirectional search can enormously reduce time complexity, but it is not always applicable and may require too much space.
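
A minimal sketch of iterative deepening as described above, built from a recursive depth-limited search; the successor-dict representation follows the breadth-first example and is an assumption.

    def depth_limited_search(state, goal_test, successors, limit):
        # Returns a path to a goal, or None if none is found within the depth limit.
        if goal_test(state):
            return [state]
        if limit == 0:
            return None
        for child in successors.get(state, []):
            result = depth_limited_search(child, goal_test, successors, limit - 1)
            if result is not None:
                return [state] + result
        return None

    def iterative_deepening_search(start, goal_test, successors, max_depth=50):
        # Call depth-limited search with increasing limits until a goal is found.
        for limit in range(max_depth + 1):
            result = depth_limited_search(start, goal_test, successors, limit)
            if result is not None:
                return result
        return None

    # Example usage on a toy graph.
    graph = {"A": ["B", "C"], "B": ["D"], "C": [], "D": []}
    print(iterative_deepening_search("A", lambda s: s == "D", graph))  # -> ['A', 'B', 'D']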

When the state space is a graph rather than a tree, it can pay off to check for repeated states in the search tree. The GRAPH-SEARCH algorithm eliminates all duplicate states. When the environment is partially observable, the agent can apply search algorithms in the space of belief states, or sets of possible states that the agent might be in. In some cases, a single solution sequence can be constructed; in most other cases, the agent needs a contingency plan to handle unknown circumstances that may arise.

Notes from lectures: the total cost = path cost + search cost. Uninformed or blind search is applicable when we only distinguish goal states from non-goal states. Methods are distinguished by the order in which nodes in the search tree are expanded. These methods include: breadth-first, uniform cost, depth-first, depth-limited, iterative deepening, bidirectional. Informed or heuristic search is applied if we have some knowledge of the path cost or the number of steps between the current state and a goal. These methods include: best first, greedy, A*, iterative deepening A* (IDA*), SMA*.

Uniform-cost search differs in that it always expands the node with the lowest path-cost first

A heuristic function, usually denoted h(n) is one that estimates the cost of the best path from any node to a goal. If n is a goal then h(n)=0.

A* search combines the good points of greedy search (by making use of h(n)) and uniform-cost search (by being optimal and complete).

It uses path cost g(n) and also the heuristic function h(n) by forming

f(n) = g(n) + h(n), where g(n) = cost of the path to n and h(n) = estimated cost of the best path from n to a goal. So f(n) is the estimated cost of a path through n.

Definition: an admissible heuristic h(n) is one that never overestimates the cost of the best path from n to a goal.
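
A compact A* sketch corresponding to the formula above; the priority-queue implementation and the (successor, step-cost) interface are assumptions made for the example, not part of the notes.

    import heapq
    from itertools import count

    def a_star_search(start, goal_test, successors, h):
        """successors(state) -> iterable of (next_state, step_cost); h(state) -> heuristic estimate."""
        tie = count()                                        # tie-breaker so the heap never compares states
        frontier = [(h(start), next(tie), 0, start, [start])]
        best_g = {start: 0}
        while frontier:
            f, _, g, state, path = heapq.heappop(frontier)   # node with minimal f = g + h
            if goal_test(state):
                return path, g
            for child, cost in successors(state):
                g2 = g + cost
                if g2 < best_g.get(child, float("inf")):
                    best_g[child] = g2
                    heapq.heappush(frontier, (g2 + h(child), next(tie), g2, child, path + [child]))
        return None, float("inf")

    # Example on a toy weighted graph with the trivially admissible heuristic h = 0 (uniform-cost behaviour).
    graph = {"A": [("B", 1), ("C", 4)], "B": [("C", 1), ("D", 5)], "C": [("D", 1)], "D": []}
    print(a_star_search("A", lambda s: s == "D", lambda s: graph[s], lambda s: 0))  # -> (['A', 'B', 'C', 'D'], 3)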

Iterative deepening search uses depth-first search with a limit on depth that is gradually increased. IDA* does the same thing with a limit on f-cost. It is complete and optimal under the same conditions as A*. It only requires space proportional to the longest path. The time taken depends on the number of values h can take.

Informed Search and Exploration


This section examines the application of heuristics to reduce search costs. Optimality comes at a stiff price in terms of search cost, even with good heuristics. Best-first search is just GRAPH-SEARCH where the minimum-cost unexpanded node (according to some measure) is selected for expansion. Best-first algorithms typically use a heuristic function h(n) that estimates the cost of a solution from n. Greedy best-first search expands nodes with minimal h(n). It is not optimal, but is often efficient.

A* search expands nodes with minimal f(n) = g(n) + h(n). A* is complete and optimal, provided that we guarantee that h(n) is admissible (for TREE-SEARCH) or consistent (for GRAPH-SEARCH). The space complexity of A* is still prohibitive. The performance of heuristic search algorithms depends on the quality of the heuristic function. Good heuristics can sometimes be constructed by relaxing the problem definition, by precomputing solution costs for subproblems in a pattern database, or by learning from experience with the problem class. RBFS and SMA* are robust, optimal search algorithms that use limited amounts of memory; given enough time, they can solve problems that A* cannot solve because it runs out of memory. Local search methods such as hill climbing operate on complete-state formulations, keeping only a small number of nodes in memory. Several stochastic algorithms have been developed, including simulated annealing, which returns optimal solutions when given an appropriate cooling schedule. Many local search methods can also be used to solve problems in continuous spaces. A genetic algorithm is a stochastic hill-climbing search in which a large population of states is maintained. New states are generated by mutation and by crossover, which combines pairs of states from the population. Exploration problems arise when the agent has no idea about the states and actions of its environment. For safely explorable environments, online search agents can build a map and find a goal if one exists. Updating heuristic estimates from experience provides an effective method to escape from local minima.
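
A small sketch of the simulated-annealing idea mentioned above; the value function, neighbour function and cooling schedule are all assumptions made for the example.

    import math, random

    def simulated_annealing(initial, value, neighbour, schedule, steps=10_000):
        """Maximise value(state); neighbour(state) returns a random successor.
        schedule(t) gives the 'temperature' at step t (assumed cooling schedule)."""
        current = initial
        for t in range(steps):
            T = schedule(t)
            if T <= 0:
                return current
            nxt = neighbour(current)
            delta = value(nxt) - value(current)
            # Always accept improvements; accept worse moves with probability e^(delta/T).
            if delta > 0 or random.random() < math.exp(delta / T):
                current = nxt
        return current

    # Example: maximise a bumpy 1-D function over the integers 0..100.
    best = simulated_annealing(
        initial=random.randint(0, 100),
        value=lambda x: -(x - 42) ** 2 + 10 * math.sin(x),
        neighbour=lambda x: min(100, max(0, x + random.choice([-1, 1]))),
        schedule=lambda t: 1.0 * (0.995 ** t),
    )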

Constraint Satisfaction Problems


Constraint satisfaction problems consist of variables with constraints on them. Many important real-world problems can be described as CSPs. The structure of a CSP can be represented by its constraint graph. Backtracking search, a form of depth-first search, is commonly used for solving CSPs. The minimum remaining values and degree heuristics are domain-independent methods for deciding which variable to choose next in a backtracking search. The least-constraining-value heuristic helps in ordering the variable values.

By propagating the consequences of the partial assignments that it constructs, the backtracking algorithm can greatly reduce the branching factor of the problem. Forward checking is the simplest method for doing this. Arc consistency enforcement is a more powerful technique, but can be more expensive to run. Backtracking occurs when no legal assignment can be found for a variable. Conflict-directed backjumping backtracks directly to the source of the problem. Local search using the min-conflicts heuristic has been applied to constraint satisfaction problems with great success. The complexity of solving a CSP is strongly related to the structure of its constraint graph. Tree-structured problems can be solved in linear time. Cutset conditioning can reduce a general CSP to a tree-structured one and is very efficient if a small cutset can be found. Tree decomposition techniques transform the CSP into a tree of subproblems and are efficient if the treewidth of the constraint graph is small.
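
A minimal min-conflicts local search sketch for CSPs, as mentioned above; the CSP interface (a domains dict plus a conflict-counting function) is an assumption made for the example.

    import random

    def min_conflicts(variables, domains, conflicts, max_steps=10_000):
        """conflicts(var, value, assignment) -> number of constraints violated (assumed interface)."""
        # Start from a complete random assignment.
        assignment = {v: random.choice(domains[v]) for v in variables}
        for _ in range(max_steps):
            conflicted = [v for v in variables if conflicts(v, assignment[v], assignment) > 0]
            if not conflicted:
                return assignment                      # solution found
            var = random.choice(conflicted)
            # Reassign var to the value that minimises the number of conflicts.
            assignment[var] = min(domains[var], key=lambda val: conflicts(var, val, assignment))
        return None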

Games
A game may be defined by the initial state (how the board is set up), the legal actions in each state, a terminal test (which says when the game is over), and a utility function that applies to terminal states. In two-player zero-sum games with perfect information, the minimax algorithm can select optimal moves using a depth-first enumeration of the game tree. The alpha-beta search algorithm computes the same optimal move as minimax, but achieves much greater efficiency by eliminating subtrees that are provably irrelevant. Usually, it is not feasible to consider the whole game tree (even with alpha-beta), so we need to cut the search off at some point and apply an evaluation function that gives an estimate of the utility of a state. Games of chance can be handled by an extension to the minimax algorithm that evaluates a chance node by taking the average utility of all its child nodes, weighted by the probability of each child. Optimal play in games of imperfect information, such as bridge, requires reasoning about the current and future belief states of each player. A simple approximation can be obtained by averaging the value of an action over each possible configuration of missing information. Programs can match or beat the best human players in checkers, Othello, and backgammon and are close behind in bridge. A program has beaten the world chess champion in one exhibition match. Programs remain at the amateur level in Go.

Lecture notes: CSPs standardise the manner in which states and goal tests are represented. As a result we can devise general purpose algorithms and heuristics. The form of the goal test can tell us about the structure of the problem. Consequently it is possible to introduce techniques for decomposing problems. We can also try to understand the relationship between the structure of a problem and the difficulty of solving it.

Clearly a CSP can be formulated as a search problem in the familiar sense: Initial state: no variables are assigned. Successor function: assigns value(s) to currently unassigned variable(s) provided constraints are not violated. Goal: reached if all variables are assigned. Path cost: a constant cost per step. In addition: the tree is limited to depth n, so depth-first search is usable.

It is fairly easy to see that a CSP can be given an incremental formulation as a standard search problem as follows:

Initial state: the empty assignment {}, in which all variables are unassigned.

Successor function: a value can be assigned to any unassigned variable, provided that it does not conflict with previously assigned variables. Goal test: the current assignment is complete. Path cost: a constant cost (e.g. 1) for every step.

Every solution must be a complete assignment and therefore appears at depth n if there are n variables. Furthermore, the search tree extends only to depth n. For these reasons, depth-first search algorithms are popular for CSPs.

Planning
Planning systems are problem-solving algorithms that operate on explicit propositional (or first-order) representations of states and actions. These representations make possible the derivation of effective heuristics and the development of powerful and flexible algorithms for solving problems. The STRIPS language describes actions in terms of their preconditions and effects and describes the initial and goal states as conjunctions of positive literals. The ADL language relaxes some of these constraints, allowing disjunction, negation and quantifiers. State-space search can operate in the forward direction (progression) or the backward direction (regression). Effective heuristics can be derived by making a subgoal independence assumption and by various relaxations of the planning problem. Partial-order planning (POP) algorithms explore the space of plans without committing to a totally ordered sequence of actions. They work back from the goal, adding actions to the plan to achieve each subgoal. They are particularly effective on problems amenable to a divide-and-conquer approach. A planning graph can be constructed incrementally, starting from the initial state. Each layer contains a superset of all the literals or actions that could occur at that time step and encodes mutual exclusion, or mutex, relations among literals or actions that cannot co-occur. Planning graphs yield useful heuristics for state-space and partial-order planners and can be used directly in the GRAPHPLAN algorithm. The SATPLAN algorithm translates a planning problem into propositional axioms and applies a satisfiability algorithm to find a model that corresponds to a valid plan. Several different propositional representations have been developed, with varying degrees of compactness and efficiency.
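
A tiny sketch of a STRIPS-style action representation as described above (preconditions and effects as sets of literals); the class name and the blocks-world example action are illustrative assumptions, not from the notes.

    from dataclasses import dataclass

    @dataclass
    class StripsAction:
        name: str
        preconditions: frozenset        # positive literals that must hold before the action
        add_effects: frozenset          # literals the action makes true
        delete_effects: frozenset       # literals the action makes false

        def applicable(self, state: frozenset) -> bool:
            return self.preconditions <= state

        def apply(self, state: frozenset) -> frozenset:
            return (state - self.delete_effects) | self.add_effects

    # Example: a toy "move A from the table onto B" action in a blocks world.
    move = StripsAction(
        name="Move(A, Table, B)",
        preconditions=frozenset({"Clear(A)", "Clear(B)", "On(A, Table)"}),
        add_effects=frozenset({"On(A, B)"}),
        delete_effects=frozenset({"On(A, Table)", "Clear(B)"}),
    )
    state = frozenset({"Clear(A)", "Clear(B)", "On(A, Table)", "On(B, Table)"})
    if move.applicable(state):
        state = move.apply(state)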

Each of the major approaches to planning has its adherents, and there is as yet no consensus on which is best. Competition and cross-fertilization among the approaches have resulted in significant gains in efficiency for planning systems.

Learning from Observation


Learning takes many forms, depending on the nature of the performance element, the component to be improved, and the available feedback. If the available feedback, either from a teacher or from the environment, provides the correct value for the examples, the learning problem is called supervised learning. The task, also called inductive learning, is then to learn a function from examples of its inputs and outputs. Learning a discrete-valued function is called classification; learning a continuous function is called regression. Inductive learning involves finding a consistent hypothesis that agrees with the examples. Ockham's razor suggests choosing the simplest consistent hypothesis. The difficulty of this task depends on the chosen representation. Decision trees can represent all Boolean functions. The information gain heuristic provides an efficient method for finding a simple, consistent decision tree. The performance of a learning algorithm is measured by the learning curve, which shows the prediction accuracy on the test set as a function of the training-set size. Ensemble methods such as boosting often perform better than individual methods. Computational learning theory analyses the sample complexity and computational complexity of inductive learning. There is a trade-off between the expressiveness of the hypothesis language and the ease of learning.
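
A small sketch of the information gain computation behind the decision-tree heuristic mentioned above; the data representation (a list of (feature-dict, label) examples) is an assumption made for the example.

    import math
    from collections import Counter

    def entropy(labels):
        counts = Counter(labels)
        total = len(labels)
        return -sum((c / total) * math.log2(c / total) for c in counts.values())

    def information_gain(examples, attribute):
        """examples: list of (features_dict, label); attribute: a key into features_dict."""
        labels = [label for _, label in examples]
        before = entropy(labels)
        # Split the examples by the value of the attribute.
        splits = {}
        for features, label in examples:
            splits.setdefault(features[attribute], []).append(label)
        after = sum(len(subset) / len(examples) * entropy(subset) for subset in splits.values())
        return before - after

    # Example: 'raining' perfectly predicts the label, so its gain equals the prior entropy (1 bit).
    data = [({"raining": True}, "stay_in"), ({"raining": False}, "go_out"),
            ({"raining": True}, "stay_in"), ({"raining": False}, "go_out")]
    print(information_gain(data, "raining"))  # -> 1.0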

Knowledge in Learning
The use of prior knowledge leads to a picture of cumulative learning, in which learning agents improve their learning ability by eliminating otherwise consistent hypotheses and by "filling in" the explanations of examples, thereby allowing for shorter hypotheses. These contributions often result in faster learning from fewer examples.

Understanding the different logical roles played by prior knowledge, as expressed by entailment constraints, helps to define a variety of learning techniques. Explanation-based learning (EBL) extracts general rules from single examples by explaining the examples and generalizing the explanation. It provides a deductive method for turning first-principles knowledge into useful, efficient, special-purpose expertise. Relevance-based learning (RBL) uses prior knowledge in the form of determinations to identify the relevant attributes, thereby generating a reduced hypothesis space and speeding up learning. RBL also allows deductive generalizations from single examples. Knowledge-based inductive learning (KBIL) finds inductive hypotheses that explain sets of observations with the help of background knowledge. Inductive logic programming (ILP) techniques perform KBIL on knowledge that is expressed in first-order logic. ILP methods can learn relational knowledge that is not expressible in attribute-based systems. ILP can be done with a top-down approach of refining a very general rule or through a bottom-up approach of inverting the deductive process. ILP methods generate new predicates with which concise new theories can be expressed and show promise as general-purpose scientific theory formation systems.

Boolean CSPs include as special cases some NP-complete problems, such as 3SAT. In the worst case, therefore, we cannot expect to solve finite-domain CSPs in less than exponential time. In most practical applications, however, general-purpose CSP algorithms can solve problems orders of magnitude larger than those solvable via the general-purpose (non-heuristic) searches described earlier.

Breadth-first search is complete and optimal, but has exponential cost both in terms of space and time. Depth-first search is neither complete nor optimal, but has exponential time complexity yet linear space complexity. Iterative deepening search is complete, optimal for unit step costs, and has exponential time complexity and linear space complexity. Informed or heuristic search is applied if we have some knowledge of the path cost or the number of steps between the current state and a goal, and whilst more intelligent, these methods can still fare poorly in terms of performance. Greedy search is neither optimal nor complete, and has exponential time and space complexity - it can, however, be very effective provided you have a good heuristic function. A* search combines the good points of greedy search (by making good use of h(n)) and of uniform-cost search (complete and optimal). Whilst being optimally efficient (i.e. no other optimal algorithm that works by constructing paths from the root can guarantee to examine fewer nodes), it still has exponential time and space complexity. IDA* has only linear space complexity but still has exponential time complexity.

The term backtracking search is used for a depth-first search that chooses values for one variable at a time and backtracks when a variable has no legal values left to assign. In pseudo-code:

function BACKTRACKING-SEARCH(csp) returns a solution, or failure
    return RECURSIVE-BACKTRACKING({}, csp)

function RECURSIVE-BACKTRACKING(assignment, csp) returns a solution, or failure
    if assignment is complete then return assignment
    var <-- SELECT-UNASSIGNED-VARIABLE(VARIABLES[csp], assignment, csp)
    for each value in ORDER-DOMAIN-VALUES(var, assignment, csp) do
        if value is consistent with assignment according to CONSTRAINTS[csp] then
            add { var = value } to assignment
            result <-- RECURSIVE-BACKTRACKING(assignment, csp)
            if result != failure then return result
            remove { var = value } from assignment
    return failure

This simple version uses chronological backtracking; a better way would be to backtrack to one of the set of variables that caused the failure (the conflict set).

By default, SELECT-UNASSIGNED-VARIABLE in the pseudocode above simply selects the next unassigned variable in the order given by the list VARIABLES[csp]. This static variable ordering seldom results in the most efficient search. A better way to do it would be to assign the variable with the least number of possible "legal values" next. For example, at a certain stage there may be only one possible value that a variable could take - it would make sense to assign that value now and not have to worry about it later. This idea is called the minimum remaining values (MRV) heuristic. The degree heuristic attempts to reduce the branching factor on future choices by selecting the variable that is involved in the largest number of constraints on other unassigned variables. Once a variable has been selected, the algorithm must decide on the order in which to examine its values. The least-constraining-value heuristic can be effective in some cases; it prefers the value that rules out the fewest choices for the neighbouring variables in the constraint graph. In general, the heuristic is trying to leave the maximum flexibility for subsequent variable assignments.
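
A runnable Python sketch of backtracking search with the MRV heuristic described above; the CSP representation (a domains dict plus a consistency-check function) is an assumption made for the example.

    def backtracking_search(domains, consistent):
        """domains: dict var -> list of candidate values; consistent(var, value, assignment) -> bool."""
        def legal_values(var, assignment):
            return [v for v in domains[var] if consistent(var, v, assignment)]

        def recurse(assignment):
            if len(assignment) == len(domains):
                return assignment
            unassigned = [v for v in domains if v not in assignment]
            # MRV heuristic: pick the variable with the fewest remaining legal values.
            var = min(unassigned, key=lambda v: len(legal_values(v, assignment)))
            for value in legal_values(var, assignment):
                assignment[var] = value
                result = recurse(assignment)
                if result is not None:
                    return result
                del assignment[var]
            return None

        return recurse({})

    # Example: colour a triangle graph with 3 colours (adjacent nodes must differ).
    neighbours = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}
    domains = {v: ["red", "green", "blue"] for v in neighbours}
    ok = lambda var, val, asg: all(asg.get(n) != val for n in neighbours[var])
    print(backtracking_search(domains, ok))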

An admissible heuristic h is a function that gives a guaranteed lower bound on the distance from any node u to the destination t. Natural example: straight-line distance (at maximum speed) to t.

h(n) is monotonic if f(n) = g(n)+h(n) never decreases along a path from the root. Almost all admissible heuristics are monotonic

h(n) is monotonic iff it obeys the triangle inequality

Or rephrased (from the course text): "A heuristic h(n) is monotonic if, for every node n and every successor n' of n generated by any action a, the estimated cost of reaching the goal from n is no greater than the step cost of getting to n' plus the estimated cost of reaching the goal from n': h(n) ≤ c(n, a, n') + h(n'). (This is a form of the general triangle inequality.)"

Best-first search is an instance of the general TREE-SEARCH algorithm in which a node is selected for expansion based on an evaluation function f(n) = estimated cost of the cheapest path from node n to a goal node. Greedy best-first search tries to expand the node that is closest to the goal (i.e. f(n) = h(n)). A* search, however, evaluates nodes by combining g(n), the cost to reach the node, and h(n), the estimated cost of the cheapest path from n to the goal: f(n) = g(n) + h(n), or in words, f(n) = estimated cost of the cheapest solution through n.

A* is optimal if h(n) is an admissible heuristic.

Proof that A* is optimal: Let Goal_opt be an optimal goal state with f(Goal_opt) = g(Goal_opt) = f_opt. Let Goal_2 be a suboptimal goal state with f(Goal_2) = g(Goal_2) = f_2 > f_opt.

We need to demonstrate that the search can never select Goal_2 (a suboptimal goal state).

Let n be a leaf node on an optimal path to Goal_opt. Then f_opt ≥ f(n), because h is admissible (and we're assuming it's also monotonic).

Now suppose Goal_2 is chosen for expansion before n. This means that f(n) ≥ f_2, so we've established that f_opt ≥ f_2 = g(Goal_2). But this contradicts the assumption that Goal_2 is suboptimal (f_2 > f_opt). Contradiction!

A* search is also complete provided: The graph has finite branching factor;

There is a finite, positive constant c such that each operator has cost at least c.

The search expands nodes according to increasing f(n). So the only way it can fail to find a goal is if there are infinitely many nodes with f(n) < f(Goal).

There are two ways this can happen: there is a node with an infinite number of descendants, or there is a path with an infinite number of nodes but a finite path cost.

Given a game tree, the optimal strategy can be determined by examining the minimax value of each node, MINIMAX-VALUE(n). The minimax value of a node is the utility (for MAX, ie current player) of being in the corresponding state, assuming that both players play optimally from there to the end of the game. The minimax decision is the optimal choice for MAX because it leads to the successor with the highest minimax value. The minimax algorithm computes the minimax decision from the current state. It uses a simple recursive computation of the minimax values of each successor state, directly implementing the defining equations. The recursion proceeds all the way down to the leaves of the tree, and then the minimax values are backed up through the tree as the recursion unwinds.

The minimax algorithm performs a complete depth-first exploration of the game tree. If the maximum depth of the tree is m, and there are b legal moves at each point, then the time complexity of the minimax algorithm is O(b^m). The space complexity is O(bm) for an algorithm that generates all successors at once, or O(m) for an algorithm that generates successors one at a time. For real games the time cost is impractical, so a number of time-saving techniques are implemented. Whilst we can't eliminate the exponent from the computational complexity, we can effectively cut it in half with alpha-beta pruning. Consider a node n somewhere in the tree, such that Player has a choice of moving to that node. If Player has a better choice m either at the parent node of n or at any choice point further up, then n will never be reached in actual play. So once we have found out enough about n (by examining its descendants) to reach this conclusion, we can prune it. Alpha-beta pruning gets its name from the following two parameters that describe bounds on the backed-up values that appear anywhere along the path: α = the value of the best (i.e. highest-value) choice we have found so far at any choice point along the path for MAX; β = the value of the best (i.e. lowest-value) choice we have found so far at any choice point along the path for MIN. Alpha-beta search updates the values of α and β as it goes along and prunes the remaining branches at a node as soon as the value of the current node is known to be worse than the current α or β value for MAX or MIN, respectively.
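
A compact sketch of minimax with alpha-beta pruning as described above; the game-tree interface (a successor function and a utility function on terminal states) is an assumption made for the example.

    import math

    def alpha_beta(state, depth, alpha, beta, maximizing, successors, utility):
        """Returns the minimax value of state, pruning branches that cannot affect the decision."""
        children = successors(state)
        if depth == 0 or not children:
            return utility(state)
        if maximizing:
            value = -math.inf
            for child in children:
                value = max(value, alpha_beta(child, depth - 1, alpha, beta, False, successors, utility))
                alpha = max(alpha, value)
                if alpha >= beta:
                    break          # beta cut-off: MIN will never allow this branch
            return value
        else:
            value = math.inf
            for child in children:
                value = min(value, alpha_beta(child, depth - 1, alpha, beta, True, successors, utility))
                beta = min(beta, value)
                if alpha >= beta:
                    break          # alpha cut-off: MAX will never allow this branch
            return value

    # Example: a depth-2 toy tree given as a nested dict (leaves map to utilities).
    tree = {"L": {"LL": 3, "LR": 12}, "R": {"RL": 2, "RR": 4}}
    succ = lambda s: list(s.values()) if isinstance(s, dict) else []
    util = lambda s: s
    print(alpha_beta(tree, 2, -math.inf, math.inf, True, succ, util))  # -> 3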

Complexity theory: L_k-COL, L_SAT, L_ISO, L_HAM are contained in LSpace and are computable in exponential time. NP: it may be hard to find a solution, but a possible solution is relatively small; once you have guessed one, it is easy to verify whether it is actually a solution.

More precisely, the class NP ⊆ P({0,1}*) (for "nondeterministic polynomial time", roughly "polynomial time using guessing") is defined as the set of all languages L ⊆ {0,1}* (aka problems) such that there exists a "verification algorithm" V running in polynomial time, which takes as input a pair (x, w) ∈ {0,1}* × {0,1}*, where x is a "problem instance" and w is a possible "witness" for x ∈ L, and outputs 0 or 1 (where in case V(x, w) = 1 it follows that x ∈ L), and there exists a polynomial p such that for all x ∈ {0,1}* we have x ∈ L if and only if there is a w ∈ {0,1}* with |w| ≤ p(|x|) and V(x, w) = 1. We have L_ISO, L_k-COL, L_HAM ∈ NP.
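
As an illustration of the verifier V(x, w) above, here is a polynomial-time check that a proposed colouring (the witness w) is a valid k-colouring of a graph (the instance x); the graph and witness encodings are assumptions made for the example.

    def verify_k_colouring(graph, k, colouring):
        """graph: dict node -> list of neighbours; colouring: dict node -> colour index (the witness).
        Runs in time polynomial in the size of the graph."""
        if set(colouring) != set(graph):
            return False                      # every node must be coloured
        if any(not (0 <= c < k) for c in colouring.values()):
            return False                      # only k colours allowed
        return all(colouring[u] != colouring[v] for u in graph for v in graph[u])

    # Example: a triangle is 3-colourable but not 2-colourable.
    triangle = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}
    print(verify_k_colouring(triangle, 3, {"A": 0, "B": 1, "C": 2}))  # -> True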
