Uncertainty

Syllabus : Acting under Uncertainty, Basic Probability Notation, The Axioms of Probability, Inference Using Full Joint Distributions.

Contents
7.1 Acting Under Uncertainty
7.2 Utility Theory
7.3 The Basic Probability Notation
7.4 University Questions with Answers

7.1 Acting Under Uncertainty

• Agents in the real world almost never have access to the whole truth about their environment; they must therefore act under uncertainty.
• One source of this uncertainty is theoretical ignorance : the available theory of the problem domain is incomplete, so some relevant aspects of the problem cannot be checked by rules alone.
• The agent's knowledge can at best provide a degree of belief in the relevant sentences, and the main tool for dealing with degrees of belief is probability theory.
• Under uncertainty, the outcome of a plan is no longer completely specified; each possible outcome must be weighed by its likelihood together with the benefit attached to the outcome.
For example : Consider a car driving agent who wants to reach the airport by a specific time, say 7:30 p.m. Several factors, such as whether the agent arrives at the airport on time and what the length of the waiting duration at the airport will be, are attached with the outcome.

7.2 Utility Theory

• Utility theory says that every state has a degree of usefulness, called utility, and that the agent will prefer states with higher utility.
• The utility of a state is relative to the agent whose preferences the utility function is supposed to represent; utilities are assigned on the basis of the agent's preferences.
For example : A state in which Black has won a game of chess obviously has high utility for the agent playing Black and low utility for the agent playing White.
• A utility function can account for any set of preferences - one agent may love ice cream while another hates it. It can even account for altruistic behaviour, simply by including the welfare of others as one of the factors contributing to the agent's own utility.

Decision theory
• Preferences, as expressed by utilities, are combined with probabilities to make rational decisions. This theory of rational decision making is called decision theory. Decision theory can be summarized as,
Decision theory = Probability theory + Utility theory
• The principle of Maximum Expected Utility (MEU) : Decision theory says that the agent is rational if and only if it chooses the action that yields the highest expected utility, averaged over all the possible outcomes of the action.
• Design for a decision theoretic agent : The following algorithm sketches the structure of an agent that uses decision theory to select actions.

    Function DT-AGENT (percept) returns an action
      Static : belief_state, probabilistic beliefs about the current state of the world
               action, the agent's action
      - Update belief_state based on action and percept
      - Calculate outcome probabilities for actions, given action descriptions and current belief_state
      - Select action with highest expected utility, given probabilities of outcomes and utility information
      - Return action

• The decision theoretic agent is identical, at an abstract level, to the logical agent. The primary difference is that the decision theoretic agent's knowledge of the current state is uncertain : the agent's belief state is a representation of the probabilities of all possible actual states of the world.
• As time passes, the agent accumulates more evidence and its belief state changes.
• Given the belief state, the agent can make probabilistic predictions of action outcomes and hence select the action with the highest expected utility.
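To make the MEU principle concrete, the following is a minimal Python sketch of the action-selection step. The airport actions, outcome probabilities and utility numbers are invented for illustration and are not from the text; only the formula EU(a) = Σs P(s | a) U(s) is.

    # Minimal sketch of Maximum Expected Utility action selection.
    # The outcome model and utility numbers are illustrative assumptions.

    def expected_utility(action, outcome_probs, utility):
        # EU(a) = sum over outcomes s of P(s | a) * U(s)
        return sum(p * utility[s] for s, p in outcome_probs[action].items())

    def meu_action(actions, outcome_probs, utility):
        # A rational agent picks the action with the highest expected utility.
        return max(actions, key=lambda a: expected_utility(a, outcome_probs, utility))

    # Hypothetical airport example : leave early or leave late.
    outcome_probs = {
        "leave_early": {"on_time": 0.95, "late": 0.05},
        "leave_late":  {"on_time": 0.60, "late": 0.40},
    }
    utility = {"on_time": 100, "late": -50}

    print(meu_action(list(outcome_probs), outcome_probs, utility))  # -> leave_early

Here EU(leave_early) = 0.95 × 100 + 0.05 × (−50) = 92.5, which exceeds EU(leave_late) = 40, so the agent leaves early.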
7.3 The Basic Probability Notation

• Probability theory uses a language that is essentially the propositional logic language with additional notation. The theory uses prior probability statements, which apply before any evidence is obtained, and conditional probability statements, which include the evidence explicitly.

Propositions
• Propositions (assertions) have degrees of belief attached to them.
• Complex propositions can be formed using the standard logical connectives.
For example : (Cavity = True) ∧ (Toothache = False) and (Cavity ∧ ¬Toothache) are the same assertion.

Random variables
1) The basic element of the language is the random variable. It refers to a "part" of the world whose "status" is initially unknown.
For example : In the toothache problem, Cavity is a random variable which can refer to my lower left wisdom tooth or my right wisdom tooth.
2) Random variables play a role like that of symbols in propositional logic. Random variables are written with capital letters, whereas values of a random variable are written with lowercase letters, as in P(a) = 1 − P(¬a).
3) Each random variable has a domain of values that it can take on. That is, the domain is the set of allowable values for the random variable.
For example : The domain of Cavity is <true, false>.
4) A proposition about a random variable asserts which value is drawn for the random variable from its domain.
For example : Cavity = True is a proposition saying that "there is a cavity in my lower left wisdom tooth".
5) Random variables are divided into three kinds, depending on their domain :
a) Boolean random variables : These can take only boolean values.
For example : Cavity takes the value either true or false.
b) Discrete random variables : They take values from a countable domain (boolean domains are a special case). The values in the domain must be mutually exclusive and exhaustive.
For example : Weather has the domain <sunny, rain, cloudy, cold>.
c) Continuous random variables : They take values from the real numbers (see the discussion of prior probability for continuous variables below).

Prior probability
6) The unconditional or prior probability of a proposition is the degree of belief accorded to it in the absence of any other information. For example, for the four values of Weather :
P(Weather = Sunny) = 0.7
P(Weather = Rain) = 0.2
P(Weather = Cloudy) = 0.08
P(Weather = Cold) = 0.02
7) The expression P(A), giving a probability for each value of A, is said to define the prior probability distribution for the random variable A.
8) To denote the probabilities of all combinations of the values of several random variables, the expression P(A1, A2) can be used. This is called the joint probability distribution of A1 and A2. Any number of random variables can be mentioned in the expression.
9) A simple example of a joint probability distribution :
P<Weather, Cavity> can be represented as a 4 × 2 table of probabilities (four values of Weather by two values of Cavity).
10) A joint probability distribution that covers the complete set of random variables of the problem domain is called the full joint probability distribution.
For example : If the problem world consists of the 3 random variables Weather, Cavity and Toothache, then the full joint probability distribution would be P<Weather, Cavity, Toothache>, represented as a 4 × 2 × 2 table of probabilities.
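The prior and joint distributions above translate directly into small probability tables. A minimal Python sketch follows; the dictionary representation is our choice, and the joint table is built assuming Weather and Cavity are independent (an assumption this chapter itself makes later), with P(Cavity = true) = 0.2 taken from the dental example.

    # Prior distribution P(Weather) as a table, and a 4 x 2 joint
    # distribution P(Weather, Cavity). The joint entries are illustrative,
    # built here under an independence assumption.

    weather_prior = {"sunny": 0.7, "rain": 0.2, "cloudy": 0.08, "cold": 0.02}
    assert abs(sum(weather_prior.values()) - 1.0) < 1e-9   # must sum to 1

    p_cavity = {True: 0.2, False: 0.8}
    joint = {(w, c): pw * pc
             for w, pw in weather_prior.items()
             for c, pc in p_cavity.items()}

    assert abs(sum(joint.values()) - 1.0) < 1e-9           # also sums to 1
    print(joint[("sunny", True)])                          # 0.7 * 0.2 ≈ 0.14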
11) Prior probability for continuous random variables :
For a continuous random variable it is not feasible to list the probability of every possible value, because the values are infinitely many. For a continuous random variable the probability is therefore defined as a function of a parameter x, giving the probability that the random variable takes the value x.
For example : Let the random variable X denote tomorrow's temperature in Chennai. The belief that X is distributed uniformly between 18 and 37 degrees Celsius would be represented as,
P(X = x) = U[18, 37](x)
12) The probability distribution for a continuous random variable is called a probability density function.

Conditional probability
1) When an agent obtains evidence concerning previously unknown random variables in the domain, prior probabilities are no longer appropriate. Based on the new information, conditional (or posterior) probabilities are calculated.
2) The notation is P(a | b), where a and b are any propositions. It is read as "the probability of a, given that all we know is b".
For example : P(Cavity | Toothache) = 0.8 means that if a patient has a toothache (and no other information is known), then the chance of the patient having a cavity is 0.8.
3) Prior probabilities are in fact a special case of conditional probabilities : P(a) means that the probability of a is conditioned on no evidence.
4) Conditional probability can be defined in terms of unconditional probabilities. The defining equation is,
P(a | b) = P(a ∧ b) / P(b)    ... (7.3.1)
which holds whenever P(b) > 0.
5) The above equation can also be written as,
P(a ∧ b) = P(a | b) P(b)
This is called the product rule. In other words it says that for a and b to be true, we need b to be true and we need a to be true given b. It can also be written the other way around,
P(a ∧ b) = P(b | a) P(a)
6) Conditional probabilities are used for probabilistic inferencing. The P(X | Y) notation denotes a conditional distribution : P(X | Y) gives the values of P(X = xi | Y = yj) for each possible pair i, j. The individual product-rule equations,
P(X = x1 ∧ Y = y1) = P(X = x1 | Y = y1) P(Y = y1)
P(X = x1 ∧ Y = y2) = P(X = x1 | Y = y2) P(Y = y2)
and so on, can be combined into the single equation,
P(X, Y) = P(X | Y) P(Y)
7) Conditional probabilities should not be treated as logical implications. "P(a | b) = 0.7" must not be read as "whenever b holds, conclude that P(a) = 0.7". That reading is wrong on two points : first, P(a) always denotes a prior probability, which does not depend on any evidence; second, P(a | b) = 0.7 is relevant only when b is the entire available evidence. Furthermore, degrees of belief change as information is updated, whereas logical implications do not change over time.
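The definition P(a | b) = P(a ∧ b) / P(b) can be applied mechanically to any joint table. A minimal sketch follows, assuming an illustrative joint over (Cavity, Toothache) whose numbers are consistent with the full joint table used later in this chapter; the helper names are ours.

    # Conditional probability from a joint table via P(a | b) = P(a ^ b) / P(b).
    # The joint over (cavity, toothache) is an illustrative assumption.

    joint = {  # (cavity, toothache) -> probability
        (True,  True):  0.12, (True,  False): 0.08,
        (False, True):  0.08, (False, False): 0.72,
    }

    def p(event):
        # Probability of the set of worlds satisfying `event`.
        return sum(pr for world, pr in joint.items() if event(world))

    def conditional(a, b):
        # P(a | b) = P(a and b) / P(b), defined whenever P(b) > 0.
        return p(lambda w: a(w) and b(w)) / p(b)

    cavity    = lambda w: w[0]
    toothache = lambda w: w[1]
    print(conditional(cavity, toothache))   # 0.12 / 0.20 ≈ 0.6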
The Probability Axioms

• The axioms give the semantics of probability statements. The basic axioms (Kolmogorov's axioms) serve to define the probability scale and its end points :
1. All probabilities are between 0 and 1. For any proposition a,
0 ≤ P(a) ≤ 1
2. Necessarily true (i.e., valid) propositions have probability 1, and necessarily false (i.e., unsatisfiable) propositions have probability 0 :
P(true) = 1    P(false) = 0
3. The probability of a disjunction is given by,
P(a ∨ b) = P(a) + P(b) − P(a ∧ b)
This axiom connects the probabilities of logically related propositions. The rule states that the cases where a holds, together with the cases where b holds, certainly cover all the cases where a ∨ b holds; but summing the two sets of cases counts their intersection twice, so we need to subtract P(a ∧ b).
Note : The axioms deal only with prior probabilities rather than conditional probabilities; this is because conditional probability can be defined in terms of prior probabilities.

Using the axioms of probability
• From the basic probability axioms the following facts can be derived :
P(a ∨ ¬a) = P(a) + P(¬a) − P(a ∧ ¬a)    (by axiom 3 with b = ¬a)
P(true) = P(a) + P(¬a) − P(false)    (by logical equivalence)
1 = P(a) + P(¬a)    (by axiom 2)
P(¬a) = 1 − P(a)    (by algebra)
• Let the discrete variable D have the domain <d1, ..., dn>. Then,
Σ (i = 1 to n) P(D = di) = 1
That is, any probability distribution on a single variable must sum to 1.
• It is also true that any joint probability distribution on any set of variables must sum to 1. This can be seen simply by creating a single megavariable whose domain is the cross product of the domains of the original variables.
• Atomic events are mutually exclusive, so the probability of a conjunction of two distinct atomic events is zero, by axiom 2.
• From axiom 3 we can derive the following simple relationship : the probability of a proposition is equal to the sum of the probabilities of the atomic events in which it holds; that is,
P(a) = Σ (ei ∈ e(a)) P(ei)

Inference Using Full Joint Distributions
• Probabilistic inference means the computation, from observed evidence, of posterior probabilities for query propositions. The knowledge base used for answering the query is represented as the full joint distribution.
• Consider a simple example whose world consists of three boolean variables : Toothache, Cavity and Catch. The full joint distribution is the 2 × 2 × 2 table shown below. Note that the probabilities in the joint distribution sum to 1.

                    toothache                ¬toothache
                 catch    ¬catch          catch    ¬catch
    cavity       0.108    0.012           0.072    0.008
    ¬cavity      0.016    0.064           0.144    0.576

• One particularly common task in inferencing is to extract the distribution over some subset of the variables, or over a single variable. This distribution over some variables or a single variable is called the marginal probability.
For example : P(Cavity) = 0.108 + 0.012 + 0.072 + 0.008 = 0.2
• Here the variables other than Cavity (the variable whose probability is being computed) are summed out. The general marginalization rule is as follows : for any sets of variables Y and Z,
P(Y) = Σz P(Y, z)
It indicates that the distribution over Y can be obtained by summing out all the other variables from any joint distribution containing Y.
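A minimal sketch of the marginalization rule P(Y) = Σz P(Y, z) applied to the full joint table above; the dictionary encoding and function name are ours.

    # Full joint distribution P(Toothache, Cavity, Catch) from the table above,
    # and marginalization by summing out all variables except one.

    full_joint = {  # (toothache, cavity, catch) -> probability
        (True,  True,  True):  0.108, (True,  True,  False): 0.012,
        (True,  False, True):  0.016, (True,  False, False): 0.064,
        (False, True,  True):  0.072, (False, True,  False): 0.008,
        (False, False, True):  0.144, (False, False, False): 0.576,
    }

    def marginal(var_index):
        # Sum out every variable except the one at var_index.
        dist = {}
        for world, pr in full_joint.items():
            v = world[var_index]
            dist[v] = dist.get(v, 0.0) + pr
        return dist

    print(marginal(1))   # P(Cavity) ≈ {True: 0.2, False: 0.8}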
• A variant of the general marginalization rule involves conditional probabilities instead of joint probabilities, using the product rule :
P(Y) = Σz P(Y | z) P(z)
This rule is called the conditioning rule.
• Normalization constant : α denotes a constant that is the same for every value of the query variable in a distribution and that ensures the distribution adds up to 1.
For example : We can compute the probability of a cavity, given evidence of a toothache, as follows :
P(Cavity | Toothache) = P(Cavity ∧ Toothache) / P(Toothache)
= (0.108 + 0.012) / (0.108 + 0.012 + 0.016 + 0.064) = 0.12 / 0.2 = 0.6
Just to check, we can also compute the probability that there is no cavity, given a toothache :
P(¬Cavity | Toothache) = P(¬Cavity ∧ Toothache) / P(Toothache)
= (0.016 + 0.064) / (0.108 + 0.012 + 0.016 + 0.064) = 0.08 / 0.2 = 0.4
• Notice that in these two calculations the term 1/P(Toothache) remains constant, no matter which value of Cavity we calculate. With the α notation we can write the two equations as one :
P(Cavity | Toothache) = α P(Cavity, Toothache)
= α [P(Cavity, Toothache, Catch) + P(Cavity, Toothache, ¬Catch)]
= α [<0.108, 0.016> + <0.012, 0.064>]
= α <0.12, 0.08> = <0.6, 0.4>
• From the above, one can extract a general inference procedure. Consider the case in which the query involves a single variable. Let X be the query variable (Cavity in the example), let E be the list of evidence variables (just Toothache in the example), let e be the list of observed values for them, and let Y be the remaining unobserved variables (just Catch in the example). The query is P(X | e) and it can be evaluated as,
P(X | e) = α P(X, e) = α Σy P(X, e, y)
where the summation is over all possible ys (i.e., all possible combinations of values of the unobserved variables Y). Notice that together the variables X, E and Y constitute the complete set of variables for the domain, so P(X, e, y) is simply a subset of probabilities from the full joint distribution.
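A minimal Python sketch of this general procedure (inference by enumeration). The full joint table is repeated so the sketch is self-contained; the function and variable names are ours.

    # Inference by enumeration : P(X | e) = alpha * sum_y P(X, e, y),
    # over the full joint P(Toothache, Cavity, Catch).

    VARS = ("toothache", "cavity", "catch")

    full_joint = {
        (True,  True,  True):  0.108, (True,  True,  False): 0.012,
        (True,  False, True):  0.016, (True,  False, False): 0.064,
        (False, True,  True):  0.072, (False, True,  False): 0.008,
        (False, False, True):  0.144, (False, False, False): 0.576,
    }

    def query(x_var, evidence):
        # Returns the normalized distribution P(x_var | evidence).
        xi = VARS.index(x_var)
        dist = {True: 0.0, False: 0.0}
        for world, pr in full_joint.items():
            # Keep only worlds consistent with the evidence e;
            # the hidden variables y are summed out implicitly.
            if all(world[VARS.index(v)] == val for v, val in evidence.items()):
                dist[world[xi]] += pr
        alpha = 1.0 / sum(dist.values())          # normalization constant
        return {v: alpha * p for v, p in dist.items()}

    print(query("cavity", {"toothache": True}))  # ≈ {True: 0.6, False: 0.4}

Running the query reproduces the hand-computed result P(Cavity | toothache) = <0.6, 0.4>.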
Independence
• Independence is a relationship between two different subsets of a full joint distribution. It is also called marginal or absolute independence of the variables. Independence indicates whether the two subsets affect each other's probabilities.
• Independence between variables X and Y can be written in any of the following ways :
P(X | Y) = P(X)  or  P(Y | X) = P(Y)  or  P(X, Y) = P(X) P(Y)
For example : The weather is independent of one's dental problems, which can be expressed by the equation,
P(Toothache, Catch, Cavity, Weather) = P(Toothache, Catch, Cavity) P(Weather)
• Fig. 7.2.4 shows how absolute independence lets a large joint distribution be factored into smaller distributions : Weather and the dental problem variables are independent.
[Fig. 7.2.4 : Factoring a large joint distribution into smaller distributions]

Bayes' Rule
• Bayes' rule is derived from the product rule, which can be written in two forms :
P(a ∧ b) = P(a | b) P(b)
P(a ∧ b) = P(b | a) P(a)
Equating the two right-hand sides and dividing by P(a), we get,
P(b | a) = P(a | b) P(b) / P(a)
This equation is called Bayes' rule (also Bayes' theorem or Bayes' law).
• A more general form, for multivalued variables with background evidence e, is,
P(Y | X, e) = P(X | Y, e) P(Y | e) / P(X | e)
• The form of Bayes' rule with normalization is,
P(Y | X) = α P(X | Y) P(Y)

Applying Bayes' rule
1) Bayes' rule requires three terms in total (one conditional probability and two unconditional probabilities) for computing one conditional probability.
For example : Let m be the proposition "patient has low sugar" and s be the proposition "patient has high blood pressure". Suppose the doctor knows the following facts :
P(s | m) = 0.5
P(m) = 1/50000
P(s) = 1/20
Then we have,
P(m | s) = P(s | m) P(m) / P(s) = (0.5 × 1/50000) / (1/20) = 0.0002
That is, we can expect that 1 in 5000 patients with high blood pressure will have low sugar.
2) Combining evidence in Bayes' rule : Bayes' rule is helpful for answering queries conditioned on several pieces of evidence.
For example : When the evidences Toothache and Catch are both available, the query can be represented as,
P(Cavity | Toothache ∧ Catch) = α <0.108, 0.016> ≈ <0.871, 0.129>
By using Bayes' rule to reformulate the problem,
P(Cavity | Toothache ∧ Catch) = α P(Toothache ∧ Catch | Cavity) P(Cavity)    ... (7.3.8)
For this reformulation to work, we need to know the conditional probabilities of the conjunction Toothache ∧ Catch for each value of Cavity. That might be feasible for just two evidence variables, but it will not scale up. If there are n possible evidence variables (X-rays, diet, oral hygiene, etc.), then there are 2^n possible combinations of observed values for which we would need to know the conditional probabilities.
• The notion of independence can be used here. These variables are independent, however, given the presence or the absence of a cavity : each is directly caused by the cavity, but neither has a direct effect on the other. Toothache depends on the state of the nerves in the tooth, whereas the probe's accuracy depends on the dentist's skill, to which the toothache is irrelevant. Mathematically, this property is written as,
P(Toothache ∧ Catch | Cavity) = P(Toothache | Cavity) P(Catch | Cavity)    ... (7.3.9)
This equation expresses the conditional independence of Toothache and Catch, given Cavity. Substituting equation (7.3.9) into equation (7.3.8), we obtain the probability of a cavity :
P(Cavity | Toothache ∧ Catch) = α P(Toothache | Cavity) P(Catch | Cavity) P(Cavity)
Now the information requirements are the same as for inference using each piece of evidence separately : the prior probability P(Cavity) for the query variable, and the conditional probability of each effect given its cause.
• Conditional independence assertions allow probabilistic systems to scale up; moreover, they are much more commonly available than absolute independence assertions. When n effect variables are all conditionally independent given the cause, the size of the representation grows as O(n) instead of O(2^n).
• The dentistry example illustrates a commonly occurring pattern in which a single cause directly influences a number of effects, all of which are conditionally independent given the cause. The full joint distribution can then be written as,
P(Cause, Effect1, ..., Effectn) = P(Cause) Πi P(Effecti | Cause)
Such a probability distribution is called a naive Bayes model - "naive" because it is often used (as a simplifying assumption) even in cases where the "effect" variables are not actually conditionally independent given the cause variable. It is also called a Bayesian classifier.

7.4 University Questions with Answers

1. Explain the process of inference using the full joint distribution, with an example.
Ans. : Refer the discussion of inference using full joint distributions in section 7.3.

2. Write a short note on Dempster-Shafer theory.
Ans. : The Dempster-Shafer theory is designed to deal with the distinction between uncertainty and ignorance. Rather than computing the probability of a proposition, it computes the probability that the evidence supports the proposition.

3. Define : Bayes' theorem.
Ans. : In probability theory and its applications, Bayes' theorem (alternatively called Bayes' law) links a conditional probability to its inverse :
P(a | b) = P(b | a) P(a) / P(b)
This equation is called Bayes' rule or Bayes' theorem.

4. What is reasoning by default ?
Ans. : We can do qualitative reasoning using techniques like default reasoning, in which a conclusion is not "believed to a certain degree" but is believed outright until a better reason is found to believe something else.

5. What are the logics used in ...
Ans. : There are two approaches ... information in which logic is used.
