

To build artificial intelligence into the system, two primary topics must be addressed:
how to program "reasoning" and how to handle uncertainty in the decision-making context.
These topics are taken up in the next two sections.

PROGRAMMING REASONING

The reasoning process in humans is often automatic or implicit, and hence it is difficult to
see how it might be programmed in a set of deliberate steps for a computer. If, however,
we examine the reasoning process slowly and deliberately through its individual steps so
that we can see how the computer completes the reasoning process. Actually, reasoning by
both humans and computers must take one of two basic approaches. Either we begin with a
goal and try to prove that it is true with the facts we have available or we begin with all the
"known facts" and try to prove as much as we can. In computer terms, these are referred
to as backward reasoning and forward reasoning, respectively. The following examples
demonstrate deliberate examples of backward and forward reasoning and the manner in
which intelligence can be built into a DSS. Both examples will use the same information
so as to illustrate the differences in the processes.

Researchers are investigating prospective logic as a way to program morality into a computer.
Using prospective logic, programmers can model a moral dilemma so the computer can determine
the logical outcomes of all possible decisions and select the best (or least worst) one. This sets the
stage for computers that have "ethics," which could allow fully autonomous machines programmed
to make judgments based on a human moral foundation. Currently two researchers have developed
a system capable of working through the "trolley problem," an ethical dilemma proposed by British
philosopher Philippa Foot in the 1960s. In this dilemma, a runaway trolley is about to hit five
people tied to the track, but the subject can hit a switch that will send the trolley onto another track
where only one person is tied down. The prospective logic program can consider each possible
outcome based on different scenarios and demonstrate logically what the consequences of its
decisions might be.

Suppose there is a set of logical facts (facts B, D, E, F, G, H, K, M, N, and Y), each of which
can be set to either "true" or "false." In addition, there are certain known relationships
among the facts. These are listed below in the order in which they might appear in the code:

R1: IF Fact E and Fact M and Fact G are all true, then Fact F is true;
R2: IF Fact K and Fact E are both true, then Fact D is true;
R3: IF Fact N is true, then Fact Y is true;
R4: IF Fact Y is true, then Fact H is true;
R5: IF Fact B and Fact G are both true, then Fact M is true;
R6: IF Fact K and Fact F are both true, then Fact Y is true;
R7: IF Fact K is true, then Fact B is true.

The ways in which these relationships are processed are quite different with backward and
forward chaining.

Backward-Chaining Reasoning
In backward chaining, we begin with a goal and attempt to prove it. For example, suppose
the goal is to prove that fact H is true. The system will process the relationships, beginning
with the first one it encounters that would prove the goal (in this case, fact H) to be true:
R1: IF Fact E and Fact M and Fact G are all true, then Fact F is true;
R2: IF Fact K and Fact E are both true, then Fact D is true;
R3: IF Fact N is true, then Fact Y is true;
R4: IF Fact Y is true, then Fact H is true;
R5: IF Fact B and Fact G are both true, then Fact M is true;
R6: IF Fact K and Fact F are both true, then Fact Y is true;
R7: IF Fact K is true, then Fact B is true.

In order to prove relationship 4, it is necessary to prove that fact Y is true. Hence, proving
that fact Y is true is now the goal of the system. It will again process the rules:

R1: IF Fact E and Fact M and Fact G are all true, then Fact F is true;
R2: IF Fact K and Fact E are both true, then Fact D is true;
R3: IF Fact N is true, then Fact Y is true;
R4: IF Fact Y is true, then Fact H is true;
R5: IF Fact B and Fact G are both true, then Fact M is true;
R6: IF Fact K and Fact F are both true, then Fact Y is true;
R7: IF Fact K is true, then Fact B is true.

To prove relationship 3, it is necessary to prove that fact N is true. We can see from the
seven relationships that there is nothing from which the system can infer whether fact N is
true. Hence, the system is forced either to use a default value (if one is specified) or to ask
the user. Suppose there is no default value given, and the user does not know whether fact
N is true. Under these circumstances, the system is unable to infer that fact N is true, so it
assumes nothing about the validity of fact N. However, it must locate another relationship
in order to infer that fact Y is true:

R1: IF Fact E and Fact M and Fact G are all true, then Fact F is true;
R2: IF Fact K and Fact E are both true, then Fact D is true;
R3: IF Fact N is true, then Fact Y is true;
R4: IF Fact Y is true, then Fact H is true;
R5: IF Fact B and Fact G are both true, then Fact M is true;
R6: IF Fact K and Fact F are both true, then Fact Y is true;
R7: IF Fact K is true, then Fact B is true.

To prove relationship 6, it is necessary to prove that facts K and F are true. The system
begins by trying to prove fact K. As with fact N, there are no relationships from which
one can infer whether fact K is true. The system then must use a default value (if one is
specified) or ask the user. Suppose in this case the user knows that fact K is true, and hence
the system attempts to prove that fact F is true:

R1: IF Fact E and Fact M and Fact G are all true, then Fact F is true;
R2: IF Fact K and Fact E are both true, then Fact D is true;
R3: IF Fact N is true, then Fact Y is true;
R4: IF Fact Y is true, then Fact H is true;
R5: IF Fact B and Fact G are both true, then Fact M is true;
R6: IF Fact K and Fact F are both true, then Fact Y is true;
R7: IF Fact K is true, then Fact B is true.

As with fact N, there are no relationships from which one can infer the value of fact E
(whether or not it is true). The system then must use a default value (if one is specified) or
ask the user. Suppose in this case the user knows that fact E is true, and hence the system
attempts to prove that fact M is true:

R1: IF Fact E and Fact M and Fact G are all true, then Fact F is true;
R2: IF Fact K and Fact E are both true, then Fact D is true;
R3: IF Fact N is true, then Fact Y is true;
R4: IF Fact Y is true, then Fact H is true;
R5: IF Fact B and Fact G are both true, then Fact M is true;
R6: IF Fact K and Fact F are both true, then Fact Y is true;
R7: IF Fact K is true, then Fact B is true.

The first step in that process is to establish that fact B is true:

R1: IF Fact E and Fact M and Fact G are all true, then Fact F is true;
R2: IF Fact K and Fact E are both true, then Fact D is true;
R3: IF Fact N is true, then Fact Y is true;
R4: IF Fact Y is true, then Fact H is true;
R5: IF Fact B and Fact G are both true, then Fact M is true;
R6: IF Fact K and Fact F are both true, then Fact Y is true;
R7: IF Fact K is true, then Fact B is true.

Relationship 7 states that fact B is true if fact K is true. Earlier, the system asked the user
and determined that fact K is true. At that time the value was stored, and hence the system
need not query the user again. Hence fact B is true, and the system can proceed to attempt
to determine whether fact G is true. As was true with fact N, there are no relationships from
which we can infer the value of fact G (whether or not it is true). The system then must use
a default value (if one is specified) or ask the user. Suppose in this case the user knows that
fact G is true. Hence, the system now establishes that fact M is true, since facts B and
G have been established as true. The system again returns to processing relationship 1 and
establishes that fact F is true:

R1: IF Fact E and Fact M and Fact G are all true, then Fact F is true;
R2: IF Fact K and Fact E are both true, then Fact D is true;
R3: IF Fact N is true, then Fact Y is true;
R4: IF Fact Y is true, then Fact H is true;
R5: IF Fact B and Fact G are both true, then Fact M is true;
R6: IF Fact K and Fact F are both true, then Fact Y is true;
R7: IF Fact K is true, then Fact B is true.

With this information, the system returns to processing relationship 6 and establishes that
fact Y is true:

R1: IF Fact E and Fact M and Fact G are all true, then Fact F is true;
R2: IF Fact K and Fact E are both true, then Fact D is true;
R3: IF Fact N is true, then Fact Y is true;
R4: IF Fact Y is true, then Fact H is true;
R5: IF Fact B and Fact G are both true, then Fact M is true;
R6: IF Fact K and Fact F are both true, then Fact Y is true;
R7: IF Fact K is true, then Fact B is true.

Since fact Y is true, the system can establish that fact H is true through relationship 4:

R1: IF Fact E and Fact M and Fact G are all true, then Fact F is true;
R2: IF Fact K and Fact E are both true, then Fact D is true;
R3: IF Fact N is true, then Fact Y is true;
R4: IF Fact Y is true, then Fact H is true;
R5: IF Fact B and Fact G are both true, then Fact M is true;
R6: IF Fact K and Fact F are both true, then Fact Y is true;
R7: IF Fact K is true, then Fact B is true.

Since establishing that fact H is true is the goal of the system, it would stop processing at
this point and seek no additional information. This process is illustrated in Figure 4S.1.
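
To make these steps concrete, the sketch below shows one way backward chaining over the seven relationships might be programmed. It is a minimal illustration in Python rather than code from the text: the rule encoding, the prove() function, and the simulated ask() responses (the user knows only that facts K, E, and G are true) are assumptions chosen to reproduce the walkthrough above.

```python
# A minimal backward-chaining sketch (illustrative only, not from the text).
# Each rule is (name, antecedent facts, concluded fact).
RULES = [
    ("R1", ("E", "M", "G"), "F"),
    ("R2", ("K", "E"), "D"),
    ("R3", ("N",), "Y"),
    ("R4", ("Y",), "H"),
    ("R5", ("B", "G"), "M"),
    ("R6", ("K", "F"), "Y"),
    ("R7", ("K",), "B"),
]

known = {}  # facts already established or supplied by the user


def ask(fact):
    """Simulated user query: in this example the user knows only K, E, and G."""
    answers = {"K": True, "E": True, "G": True}  # N remains unknown
    return answers.get(fact)  # None means "the user does not know"


def prove(fact):
    """Attempt to establish that `fact` is true by backward chaining."""
    if fact in known:
        return known[fact]
    for name, antecedents, concluded in RULES:
        # Only consider rules that conclude the current goal, in code order.
        if concluded == fact and all(prove(a) for a in antecedents):
            print(f"{name} establishes that fact {fact} is true")
            known[fact] = True
            return True
    answer = ask(fact)  # no rule concludes this fact: default value or user
    if answer is not None:
        known[fact] = answer
    return bool(answer)


print("Goal H proved:", prove("H"))
```

Run against the simulated answers, the sketch establishes facts B, M, F, Y, and H through relationships 7, 5, 1, 6, and 4, in that order, mirroring the trace above.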

Forward-Chaining Reasoning
Consider now the path that is followed using forward chaining. With this approach, we
begin with known information and attempt to learn as much as possible. For example, suppose
we begin by knowing that facts K and E are both true. The system will look to prove
any relationship possible given these two facts and hence will process relationships 2 and 7
(sequentially, in the order in which they appear in the code):

R1: IF Fact E and Fact M and Fact G are all true, then Fact F is true;
R2: IF Fact K and Fact E are both true, then Fact D is true;
R3: IF Fact N is true, then Fact Y is true;
R4: IF Fact Y is true, then Fact H is true;
R5: IF Fact B and Fact G are both true, then Fact M is true;
R6: IF Fact K and Fact F are both true, then Fact Y is true;
R7: IF Fact K is true, then Fact B is true.

Figure 4S.1. Hierarchy of logic—backward chaining.

The environment changes as a result of this processing, and the system now knows that
facts D and B are also true. Hence, the system considers all relationships again to determine
whether more information can be gleaned. However, there are no additional relationships
that can be processed. Unlike the case in backward chaining, the system does not begin to
prompt the user for information that might allow it to go further, and hence it would stop
and learn no additional facts.
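
A forward-chaining engine can be sketched in the same spirit. Again this is only an illustration under assumed names, not code from the text: the engine repeatedly fires any relationship whose antecedent facts are all known until a complete pass adds nothing new.

```python
# A minimal forward-chaining sketch (illustrative only, not from the text).
# Each rule is (name, antecedent facts, concluded fact).
RULES = [
    ("R1", ("E", "M", "G"), "F"),
    ("R2", ("K", "E"), "D"),
    ("R3", ("N",), "Y"),
    ("R4", ("Y",), "H"),
    ("R5", ("B", "G"), "M"),
    ("R6", ("K", "F"), "Y"),
    ("R7", ("K",), "B"),
]


def forward_chain(facts):
    """Add every fact that can be inferred from `facts`; return the result."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for name, antecedents, concluded in RULES:
            if concluded not in facts and all(a in facts for a in antecedents):
                print(f"{name} fires: fact {concluded} is now known to be true")
                facts.add(concluded)
                changed = True
    return facts


print(sorted(forward_chain({"K", "E"})))  # ['B', 'D', 'E', 'K'] -- then it stalls
```

Starting from facts K and E, only relationships 2 and 7 fire, and the run stops knowing facts K, E, D, and B, exactly as described above.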
Some software lets developers use hybrid approaches to programming by allowing
procedural programming, access programming, and/or object-oriented programming in
addition to forward- and/or backward-chaining pathways. Consider the forward-chaining
example above. Suppose the access programming code specified that users should be
queried, or a database should be searched, or a default value should be set if the status of
fact G is not known by this point of processing. If the user or database indicated fact G
were true, the system would again invoke the forward-chaining component and it would
process relationship 5:

R1: IF Fact E and Fact M and Fact G are all true, then Fact F is true;
R2: IF Fact K and Fact E are both true, then Fact D is true;
R3: IF Fact N is true, then Fact Y is true;
R4: IF Fact Y is true, then Fact H is true;
R5: IF Fact B and Fact G are both true, then Fact M is true;
R6: IF Fact K and Fact F are both true, then Fact Y is true;
R7: IF Fact K is true, then Fact B is true.

The information regarding fact M would cause the system to evaluate all relationships
that require some or all of facts K, E, D, B, G, or M to be true, and hence it would process
relationship 1:

R1: IF Fact E and Fact M and Fact G are all true, then Fact F is true;
R2: IF Fact K and Fact E are both true, then Fact D is true;
R3: IF Fact N is true, then Fact Y is true;
R4: IF Fact Y is true, then Fact H is true;
R5: IF Fact B and Fact G are both true, then Fact M is true;
R6: IF Fact K and Fact F are both true, then Fact Y is true;
R7: IF Fact K is true, then Fact B is true.

The new information about fact F requires the system to reevaluate the relationships to
determine whether more information can be learned, and hence it will seek any relation-
ship that includes fact F and some subset of the other facts known at this time, as in
relationship 6:

R1: IF Fact E and Fact M and Fact G are all true, then Fact F is true;
R2: IF Fact K and Fact E are both true, then Fact D is true;
R3: IF Fact N is true, then Fact Y is true;
R4: IF Fact Y is true, then Fact H is true;
R5: IF Fact B and Fact G are both true, then Fact M is true;
R6: IF Fact K and Fact F are both true, then Fact Y is true;
R7: IF Fact K is true, then Fact B is true.

The process proceeds in a similar fashion now that fact Y is known. Hence, the system will
process relationship 4:

R1: IF Fact E and Fact M and Fact G are all true, then Fact F is true;
R2: IF Fact K and Fact E are both true, then Fact D is true;
R3: IF Fact N is true, then Fact Y is true;
R4: IF Fact Y is true, then Fact H is true;
R5: IF Fact B and Fact G are both true, then Fact M is true;
R6: IF Fact K and Fact F are both true, then Fact Y is true;
R7: IF Fact K is true, then Fact B is true.

Since none of the relationships indicate any new knowledge can be gained by knowing that
fact H is true, the system would stop with this knowledge. This process is illustrated in
Figure 4S.2.

Figure 4S.2. Hierarchy of logic—forward chaining.
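
Continuing the previous sketch (it reuses the RULES list and the forward_chain() function defined there), the hybrid behavior might be expressed as shown below; the simulated response about fact G is again an assumption made for illustration.

```python
# Hybrid (access-programming) variant, reusing RULES and forward_chain()
# from the previous sketch. Illustrative only, not code from the text.

facts = forward_chain({"K", "E"})  # pure forward pass: learns D and B, then stalls

# Access-programming step: the status of fact G is not known at this point,
# so query the user, search a database, or apply a default. Here the
# simulated user (or database) reports that fact G is true.
if "G" not in facts:
    facts.add("G")

facts = forward_chain(facts)  # second pass fires R5, then R1, R6, and R4
print(sorted(facts))          # now includes M, F, Y, and H
```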

Comparison of Reasoning Processes


In this example, the system "learned" the same ultimate fact (fact H is true) with backward
chaining and with forward chaining only when forward chaining was supplemented by access
programming. However, backward chaining and forward chaining with access programming
process the relationships in quite different orders. It is important to note
this for two reasons. First, the designer could find himself or herself with a dormant analysis
system unless information is sought in a particular manner. For example, suppose the last
example were done completely as a forward-chaining example (no access programming
interrupt). In this case, the system would quit processing after it learned that facts D and B
were true, and there would be no way to push it to do more. The system would not perform as
the designers had envisioned or as the decision makers need.
Second, we should be concerned about the way in which the system seeks information
from the user for the sake of sustaining the confidence of the decision maker (sometimes
referred to as "face" validity). Decision makers expect information to be sought in a
particular order. If there are vast deviations from such a logical order, then decision makers
may question the underlying logic of the system. If the logic can be defended, then such
questioning helps the decision maker to reason more effectively. On the other hand, if
decision makers cannot establish why such reasoning has occurred, they might choose to
drop the DSS.

UNCERTAINTY

Decisions are difficult to make because of uncertainty. Decision makers are uncertain about
how outside entities will change their environments and thus influence the success of their
choices. In addition, sometimes decision makers are uncertain about the reliability of the
information they use as the basis for their choices. Finally, decision makers are uncertain
about the validity of the relationships that they believe govern the choice situation.

Often decision makers also need to interact with "fuzzy logic." The term fuzzy logic
does not refer to a muddled thought process. Rather, it refers to a method of addressing data
and relationships that are inexact. Humans use fuzzy logic regularly whenever they do
not treat decisions as totally "black-and-white" choices. The gradations of gray provide
flexibility in approaching problems and force us to consider all possible options.
Consider, for example, whether a person is "tall." The term tall is a vague term that
means different things to different people. If in the choice process one selection procedure
required the machine to select only applicants who were tall, it would be difficult for the
DSS to do. Even in a sport such as basketball, where being tall really matters, the term
tall depends on the position one is playing. A particular individual might be tall if playing
guard but not if playing center because the requirements of the positions are so different.
Even if the discussion is limited to the position of guard, what is considered "tall enough"
is dependent upon other factors. In 1994 Muggsy Bogues, a basketball guard, was only 5 feet,
3 inches, which even I¹ do not consider tall. However, because he had fabulous technique,
he was considered tall enough to play that position.
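
One simple way to hint at how such gradations might be represented is a membership function that returns a degree of "tallness" between 0 and 1 rather than a yes/no answer. The sketch below is purely illustrative; the 72- and 80-inch breakpoints are invented and would themselves shift with context, such as the position being played.

```python
# A toy fuzzy membership function for "tall" (illustrative only).
# The 72- and 80-inch breakpoints are invented for this example.

def tallness(height_inches, not_tall=72.0, fully_tall=80.0):
    """Degree to which a height counts as tall: 0.0 = not at all, 1.0 = fully."""
    if height_inches <= not_tall:
        return 0.0
    if height_inches >= fully_tall:
        return 1.0
    return (height_inches - not_tall) / (fully_tall - not_tall)


print(tallness(64))  # 0.0 -- clearly not tall
print(tallness(76))  # 0.5 -- somewhat tall
print(tallness(82))  # 1.0 -- tall by any standard
```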
Similarly, when trying to select among employment opportunities, we might employ
fuzzy logic. There is not one opportunity that is "good" and another that is "bad." Generally,
they are all somewhat good on some dimensions and somewhat bad on other dimensions.
It is difficult for most people to define what dimensions are most important in a reliable
way, but they can tell which opportunities are better than others. This illustrates the historic
problem that humans could make better decisions than computers because they could
address uncertainty in their reasoning processes.
So, if DSS are to have "intelligence" that facilitates the choice processes, they must
also be able to address uncertainty from a variety of perspectives. There are two major
approaches by which uncertainty is addressed in intelligent systems: probability theory
and certainty factors. These will be introduced separately.
Design Insights
The Turing Test

In the "standard interpretation" of the Turing Test, player C, the interrogator, is tasked with
trying to determine which player, A or B, is a computer and which is a human. The interrogator
is limited to using only the responses to written questions in order to make the determination.
The Turing Test image is from Wikimedia Commons. The file is licensed under the Creative
Commons Attribution Share Alike 3.0 License.

¹That which is considered tall also depends upon how tall an individual is. Since I fall into a category
generally referred to as "short," I have a more liberal definition of tall than do other people.

Representing Uncertainty with Probability Theory


Probability theory, which is the foundation of most of the statistical techniques used in
business applications, is based upon the belief that the likelihood that something could
happen is essentially the ratio of the number of successes to the number of possible trials.
So, for example, if we flip a coin 100 times, we expect 50 of those times to show "heads" and
hence we estimate the probability of heads as being 0.5. Since few business situations are as
simple as flipping a coin, there are a variety of rules for combining probabilistic information
for complicated events. Furthermore, since we may update our estimates of probabilities
based upon seeing additional evidence, probabilists provide systematic methods for making
those changes in the estimates. This is referred to as Bayesian updating.
Consider the following example. Let us define three events, which we will call events
A, B, and C:

Event A: The act of being a good writer.
Event B: Receipt of an A in a writing course.
Event C: Receipt of an A in a systems analysis course.

Suppose:

P(A) = 0.5     P(A') = 0.5     P(A ∩ B) = 0.24     P(A ∩ B ∩ C) = 0.015
P(B) = 0.3     P(B') = 0.7     P(A ∩ C) = 0.06
P(C) = 0.1     P(C') = 0.9     P(B ∩ C) = 0.02

Without any new information, we believe the likelihood of being a good writer (event A) is
0.50. If, however, we know the person received an A in his or her writing class (event B),
we could update the probability the person is a good writer by applying Bayes' Rule:

P(A|B) = P(A ∩ B)/P(B) = 0.24/0.30 = 0.80

That is, given this new information, we now believe fairly strongly that the person is a good
writer.
If, instead, the probability of the intersection between events A and B (that is, the
probability that the person is both a good writer and received an A in a writing course)
were quite low, such as 0.01, the conditional probability P(A|B) would be reduced substantially
from the initial estimate, to a value of 0.033 (0.01/0.30). That means we can update an initial
estimate after we get new information by either increasing or decreasing our certainty in
the likelihood of an event depending upon the new information provided.
A more generalized form of the equation is

P(A|B) = P(A ∩ B)/[P(A ∩ B) + P(A' ∩ B)] = P(B|A)P(A)/[P(B|A)P(A) + P(B|A')P(A')]

Suppose we now have the information that the person also received an A in his or
her systems analysis class. Based upon our earlier information, we could now update
the probability further:

P(A|B ∩ C) = P(A ∩ B ∩ C)/P(B ∩ C)
           = P(B ∩ C|A)P(A)/[P(B ∩ C|A)P(A) + P(B ∩ C|A')P(A')]
           = 0.015/0.02
           = 0.75

Hence, given all the information available, we believe the likelihood that the person is a
good writer is 0.75.
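
The arithmetic in this example is easy to verify; the short sketch below simply recomputes the two updates from the probabilities given above.

```python
# Recomputing the Bayesian updates from the probabilities given in the example.

p_b = 0.30               # P(B): an A in the writing course
p_a_and_b = 0.24         # P(A and B)
p_b_and_c = 0.02         # P(B and C)
p_a_and_b_and_c = 0.015  # P(A and B and C)

# Update after learning of the A in the writing course.
p_a_given_b = p_a_and_b / p_b
print(round(p_a_given_b, 2))   # 0.8

# Update again after also learning of the A in the systems analysis course.
p_a_given_bc = p_a_and_b_and_c / p_b_and_c
print(round(p_a_given_bc, 2))  # 0.75
```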
Updating the rules using a Bayesian approach is similar to this process.

Design insights
AI: A Space Odyssey

HAL 9000 is a fictional computer in Arthur C. Clarke's 2001: A Space Odyssey. The computer
was a powerful representation of artificial intelligence; HAL was programmed to ensure the
success of the mission. It was capable of maintaining all systems on the voyage, of reasoning and
speech, facial recognition, and natural language processing, as well as lip reading, art appreciation,
interpreting and expressing emotions, and playing chess. So, when the astronauts David
Bowman and Frank Poole consider disconnecting HAL's cognitive circuits when he appears to be
mistaken in reporting the presence of a fault in the spacecraft's communications antenna, HAL
gets nervous. Faced with the prospect of disconnection, HAL decides to kill the astronauts in
order to protect and continue its programmed directives. Its chilling line "I'm sorry Dave, but this
mission is just too important for me to allow you to jeopardize it" made many nervous about the
future of artificial intelligence.
We are not at that point of the development of artificial intelligence yet. However, many
scientists believe that future advances could lead to problems. For example, medical systems can
already interact with patients to simulate empathy. Computer worms and viruses have learned to
vary their structure over time to avoid extermination. The concern is an "intelligence explosion"
in which smart machines would design even more intelligent machines that humans can neither
understand nor control. This is especially a concern if the tools reach the hands of criminals. At
a conference by the Association for the Advancement of Artificial Intelligence, scientists discussed
the issues, the trends, and how they could be controlled. There is as yet no agreement among the
researchers, and therefore no guidelines. But, it does give one pause for thought.

Representing Uncertainty with Certainty Factors


A popular alternative for addressing uncertainty is to use certainty factors. Instead of
measuring the likelihood as one function, we need to estimate a measure of "belief" separate
from a measure of "disbelief." New evidence could increase (decrease) our measure of
belief, increase (decrease) our measure of disbelief, or have some impact on our measure of
both belief and disbelief. Its effect is a function of whether the information is confirmatory,
disconfirmatory, or both confirmatory of one and disconfirmatory of the other. Consider the
example shown above. Suppose you believe the subject to be a good writer. You know
the person waived his or her writing course. This information would cause you to increase
your measure of belief that the person was a good writer but would have no impact on your
measure of disbelief. However, if you knew that the person received a C in the writing class
and almost everyone waived the writing class, this would have two effects. First, it would
increase your disbelief that the person was a good writer because he or she received a grade
of C in a class that most people waived. In addition, it would decrease your belief that the
person was a good writer. Through this separation of measures of belief and disbelief, it is
possible to present evidence (facts or rules) and measure their impact more directly.
Certainty factors have a range between -1 and 1 and are defined by the difference
between measures of belief and measures of disbelief as shown below:

CF[h, e] = MB[h, e] - MD[h, e]

where:
MB[h, e] = measure of increased belief in hypothesis h given evidence e
MD[h, e] = measure of increased disbelief in hypothesis h given evidence e

Increments associated with new evidence are made as follows:

MB[h, e] = 1 if P(h) = 1; otherwise
MB[h, e] = [max(P(h|e), P(h)) - P(h)] / [max(1, 0) - P(h)]

MD[h, e] = 1 if P(h) = 0; otherwise
MD[h, e] = [min(P(h|e), P(h)) - P(h)] / [min(1, 0) - P(h)]

If P(h|e) > P(h), then there is increased confidence in the hypothesis. However, the
paradox that results is

CF[h, e] + CF[h', e] ≠ 1

Hence, the confidence that a hypothesis is true given particular evidence and the confidence
that the hypothesis is false given that evidence do not sum to 1, as they would in probability
theory.
Incrementally acquired evidence is used to update the measures of belief and measures
of disbelief separately:

MB[h, e1&e2] = 0 if MD[h, e1&e2] = 1; otherwise
MB[h, e1&e2] = MB[h, e1] + MB[h, e2][1 - MB[h, e1]]

MD[h, e1&e2] = 0 if MB[h, e1&e2] = 1; otherwise
MD[h, e1&e2] = MD[h, e1] + MD[h, e2][1 - MD[h, e1]]

Furthermore, measures of belief of conjunctions of hypotheses are determined by taking the
minimum value of the measures of belief of the individual hypotheses, while measures of
disbelief of conjunctions are determined by taking the maximum value of the measures of
disbelief of the individual hypotheses. Further corrections are taken if there is uncertainty
regarding the certainty of a particular piece of information.
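
A minimal sketch of these calculations is shown below. The helper functions follow the formulas above; the prior and conditional probabilities at the end are invented solely to exercise the code, and the special case in which the opposing measure equals 1 is omitted from the combination step for brevity.

```python
# A minimal certainty-factor sketch following the formulas above.
# The probabilities used at the bottom are invented for illustration only.

def mb(p_h, p_h_given_e):
    """Measure of increased belief in hypothesis h given evidence e."""
    if p_h == 1:
        return 1.0
    return (max(p_h_given_e, p_h) - p_h) / (1 - p_h)


def md(p_h, p_h_given_e):
    """Measure of increased disbelief in hypothesis h given evidence e."""
    if p_h == 0:
        return 1.0
    return (p_h - min(p_h_given_e, p_h)) / p_h


def cf(mb_value, md_value):
    """Certainty factor: belief minus disbelief, ranging from -1 to 1."""
    return mb_value - md_value


def combine(m1, m2):
    """Combine a measure (MB or MD) obtained from two pieces of evidence."""
    return m1 + m2 * (1 - m1)


# Invented example: prior P(h) = 0.5; evidence e1 raises P(h|e1) to 0.8,
# and evidence e2 raises P(h|e2) to 0.7. Neither piece is disconfirming.
mb_total = combine(mb(0.5, 0.8), mb(0.5, 0.7))  # roughly 0.76
md_total = combine(md(0.5, 0.8), md(0.5, 0.7))  # 0.0
print(round(cf(mb_total, md_total), 2))         # 0.76
```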
