You are on page 1of 79

Knowledge Acquisition and

Validation

1
Knowledge
g Acquisition
q and
Validation
Knowledge Engineering

2
Knowledge Engineering
„ Art of bringing the principles and tools of AI research
to bear on difficult applications problems requiring
experts' knowledge for their solutions
„ Technical issues of acquiring, representing and using
knowledge appropriately to construct and explain
lines-of-reasoning
„ Art of building complex computer programs that
represent and reason with knowledge of the world

3
„ Narrow perspective: knowledge engineering
deals with knowledge acquisition,
acquisition
representation, validation, inferencing,
explanation and maintenance

„ Wide perspective: KE describes the entire


process of developing and maintaining AI
systems
y

„ We use the Narrow Definition


– Involves the cooperation of human experts
– Synergistic effect

4
Knowledge
g Engineering
g g
Process Activities
„ Knowledge Acquisition
„ Knowledge Validation
„ Knowledge Representation
„ I f
Inferencing
i
„ Explanation and Justification

5
Knowledge
g Engineering
g g Process
(Figure 11.1)

Knowledge Sources of knowledge


validation (experts, others)
(test cases)
Knowledge
Acquisition
Encoding
Knowledge Knowledge
base Representation

Explanation
j tifi ti
justification

Inferencing

6
Scope of Knowledge

„ Knowledge acquisition is the extraction of


knowledge from sources of expertise and its
transfer to the knowledge base and sometimes
to the inference engine

„ Knowledge is a collection of specialized facts,


procedures
d andd jjudgment
d t rules
l

7
Knowledge Sources

„ Documented (books, manuals, etc.)


„ Undocumented (in people
people'ss minds)
– From people, from machines
„ K
Knowledge
l d Acquisition
A i iti from
f Databases
D t b
„ Knowledge Acquisition Via the Internet

8
Knowledge Levels
„ Shallow knowledge (surface)
„ D
Deep kno ledge
knowledge

„ Can implement
p a computerized
p representation
p that is deeper
p
than shallow knowledge
„ Special knowledge representation methods (semantic
networks and frames) to allow the implementation of
deeper-level reasoning (abstraction and analogy): important
expert activity
„ R
Represent t objects
bj t andd processes off th
the d
domain
i off expertise
ti
at this level
„ Relationships among objects are important

9
Major
j Categories
g of
Knowledge
„ Declarative Knowledge

„ Procedural Knowledge

„ Metaknowledge

10
Declarative Knowledge
Descriptive Representation of Knowledge

„ Expressed in a factual statement

„ Sh ll
Shallow

„ Important in the initial stage of


knowledge
g acquisition
q
11
Procedural Knowledge
„ Considers the manner in which things work under
different sets of circumstances
– Includes step-by-step sequences and how-to
types of instructions
– May also include explanations
– Involves automatic response to stimuli
– May tell how to use declarative knowledge and
h tto make
how k iinferences
f

12
„ Descriptive
D i ti knowledge
k l d relates
l t tot a
specific object. Includes information
about
b t ththe meaning,
i roles,
l environment,
i t
resources, activities, associations and
outcomes
t off the
th object
bj t

„ Procedural knowledge relates to the


y in the problem-
procedures employed
solving process

13
Metaknowledge

Knowledge about Knowledge

In ES, Metaknowledge refers to knowledge


about the operation of knowledge
knowledge-based
based
systems
I reasoning
Its i capabilities
bili i

14
Knowledge
g Acquisition
q
Difficulties
Problems in Transferring Knowledge

„ Expressing Knowledge
„ T
Transfer
f to a M
Machine
hi
„ Number of Participants
„ Structuring Knowledge

15
Other Reasons
„ Experts may lack time or not cooperate
„ Testing and refining knowledge is complicated
„ P l defined
Poorly d fi d methods
th d ffor kknowledge
l d elicitation
li it ti
„ System builders may collect knowledge from one source, but the
relevant knowledge may be scattered across several sources
„ May collect documented knowledge rather than use experts
„ The knowledge collected may be incomplete
„ Difficult to recognize specific knowledge when mixed with irrelevant
data
„ Experts may change their behavior when observed and/or
interviewed
„ Problematic interpersonal communication between the knowledge
engineer and the expert

16
Overcoming the Difficulties
„ Knowledge acquisition tools with ways to decrease the
representation mismatch between the human expert
and the program (“learning by being told”)
„ Simplified rule syntax
„ Natural language processor to translate knowledge to a
specific representation
„ Impacted by the role of the three major participants
– Knowledge Engineer
– Expert
– End user

17
„ Critical
C iti l
– The ability and personality of the knowledge
engineer
– Must develop a positive relationship with the
expert
– The knowledge engineer must create the
right impression
„ Computer-aided knowledge acquisition tools
„ Extensive integration of the acquisition efforts

18
Required Knowledge Engineer
Skills
„ Computer skills
„ Tolerance and ambivalence
„ Effective communication abilities
„ Broad educational background
„ Advanced, socially sophisticated verbal skills
„ Fast-learning capabilities (of different domains)
„ Must understand organizations and individuals
„ Wide experience in knowledge engineering
„ Intelligence
„ Empathy and patience
„ Persistence
„ Logical thinking
„ Versatility and inventiveness
„ Self-confidence
19
Knowledge
g Acquisition
q
Methods: An Overview
„ Manual

„ Semiautomatic

„ Automatic (Computer Aided)

20
Manual Methods - Structured
Around Interviews

„ Process ((Figure
g 11.4))
„ Interviewing
„ Tracking the Reasoning Process
„ Observing
„ Manual methods: slow, expensive and
sometimes inaccurate
21
Manual Methods of
Knowledge Acquisition

Experts
Coding
Knowledge Knowledge
engineer base

Documented
knowledge
g

22
Semiautomatic Methods

„ Support Experts Directly (Figure 11.5)


11 5)

„ H l Knowledge
Help K l d Engineers
E i

23
Expert-Driven
p
Knowledge Acquisition

Computer-aided Coding
Expert Knowledge
((interactive))
base
interviewing

Knowledge
engineer

24
Automatic Methods

„ Expert’s and/or the knowledge engineer’s


roles are minimized ((or eliminated))

„ Induction Method (Figure 11.6)


11 6)

25
Induction-Driven
Knowledge Acquisition

Case histories Induction Knowledge


and examples system base

26
Knowledge Modeling

„ The knowledge model views knowledge


acquisition as the construction of a model of
problem-solving behavior-- a model in
terms of knowledge instead of
representations

„ Can reuse models across applications

27
Interviews

„ Most Common Knowledge Acquisition:


Face-to-face interviews

„ Interview Types
– Unstructured (informal)
– Semi-structured
S i d
– Structured
28
Unstructured Interviews

„ Most Common Variations


– Talkthrough
– Teachthrough
– Readthrough
R dh h

29
„ The knowledge engineer slowly learns
about the problem
p
„ Then can build a representation of the
knowledge
g

„ Knowledge acquisition involves


– Uncovering important problem
attributes
tt ib t
– Making explicit the expert’s thought
process
30
Unstructured Interviews
„ Seldom provides complete or well-organized descriptions of
cognitive
g p
processes because
– The domains are generally complex
– The experts usually find it very difficult to express some
more important
i t t knowledge
k l d
– Domain experts may interpret the lack of structure as
requiring little preparation
– Data acquired are often unrelated, exist at varying levels
of complexity, and are difficult for the knowledge
engineer to review, interpret and integrate
– Few knowledge engineers can conduct an efficient
unstructured interview

31
St t d IInterviews
Structured t i

„ Systematic goal-oriented process


„ Forces an organized
g communication between
the knowledge engineer and the expert
„ Procedural Issues in Structuring an Interview

„ Interpersonal
p communication and analytical
y
skills are important

32
Interviews - Summary

„ Are important techniques


„ Must be planned carefully
„ Results must be verified and validated
„ A sometimes
Are i replaced
l d by b tracking
ki
methods
„ Can supplement tracking or other
knowledge acquisition methods
33
Recommendation
Before a knowledge engineer interviews the expert(s)
1. Interview a less knowledgeable (minor) expert
– Helps the knowledge engineer
• Learn about the problem
• Learn its significance
• Learn about the expert(s)
• Learn who the users will be
• Understand the basic terminology
• Identify readable sources
2. Next read about the problem
3. Then, interview the expert(s) (much more effectively)

34
Tracking Methods

„ Techniques that attempt to track the


reasoningg process
p of an expert
p
„ From cognitive psychology
„ Most common formal method:
Protocol Analysis

35
Protocol Analysis

„ Protocol: a record or documentation of


the expert's
p step-by-step
p y p information
processing and decision-making behavior
„ The expert performs a real task and
verbalizes his/her thought process (think
aloud)

36
Observations and Other
Manual Methods

„ Observations

„ Ob
Observe the
h EExpert W
Work
k

37
Other Manual Methods
„ Case analysis
„ Critical incident analysis
„ Discussions with the users
„ Commentaries
„ Conceptual graphs and models
„ Brainstorming
„ Prototyping
„ M l idi
Multidimensional
i l scaling
li
„ Johnson's hierarchical clustering
„ Performance review
38
Expert driven Methods
Expert-driven
„ Knowledge Engineers Typically
– Lack Knowledge About the Domain
– Are Expensive
– May Have Problems Communicating With
Experts
„ Knowledge Acquisition May be Slow, Expensive
and Unreliable
„ Can Experts Be Their Own Knowledge Engineers?

39
Approaches
pp to
Expert-Driven Systems

„ Manual

„ C
Computer-Aided
Aid d (S
(Semiautomatic)
i i )

40
Manual Method:
Expert's Self-reports
Problems with Experts’ Reports and Questionnaires
1. Requires the expert to act as knowledge engineer
2. Reports are biased
3. Experts often describe new and untested ideas and
strategies
4. Experts lose interest rapidly
5. Experts must be proficient in flowcharting
6 Experts
6. E may forget
f certain
i knowledge
k l d
7. Experts are likely to be vague

41
Benefits

„ May provide useful preliminary


knowledgeg discovery
y and acquisition
q
„ Computer support can eliminate some
limitations

42
Computer aided Approaches
Computer-aided
„ To reduce or eliminate the potential problems
– REFINER+ - case-based system
– TIGON - to detect and diagnose faults in a gas
turbine engine
„ Oth
Other
– Visual modeling techniques
– New machine learning methods to induce decision
trees and rules
– Tools based on repertory
p y ggrid analysis
y
43
Repertory
p y Grid Analysis
y
(RGA)
„ Techniques, derived from psychology
„ Use the classification interview
„ Fairly structured
„ P i
Primary Method:
M h d
Repertory Grid Analysis (RGA)

44
The Grid
„ Based on Kelly's model of human thinking: Personal
Construct Theory (PCT)
„ Each person is a "personal scientist" seeking to predict
and control events by
– Forming Theories
– Testing Hypotheses
– Analyzing Results of Experiments
„ Knowledge and perceptions about the world (a domain or
problem) are classified and categorized by each individual
as a personal, perceptual model
„ Each individual anticipates and then acts

45
How RGA Works
1. The expert
p identifies the important
p objects
j in the domain
of expertise (interview)
2. The expert identifies the important attributes
3. For each attribute, the expert is asked to establish a
bipolar scale with distinguishable characteristics
(traits)
(t a ts) aand
d ttheir
e oppos
opposites
tes
4. The interviewer picks any three of the objects and asks:
What attributes and traits distinguish any two of these
objects
bj t from
f the
th thi
third?
d? T
Translate
l t answers on a scalel off
1-3 (or 1-5)

46
R G A In p u t fo r S e le c t i n g a C o m p u t e r L a n g u a g e

At t r i b u t e s Tr a i t Op p o s ite

A a i lla b i li t y
Av Wi d e ly
l a v a i la
l b le
l N o t a v a i lla b le
l

Ease of H igh Lo w
p ro g ra m m in g

Tra in in g tim e Lo w High

Orie n ta tio n S y m b o li c N u m e ric

47
„ Step 4 continues for several triplets of
objects
„ A
Answers recorded
d d iin a Grid
G id
„ Expert may change the ratings inside box
„ Can use the grid for recommendations

48
Example of a Grid

Ease of
Attribute Orientation Program-
g Training
g Availa-
ming Time bility

Trait Symbolic (3) High (3) High (1) High (3)


Opposite Numeric (1) Low (1) Low (3) Low (1)

LISP 3 3 1 1

PROLOG 3 2 2 1

C ++ 3 2 2 3

COBOL 1 2 1 3

49
RGA in Expert Systems - Tools

„ AQUINAS
– Including the Expertise Transfer
System (ETS)

„ KRITON

50
Other RGA Tools

„ PCGRID (PC-based)

„ WebGrid

„ Circumgrids

51
Knowledge Engineer Support

„ Knowledge Acquisition Aids


„ Special Languages
„ Editors and Interfaces
„ E l
Explanation
i Facility
F ili
„ Revision of the Knowledge Base
„ Pictorial Knowledge Acquisition (PIKA)

52
„ Integrated
g Knowledge
g Acquisition
q Aids
– PROTÉGÉ-II
– KSM
– ACQUIRE
– KADS (Knowledge Acquisition and
Documentation System)
„ Front-end Tools
– Knowledge
ow edge Analysis
a ys s Tool
oo (KAT)
( )
– NEXTRA (in Nexpert Object)

53
Machine Learning: Rule Induction,
C
Case-based
b dR Reasoning,
i N
Neurall
Computing, and Intelligent Agents
„ Manual and semiautomatic elicitation methods: slow and expensive
„ Other Deficiencies
– Frequently weak correlation between verbal reports and mental
behavior
– Sometimes experts cannot describe their decision making process
– System
S quality
li depends
d d too much h on the
h quality
li off the
h expert and
d the
h
knowledge engineer
– The expert does not understand ES technology
– The knowledge
kno ledge engineer mayma not understand
nderstand the business
b siness problem
– Can be difficult to validate acquired knowledge

54
Computer-aided Knowledge
Acquisition or Automated
Acquisition,
Knowledge
g Acquisition
q
Objectives
„ IIncrease th
the productivity
d ti it off kknowledge
l d engineering
i i
„ Reduce the required knowledge engineer’s skill level
„ Eliminate (mostly) the need for an expert
„ Eliminate (mostly) the need for a knowledge engineer
„ Increase the q
quality
y of the acquired
q knowledge
g

55
Automated Knowledge
A
Acquisition
i iti (Machine
(M hi
Learning)
„ Rule Induction
„ Case-based Reasoning
„ Neural Computing
„ Intelligent Agents

56
Machine Learning

„ Knowledge Discovery and Data Mining


„ Include Methods for Readingg Documents and
Inducing Knowledge (Rules)
„ Other Knowledge Sources (Databases)
„ Tools
– KATE-Induction
– CN-2

57
Automated Rule Induction

„ Induction: Process of Reasoning from


Specific
p to General
„ In ES: Rules Generated by a Computer
Program from Cases
„ Interactive Induction

58
TABLE 13.6 Case for Induction - A Knowledge Map

(Induction Table)

Attributes

Annual
Applicant Income ($) Assets ($) Age Dependents Decision

Mr. White 50,000 100,000 30 3 Yes

Ms. Green 70,000 None 35 1 Yes

Mr. Smith 40,000 None 33 2 No

Ms. Rich 30,000 250,000 42 0 Yes

59
Case based Reasoning (CBR)
Case-based

„ For Building ES by Accessing Problem-


solvingg Experiences
p for Inferring
g
Solutions for Solving Future Problems

„ Cases and Resolutions Constitute a


Knowledge Base

60
Neural Computing

„ Fairly Narrow Domains with Pattern


Recognition
g

„ Requires a Large Volume of Historical


Cases

61
Intelligent
g Agents
g for
Knowledge Acquisition
Led to

„ KQML (Knowledge Query and Manipulation


Language) for Knowledge Sharing

„ KIF,, Knowledge
g Interchange
g Format ((Amongg
Disparate Programs)

62
Selecting an Appropriate
Knowledge Acquisition Method
„ Ideal Knowledge Acquisition System Objectives
– Direct interaction with the expert without a knowledge
engineer
– Applicability to virtually unlimited problem domains
– Tutorial capabilities
p
– Ability to analyze work in progress to detect inconsistencies
and gaps in knowledge
– Ability to incorporate multiple knowledge sources
– A user friendly interface
– Easy interface with different expert system tools
„ H b id A
Hybrid Acquisition
i i i - Another
A h A Approach h
63
Knowledge
g Acquisition
q
from Multiple Experts
„ Major Purposes of Using Multiple Experts
– Better understand the knowledge domain
– Improve
I k
knowledge
l d b base validity,
lidit consistency,
it
completeness, accuracy and relevancy
– Provide better productivity
– Identify incorrect results more easily
– Address broader domains
– To
T handle
h dl more complex l problems
bl andd combine
bi th
the
strengths of different reasoning approaches
„ Benefits And Problems With Multiple Experts

64
Handling Multiple Expertise
„ Blend
Bl d severall lines
li off reasoning
i through
th h
consensus methods
„ Use an analytical approach (group probability)
„ Select one of several distinct lines of reasoning
„ A t
Automate t th
the process
„ Decompose the knowledge acquired into
specialized knowledge sources

65
Validation and Verification of
the Knowledge Base

„ Quality
Q lit Control
C t l
– Evaluation
– Validation
– Verification

66
„ Evaluation
– Assess an expert system's overall value
– Analyze whether the system would be usable,
efficient and cost-effective
„ Validation
– Deals with the pperformance
f of the system
y ((compared
p
to the expert's)
– Was the “right” system built (acceptable level of
accuracy?)?)
„ Verification
– Was the system built "right"?
right ?
– Was the system correctly implemented to
specifications?

67
Dynamic Activities

„ Repeated each prototype update


„ For the Knowledgeg Base
– Must have the right knowledge base
– Must be constructed properly (verification)
„ Activities and Concepts In Performing These
Quality Control Tasks

68
To Validate an ES
„ Test
1. The extent to which the system and
th expertt d
the decisions
ii agree
2. The inputs and processes used by an
expert compared to the machine
3. The difference between expert
p and
novice decisions
(Sturman and Milkovich [1995])

69
Analyzing, Coding,
Documenting and
Documenting,
Diagramming
g g
Method of Acquisition and Representation
1. Transcription
2. Phrase Indexingg
3. Knowledge Coding
4 Documentation
4.
(Wolfram et al. [1987])

70
Knowledge Diagramming
„ Graphical, hierarchical, top-down description of the knowledge that
describes facts and reasoningg strategies
g in ES
„ Types
– Objects
– Events
– Performance
– Metaknowledge
„ Describes the linkages
g and interactions among g knowledgeg types
yp
„ Supports the analysis and planning of subsequent acquisitions
„ Called conceptual graphs (CG)
„ Useful in analyzing
y g acquired
q knowledgeg

71
Numeric and Documented
Knowledge Acquisition
„ Acquisition of Numeric Knowledge
– Special approach needed to capture numeric knowledge
„ A
Acquisition
i iti off Documented
D t d Knowledge
K l d
– Major Advantage: No Expert
– To Handle a Large
g or Complex
p Amount of Information
– New Field: New Methods That Interpret Meaning to
Determine
• Rules
• Other Knowledge Forms (Frames for Case-Based Reasoning)

72
Knowledge
g Acquisition
q and the
Internet/Intranet
„ Hypermedia
H di (Web)
(W b) tto R
Representt E
Expertise
ti
Naturally

„ Natural Links can be Created in the Knowledge

„ CONCORDE: Hypertext-based Knowledge


Acquisition System
Hypertext links are created as knowledge objects are
q
acquired
73
The Internet/Intranet for
Knowledge Acquisition
„ Electronic Interviewing
„ Experts can Validate and Maintain Knowledge Bases
„ Documented Knowledge can be accessed
„ The Problem: Identifying relevant knowledge (intelligent
agents)
„ Many Web Search Engines have intelligent agents
„ Data Fusion Agent for multiple Web searches and organizing
„ Automated Collaborative Filtering (ACF) statistically matches
peoples’ evaluations of a set of objects
peoples

74
Also

„ WebGrid: Web-based Knowledge


Elicitation Approaches
pp

„ Plus Information Structuring in


Distributed Hypermedia Systems

75
Induction Table Example

„ Induction tables (knowledge maps) focus


the knowledge
g acquisition
q p
process

„ Choosing a hospital clinic facility site

76
Induction Table (Knowledge Map) Example

Population Density Number of Average Near Public Decision


Density over How Near (within 2 Family Transportation? (Choices)
Many Sq. miles) Income
mi Competitors
People / Numeric, 0, 1, 2, 3, ... Numeric, Yes, No Yes, No
Square Mile Region Size $ / Year
>= 2000 >=4 0 Yes

>=3500 >=4 1 Yes

>=2 No

<30,000 No

77
„ Row 1: Factors
„ Row 2: Valid Factor Values and Choices
(last column)

„ Table
T bl lleads
d tto th
the prototype
t t ES
„ Each row becomes a potential rule
„ Induction tables can be used to encode
chains of knowledge

78
Class Exercise: Animals
„ Knowledge Acquisition
„ Create Induction Table
– I am thinking of an animal!
– Question: Does it have a long neck? If yes, THEN Guess that it
is a giraffe.
– IF not a giraffe, then ask for a question to distinguish between
the two. Is it YES or NO for a giraffe? Fill in the new Factor,
Values and Rule.
– IF no, THEN What is the animal? and fill in the new rule.
– Continue with all questions
– You will build a table very quickly

79

You might also like