Midterm Revision
Agenda
1. AI Overview
2. Knowledge Engineering & Expert system
3. Machine Learning & Examples
1- AI Overview
What Is AI?
The birth year of AI is 1956
Artificial Intelligence (AI) is a new technical science that studies and
develops theories, methods, techniques,
and application systems for simulating and extending human intelligence.
In 1956, the concept of AI was first proposed by John McCarthy, who defined
the subject as:
"science and engineering of making intelligent machines, especially intelligent
computer programs."
[Figure: AI draws on brain science, cognitive science, computer science,
psychology, philosophy, linguistics, and logic (timeline: 1950, 1980, 2010).]
AI is concerned with making machines work in an intelligent way, similar to the
way that the human mind works.
Relationship of AI, Machine Learning and Deep Learning
• Representative of connectionism: neural networks and deep learning
Three Major Schools of Thought: Behaviorism
Basic thoughts:
Intelligence depends on perception and action.
The perception-action mode of intelligent behavior is proposed.
Intelligence requires no knowledge, representation, or inference.
AI can evolve like human intelligence. Intelligent behavior can only be
demonstrated in the real world through constant interaction with the
surrounding environment.
Representative of behaviorism:
behavior control, adaptation, and evolutionary computing
Types of AI
Strong AI
The strong AI view holds that it is possible to create intelligent
machines that can really reason and solve problems.
Such machines are considered to be conscious and self-aware, can
independently think about problems and work out optimal solutions
to problems, have their own system of values and world views, and
have all the same instincts as living things, such as survival and
security needs.
It can be regarded as a new civilization in a certain sense.
Weak AI
The weak AI view holds that intelligent machines cannot really reason and
solve problems. These machines only look intelligent but do not have real
intelligence or self-awareness (e.g., Siri).
Classification of Intelligent Robots
Logical Representation
Semantic Network Representation
Frame Representation
Production Rules
Production Rules
IF <antecedent>
THEN <consequent>
Production Rules(Cont.)
IF customer is teenager
AND ‘cash withdrawal’ > 1000
THEN ‘signature of the parent’ is required
Production Rules(Cont.)
Relation
IF the ‘fuel tank’ is empty
THEN the car is dead
Recommendation
IF the season is autumn
AND the sky is cloudy
AND the forecast is drizzle
THEN the advice is ‘take an umbrella’
Directive
IF the car is dead
AND the ‘fuel tank’ is empty
THEN the action is ‘refuel the car’
Production Rules(Cont.)
Strategy
IF the car is dead
THEN the action is ‘check the fuel tank’
step1 is complete
IF step1 is complete
AND the ‘fuel tank’ is full
THEN the action is ‘check the battery’
step2 is complete
Heuristic
IF the spill is liquid
AND the ‘spill pH’ < 6
AND the ‘spill smell’ is vinegar
THEN the ‘spill material’ is ‘acetic acid’
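The IF-THEN rules above can be run by a minimal forward-chaining engine: rules fire whenever all their antecedents are in the fact base. This is only a sketch; the rule encoding and fact strings are illustrative, not part of the original slides.

```python
# Minimal forward-chaining engine: a rule fires when every antecedent
# is already a known fact, adding its consequent as a new fact.
rules = [
    ({"fuel tank is empty"}, "car is dead"),
    ({"car is dead", "fuel tank is empty"}, "action: refuel the car"),
]

def forward_chain(facts, rules):
    facts = set(facts)
    changed = True
    while changed:                      # repeat until no rule adds a new fact
        changed = False
        for antecedents, consequent in rules:
            if antecedents <= facts and consequent not in facts:
                facts.add(consequent)   # rule fires
                changed = True
    return facts

derived = forward_chain({"fuel tank is empty"}, rules)
```

Starting from the single fact "fuel tank is empty", the relation rule derives "car is dead", which then lets the directive rule fire.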
Rule-based System
[Figure: rule complexity vs. problem scale. Rule-based algorithms with manual
rules suit simple problems and small rule sets; as rule complexity and problem
scale grow, machine learning algorithms take over.]
Thus the inference engine puts aside the rule it is working with
(the rule is said to stack) and sets up a new goal, a sub-goal,
to prove the IF part of this rule.
Then the knowledge base is searched again for rules that can
prove the sub-goal.
The inference engine repeats the process of stacking rules until no
rules are found in the knowledge base to prove the current sub-goal.
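The sub-goal stacking described above can be sketched as a recursive search: to prove a goal, find a rule whose THEN part matches it and try to prove its IF part. The rule set and goal strings below are illustrative assumptions.

```python
# Backward chaining: to prove a goal, stack a rule whose consequent
# matches the goal and recursively prove its antecedents as sub-goals.
rules = [
    ({"fuel tank is empty"}, "car is dead"),
    ({"car is dead"}, "action is check the fuel tank"),
]

def backward_chain(goal, facts, rules):
    if goal in facts:                       # goal is already a known fact
        return True
    for antecedents, consequent in rules:
        if consequent == goal:              # stack this rule ...
            if all(backward_chain(a, facts, rules) for a in antecedents):
                return True                 # ... every sub-goal proved
    return False                            # no rule proves the current goal

ok = backward_chain("action is check the fuel tank", {"fuel tank is empty"}, rules)
```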
Backward Chaining(Cont.)
Conflict Resolution
What are the different types of production rules?
Relation, Recommendation, Directive, Strategy, Heuristic
Examples(Cont.)
What are the knowledge representation techniques?
Logical Representation
Semantic Network Representation
Frame Representation
Production Rules
Knowledge
Examples(Cont.)
Knowledge representation technique that is defined as IF-THEN structure …
Production Rule
A computer program that emulates the decision making capability of a human expert …
Expert System
Forward chaining
3- Machine Learning Overview
Main Problems Solved by Machine Learning
Classification: A computer program needs to specify which of the k categories some input belongs to.
The output of classification is discrete category values
For example, the image classification algorithm in computer vision.
Regression: For this type of task, a computer program predicts the output for the given input.
An example of this task type is to predict the claim amount of an insured person (to set the insurance
premium) or predict the security price.
The output of regression is continuous numbers.
Clustering: A large amount of data from an unlabeled dataset is divided into multiple categories
according to internal similarity of the data.
Data in the same category is more similar than that in different categories.
This feature can be used in scenarios such as image retrieval and user profile management.
Machine Learning Classification
Supervised learning: For labeled samples, obtain an optimal model with the
required performance through training and learning based on samples of
known categories.
Unsupervised learning: For unlabeled samples, the learning algorithms
directly model the input datasets. Clustering is a common form of
unsupervised learning. We only need to put highly similar samples
together, calculate the similarity between new samples and existing ones,
and classify them by similarity.
Semi-supervised learning: In one task, a machine learning model
automatically uses a large amount of unlabeled data to assist the learning
of a small amount of labeled data.
Reinforcement learning: It is an area of machine learning concerned with
how agents ought to take actions in an environment to maximize some
notion of cumulative reward. The difference between reinforcement
learning and supervised learning is the teacher signal: the reinforcement
signal provided by the environment evaluates the quality of an action,
rather than telling the learning system how to produce correct actions.
Machine Learning Process
[Figure: the machine learning process, with feedback and iteration between
its stages.]
Importance of Data Preprocessing
Data is crucial to models. It is the ceiling of model capabilities. Without good data,
there is no good model.
Data preprocessing includes:
Data cleansing: fill in missing values, and detect and eliminate causes of
dataset exceptions.
Data normalization: normalize data to reduce noise and improve model
accuracy.
Data dimension reduction: simplify data attributes to avoid dimension
explosion.
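The cleansing and normalization steps can be sketched in a few lines. The numbers are made up, and min-max normalization is just one choice of normalization scheme used for illustration.

```python
# Data cleansing: fill missing values (None) with the column mean,
# then min-max normalize the column into the range [0, 1].
def fill_missing(values):
    known = [v for v in values if v is not None]
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in values]

def min_max_normalize(values):
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

col = [10.0, None, 30.0, 50.0]
cleaned = fill_missing(col)            # None replaced by the mean (30.0)
normalized = min_max_normalize(cleaned)
```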
Dirty Data
Example columns of a dirty-data table: # Id, Name, Birthday, Gender,
Is Teacher, #Students, Country, City.
Dimension reduction also helps improve model generalization and avoid
overfitting.
Feature Selection Methods - Filter
Filter methods are independent of the model during feature selection.
Limitations
• Filter methods tend to select redundant variables, because relationships
between variables are not considered.
Feature Selection Methods - Wrapper
Wrapper methods use a prediction model to score feature subsets.
Common methods
• Recursive feature elimination (RFE)
Limitations
• Wrapper methods train a new model for each subset, resulting in a huge
number of computations.
• A feature set with the best performance is usually provided for a specific
type of model.
[Figure: procedure of a wrapper method.]
Feature Selection Methods - Embedded
Embedded methods consider feature selection as a part of model
construction.
The most common type of embedded feature selection method is the
regularization method.
Regularization methods are also called penalization methods: they introduce
additional constraints into the optimization of a predictive algorithm that
bias the model toward lower complexity and reduce the number of features.
Procedure of an embedded method: traverse all features → generate a feature
subset → train models and evaluate the effect → select the optimal feature
subset.
Common methods
Model Validity
Generalization capability: The goal of machine learning is that the
model obtained after learning should perform well on new samples,
not just on samples used for training. The capability of applying a
model to new samples is called generalization or robustness.
Underfitting: occurs when the model or the algorithm does not fit
the data well enough.
Model Validity(Cont.)
[Figure: training error and testing error plotted against model complexity.]
Machine Learning Performance Evaluation - Regression
Machine Learning Performance Evaluation - Classification
Confusion matrix: rows are the actual classes (yes / no) and columns the
estimated classes (yes / no), with row and column totals.
Machine Learning Performance Evaluation - Classification
(Given)
Measurement ratios, such as precision = TP / (TP + FP), are computed from
the confusion matrix.
Parameters and Hyperparameters in Models
Grid search
Random search
Heuristic intelligent search
Bayesian search
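Grid search, the first method in the list above, can be sketched as exhaustive evaluation over every combination in a hyperparameter grid. The score function below is a toy stand-in for real validation accuracy, and the hyperparameter names are illustrative.

```python
from itertools import product

# Toy stand-in for a validation score; a real search would train a
# model for each hyperparameter combination and measure accuracy.
def score(learning_rate, depth):
    return -abs(learning_rate - 0.1) - abs(depth - 3)

grid = {"learning_rate": [0.01, 0.1, 1.0], "depth": [1, 3, 5]}

best_params, best_score = None, float("-inf")
for lr, d in product(grid["learning_rate"], grid["depth"]):
    s = score(lr, d)                    # evaluate one grid point
    if s > best_score:
        best_params, best_score = {"learning_rate": lr, "depth": d}, s
```

Random search would instead sample a fixed number of points from the same grid, trading exhaustiveness for speed.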
Linear Regression and Overfitting Prevention
Logistic Regression Extension - Softmax Function
[Figure: logistic regression answers a binary question (male? / female?),
while softmax extends it to multiple classes (apple? / orange? / banana?).]
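The softmax function turns a vector of class scores into probabilities that sum to 1, which is how logistic regression generalizes to multiple classes. A minimal, numerically stable sketch:

```python
import math

def softmax(scores):
    # Subtract the max score for numerical stability; this does not
    # change the result because softmax is shift-invariant.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# e.g. raw scores for three classes such as apple / orange / banana
probs = softmax([2.0, 1.0, 0.1])
```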
Decision Tree Structure
[Figure: a decision tree consists of one root node, internal nodes, and leaf
nodes; each internal node tests an attribute and each leaf node holds a
decision.]
It is an area of machine learning concerned with how agents ought to take actions in an
environment to maximize some notion of cumulative reward …
Reinforcement learning
Examples(Cont.)
A science that studies how computers simulate or implement human learning
behavior to acquire new knowledge …
Machine Learning
Semi-supervised learning
Example of Machine Learning Performance Evaluation
             Estimated yes | Estimated no | Total
Actual yes        140      |      30      |  170
Actual no          20      |      10      |   30
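Reading the example matrix above with rows as actual classes and columns as estimated classes gives TP = 140, FN = 30, FP = 20, TN = 10, from which the usual ratios follow:

```python
# Counts taken from the example confusion matrix above.
TP, FN, FP, TN = 140, 30, 20, 10

accuracy  = (TP + TN) / (TP + TN + FP + FN)   # 150 / 200
precision = TP / (TP + FP)                    # 140 / 160
recall    = TP / (TP + FN)                    # 140 / 170
```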
x    | y (actual) | Y (prediction) | y − Y | (y − Y)²
3    |     1      |       4        |  -3   |    9
21   |    10      |      22        | -12   |  144
22   |    14      |      23        |  -9   |   81
34   |    34      |      35        |  -1   |    1
54   |    44      |      55        | -11   |  121
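The squared-error column in the table above can be checked, and the mean squared error computed, in a few lines:

```python
# Actual values and predictions from the regression example table.
y_true = [1, 10, 14, 34, 44]
y_pred = [4, 22, 23, 35, 55]

squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
mse = sum(squared_errors) / len(squared_errors)   # mean squared error
```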
Overall Entropy
Decision column consists of 14 instances and includes two labels: yes and no.
There are 9 decisions labeled yes, and 5 decisions labeled no.
Entropy(Decision) = − p(Yes)·log₂ p(Yes) − p(No)·log₂ p(No)
Entropy(Decision) = − (9/14)·log₂(9/14) − (5/14)·log₂(5/14) = 0.940
Gain(S, A) = Entropy(S) − Σ [ p(S|A)·Entropy(S|A) ]
ID3 Algorithm (cont.)
Gain(S, Gender) = Total Entropy − (6/14 × Entropy(Female)) − (8/14 × Entropy(Male))
= 0.048
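The entropy value of 0.940 above can be reproduced directly. The per-gender yes/no counts are not given in the slides, so information gain is shown only as a general function over (weight, subset-entropy) pairs rather than recomputing the 0.048 figure.

```python
import math

def entropy(pos, neg):
    # Binary entropy of a split with `pos` yes-labels and `neg` no-labels.
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        if count:                        # treat 0 * log2(0) as 0
            p = count / total
            result -= p * math.log2(p)
    return result

overall = entropy(9, 5)                  # 14 decisions: 9 yes, 5 no

def gain(total_entropy, splits):
    # splits: (weight, subset_entropy) pairs, weights summing to 1.
    return total_entropy - sum(w * e for w, e in splits)
```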