
Artificial Intelligence and

Machine Learning (CSET301)


Labs
Labs will be in Google Colab.
For projects, you will get access to the DGX on a request basis.
Artificial Intelligence is not a new term …
Obvious Questions
What is AI?
• Programs that behave externally like humans?
• Programs that operate internally as humans do?
• Computational systems that behave intelligently?
Turing Test (Alan Turing, 1950)
• Human beings are intelligent
• To be called intelligent, a machine must produce responses that are
  indistinguishable from those of a human
Machine Learning
• "Learning is any process by which a system improves performance from
  experience." (Herbert Simon)
• Machine learning is the science of getting computers to act (learn)
  without being explicitly programmed, from a given set of data, to
  achieve a desirable outcome
  – a machine that learns on its own
• Machine Learning (Tom Mitchell, 1998) is the study of algorithms that
  • improve their performance P
  • at some task T
  • with experience E.
• A well-defined learning task is given by <P, T, E>.
Why Machine Learning?
• Learning is used when:
  – Human expertise does not exist (navigating on Mars)
  – Humans are unable to explain their expertise (speech recognition)
  – The solution changes over time (routing on a computer network)
  – The solution needs to be adapted to particular cases (user biometrics)
• Develop systems that can automatically adapt and customize themselves
  to individual users.
• Discover new knowledge from large databases.

Defining the Learning Task
Improve on task T, with respect to performance metric P, based on
experience E.

T: Playing checkers
P: Percentage of games won against an arbitrary opponent
E: Playing practice games against itself

T: Recognizing hand-written words
P: Percentage of words correctly classified
E: Database of human-labeled images of handwritten words

T: Driving on four-lane highways using vision sensors
P: Average distance traveled before a human-judged error
E: A sequence of images and steering commands recorded while observing
   a human driver

T: Categorizing email messages as spam or legitimate
P: Percentage of email messages correctly classified
E: Database of emails, some with human-given labels
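The spam triple above can be turned into running code. A minimal sketch (all emails and labels invented here) that learns word counts from experience E, performs task T, and reports metric P:

```python
# Hedged sketch of the spam example as a <T, P, E> triple, with a toy
# word-frequency classifier (all data invented for illustration).
from collections import Counter

# E: experience -- a small, made-up database of labeled emails
emails = [("win money now", "spam"), ("cheap pills offer", "spam"),
          ("meeting at noon", "ham"), ("lunch tomorrow", "ham"),
          ("win a free offer", "spam"), ("project meeting notes", "ham")]

# "Learn" from E: count how often each word appears in each class
counts = {"spam": Counter(), "ham": Counter()}
for text, label in emails:
    counts[label].update(text.split())

# T: the task -- label a message spam or ham by which class's words it matches more
def classify(text):
    score = {c: sum(counts[c][w] for w in text.split()) for c in counts}
    return max(score, key=score.get)

# P: the performance metric -- fraction of messages classified correctly
correct = sum(classify(t) == y for t, y in emails)
print(correct / len(emails))
```

A real E would be a large labeled corpus, and a real learner would weight words probabilistically rather than by raw counts.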
Traditional Programming

Data + Program (features) → Output

Machine Learning

Data + Output → Program (features)
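The contrast between the two arrows can be sketched in a few lines (toy rule and data invented for illustration): in the traditional case we hand-write the program; in the learning case the "program" is derived from data plus outputs:

```python
# Hedged sketch of traditional programming vs machine learning
# (the even/odd rule and the data are invented for illustration).

# Traditional programming: we write the program (the rule) by hand.
def is_even_traditional(n):
    return n % 2 == 0          # Data + hand-written Program -> Output

# Machine learning: we give data plus outputs and let the machine find the rule.
data = [(0, True), (1, False), (2, True), (3, False), (4, True), (5, False)]

# A (very) simple learner: memorize outputs keyed by the feature n % 2
learned = {}
for n, out in data:
    learned[n % 2] = out       # Data + Output -> "Program" (a lookup table)

def is_even_learned(n):
    return learned[n % 2]

print(is_even_learned(10), is_even_traditional(10))
```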
A classic example of a task that requires machine learning:
it is very hard to write down rules that say what makes a handwritten
digit a "2".
• Statistics quantifies numbers
• Data Mining explains patterns
• Machine Learning predicts with models
• Artificial Intelligence behaves and reasons

1956
Birth of AI, early successes

Checkers (1952): Samuel's program learned weights and played at a
strong amateur level.

Problem solving (1955): Newell & Simon's Logic Theorist proved theorems
in Principia Mathematica using search + heuristics; later, the General
Problem Solver (GPS).
Overwhelming optimism...

"Machines will be capable, within twenty years, of doing any work a man
can do." (Herbert Simon)

"Within 10 years the problems of artificial intelligence will be
substantially solved." (Marvin Minsky)

"I visualize a time when we will be to robots what dogs are to humans,
and I'm rooting for the machines." (Claude Shannon)
...underwhelming results

Example: machine translation

"The spirit is willing but the flesh is weak."
(translated to Russian and back)
"The vodka is good but the meat is rotten."

1966: the ALPAC report cut off government funding for MT.
Implications of the early era

Problems:
• Limited computation: search spaces grew exponentially, outpacing
  hardware (100! ≈ 10^157 > 10^80)
• Limited information: complexity of AI problems (number of words,
  objects, concepts in the world)

Contributions:
• Lisp, garbage collection, time-sharing (John McCarthy)
• Key paradigm: separate modeling and inference
Knowledge-based systems (70-80s)

Expert systems: elicit specific domain knowledge from experts in the
form of rules:

    if [premises] then [conclusion]
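A minimal sketch of such an "if [premises] then [conclusion]" rule engine (rules and facts invented for illustration; real expert systems like MYCIN used hundreds of rules and certainty factors):

```python
# Hedged sketch of a forward-chaining rule engine
# (rules and facts are invented for illustration).
rules = [
    ({"fever", "infection"}, "prescribe_antibiotics"),
    ({"cough", "fever"}, "suspect_flu"),
]

def forward_chain(facts, rules):
    """Fire every rule whose premises are all contained in the known facts."""
    conclusions = set()
    for premises, conclusion in rules:
        if premises <= facts:          # all premises hold
            conclusions.add(conclusion)
    return conclusions

print(forward_chain({"fever", "cough"}, rules))
```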
Knowledge-based systems (70-80s)

DENDRAL: infer molecular structure from mass spectrometry

MYCIN: diagnose blood infections, recommend antibiotics

XCON: convert customer orders into parts specifications; saved DEC
$40 million a year by 1986
Knowledge-based systems

Contributions:
• First real applications that impacted industry
• Knowledge helped curb the exponential growth

Problems:
• Knowledge is not deterministic rules; uncertainty needs to be modeled
• Creating rules requires considerable manual effort, and they are hard
  to maintain

1987: collapse of the Lisp machines market
1943
Artificial neural networks

1943: McCulloch and Pitts introduced artificial neural networks,
connecting neural circuitry and logic.

1969: the Perceptrons book showed that linear models cannot solve XOR,
which stalled neural net research (Minsky).

Training networks

1986: popularization of backpropagation for training multi-layer
networks (Rumelhart, Hinton, Williams)

1989: LeCun applied convolutional neural networks to recognizing
handwritten digits for the USPS.
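The XOR limitation can be demonstrated directly. A sketch (the weight grid and the network weights are chosen by hand for illustration): a brute-force search finds no single linear unit that fits XOR, while a two-layer network does:

```python
# Sketch illustrating the 1969 XOR observation: no single linear
# threshold unit fits XOR, but a two-layer network does.
import itertools

XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
step = lambda z: 1 if z > 0 else 0

# Brute-force search over a grid of weights and bias for one linear unit
grid = [x / 2 for x in range(-8, 9)]  # -4.0 .. 4.0 in steps of 0.5
linear_fits = any(
    all(step(w1 * a + w2 * b + c) == y for (a, b), y in XOR.items())
    for w1, w2, c in itertools.product(grid, repeat=3)
)
print("some linear unit fits XOR:", linear_fits)  # False

# Two-layer network: hidden units compute OR and NAND, the output ANDs them
def xor_net(a, b):
    h1 = step(a + b - 0.5)        # OR
    h2 = step(-a - b + 1.5)       # NAND
    return step(h1 + h2 - 1.5)    # AND
print(all(xor_net(a, b) == y for (a, b), y in XOR.items()))  # True
```

The hidden layer re-represents the inputs so that the final unit's problem becomes linearly separable, which is exactly what a single perceptron cannot do on its own.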
Deep learning

AlexNet (2012): huge gains in object recognition; transformed the
computer vision community overnight

AlphaGo (2016): deep reinforcement learning; defeated world champion
Lee Sedol
A melting pot
• Bayes' rule (Bayes, 1763) from probability
• Least squares regression (Gauss, 1795) from astronomy
• First-order logic (Frege, 1893) from logic
• Maximum likelihood (Fisher, 1922) from statistics
• Artificial neural networks (McCulloch/Pitts, 1943) from neuroscience
• Minimax games (von Neumann, 1944) from economics
• Stochastic gradient descent (Robbins/Monro, 1951) from optimization
• Uniform cost search (Dijkstra, 1956) from algorithms
• Value iteration (Bellman, 1957) from control theory
Two broad views of AI

• AI agents: How can we create intelligence?
• AI tools: How can we benefit society?

An intelligent agent (human)

• Perception
• Robotics (actions)
• Language (communicate)
• Knowledge
• Reasoning (draw inferences and make decisions)
• Learning
Machine (AI agents) vs Human

Huge gap between the regimes machines and humans operate in.

Machine: narrow tasks, millions of examples
• AlphaGo learned from 19.6 million games, but can only do one thing:
  play Go

Human: diverse tasks, very few examples
• Humans learn from a much wider set of experiences and can do many
  things
Paradigm
• Modeling
• Inference
• Learning
Paradigm: Modeling

Real world → Model. Formulate the real-world problem as a graph:
• Nodes → points in the city
• Edges → roads
• Weights → traffic on that road

[Figure: weighted graph model of the city]
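A minimal sketch of this modeling step in code (the city layout and weights are invented for illustration), with the graph as a dictionary mapping each node to its weighted edges:

```python
# Hedged sketch of the modeling step: the city as a weighted graph.
# Nodes -> points in the city, edges -> roads, weights -> traffic on that road
# (all values invented for illustration).
city = {
    "A": {"B": 5, "C": 2},
    "B": {"A": 5, "D": 1},
    "C": {"A": 2, "D": 7},
    "D": {"B": 1, "C": 7},
}
print(city["A"])  # roads leaving point A, with their traffic weights
```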
Paradigm: Inference

Inference answers questions with respect to the model. The focus of
inference is usually on efficient algorithms that can answer these
questions.

[Figure: Model → Inference → Predictions, on the weighted city graph]
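One such inference question on a traffic model is "what is the least-traffic route?". A sketch answering it with Dijkstra's algorithm (the graph is invented for illustration):

```python
# Sketch of inference on a weighted-graph model: answering "what is the
# least-traffic route?" with Dijkstra's algorithm (weights invented here).
import heapq

def shortest_path_cost(graph, start, goal):
    """Return the minimum total weight from start to goal."""
    dist = {start: 0}
    heap = [(0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == goal:
            return d
        if d > dist.get(node, float("inf")):
            continue                      # stale heap entry
        for nbr, w in graph[node].items():
            nd = d + w
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr))
    return float("inf")

city = {"A": {"B": 5, "C": 2}, "B": {"A": 5, "D": 1},
        "C": {"A": 2, "D": 7}, "D": {"B": 1, "C": 7}}
print(shortest_path_cost(city, "A", "D"))  # 6, via A-B-D
```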
Paradigm: Learning

Model without parameters (edge weights unknown)
  + data
If we have the right type of data, we can run a machine learning
algorithm to tune the parameters of the model.
  → Model with parameters (edge weights filled in)

[Figure: graph with unknown weights ("?") → graph with learned weights]
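A minimal sketch of the learning step (the observed traffic readings are invented for illustration): estimate each road's weight from data, here simply by averaging the observations per edge:

```python
# Hedged sketch of the learning step: with the right data, tune the
# model's parameters -- here, set each road's traffic weight to the
# average of observed readings (observations invented for illustration).
observations = {  # edge -> observed traffic readings
    ("A", "B"): [4, 6, 5],
    ("A", "C"): [2, 2],
    ("B", "D"): [1, 1, 1],
}

weights = {edge: sum(xs) / len(xs) for edge, xs in observations.items()}
print(weights[("A", "B")])  # 5.0
```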
Machine learning

Data → Model

• The main driver of recent successes in AI
• Move from "code" to "data" to manage the information complexity
• Requires a leap of faith: generalization
Type of Data
• Relational Data (Tables/Transaction/etc)
• Text Data (Web)
• Semi-structured Data (XML, JSON)
• Graph Data
– Social Network, Semantic Web, …
• Streaming Data
– Network traffic, sensor data,…
• etc
Types of Learning
• Supervised Learning
  – Classification
  – Regression, etc.
• Unsupervised Learning
• Semi-Supervised Learning
• Etc.
Types of Learning
• Supervised Learning
• Unsupervised Learning
• Reinforcement Learning

From Gartner, Recht
Supervised Learning: Uses
• Prediction of future cases: Use the rule to
predict the output of future inputs
• Knowledge extraction: The rule is easy to
understand
• Outlier detection: Exceptions not covered by
the rule, e.g., fraud

Classification
• Example: credit scoring
• Differentiating between low-risk and high-risk customers from their
  income and savings

Discriminant: IF income > θ1 AND savings > θ2
              THEN low-risk ELSE high-risk
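The credit-scoring discriminant written as code (the thresholds θ1 and θ2 are invented for illustration; in practice they would be learned from labeled customer data):

```python
# Hedged sketch of the credit-scoring discriminant rule.
# The thresholds are invented for illustration; a learner would fit them.
THETA1, THETA2 = 50_000, 10_000   # income and savings thresholds

def credit_risk(income, savings):
    if income > THETA1 and savings > THETA2:
        return "low-risk"
    return "high-risk"

print(credit_risk(80_000, 20_000))  # low-risk
print(credit_risk(80_000, 5_000))   # high-risk
```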
Classification: Applications
• Face recognition: pose, make-up, hair style
• Character recognition: different handwriting styles
• Medical diagnosis: from symptoms to illnesses
• Biometrics: recognition/authentication using physical and/or
  behavioral characteristics: face, iris, signature, etc.
Regression
• Example: price of a used car
• x: car attributes; y: price
• Model: y = g(x | θ), where g(·) is the model and θ its parameters
• Linear case: y = wx + w0
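A sketch of fitting the linear model y = wx + w0 by least squares (toy used-car data invented here: age vs. price, with the points lying exactly on a line so the fit is exact):

```python
# Hedged sketch of least-squares fitting for y = w*x + w0
# (toy used-car data: x = age in years, y = price; values invented here).
xs = [1, 2, 3, 4, 5]            # car age
ys = [9.0, 8.0, 7.0, 6.0, 5.0]  # price, exactly linear in this toy set

n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
# Closed-form least-squares solution for one input variable
w = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
    / sum((x - mx) ** 2 for x in xs)
w0 = my - w * mx

print(w, w0)           # slope -1.0, intercept 10.0
print(w * 6 + w0)      # predicted price of a 6-year-old car: 4.0
```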
Unsupervised Learning
• Clustering: Grouping similar instances
• Some applications
– Customer segmentation
– Image compression

Unsupervised Learning

[Figure: organizing computing clusters, social network analysis, market
segmentation, astronomical data analysis]

Image credit: NASA/JPL-Caltech/E. Churchwell (Univ. of Wisconsin, Madison)
Slide credit: Andrew Ng
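A minimal k-means-style clustering sketch (1-D toy data invented for illustration, in the spirit of customer segmentation):

```python
# Hedged sketch of k-means clustering on 1-D data
# (toy "customer spend" values invented for illustration).
def kmeans_1d(points, centers, iters=10):
    for _ in range(iters):
        groups = {c: [] for c in centers}
        for p in points:                       # assign each point to its nearest center
            nearest = min(centers, key=lambda c: abs(p - c))
            groups[nearest].append(p)
        # move each center to the mean of its group
        centers = [sum(g) / len(g) for g in groups.values() if g]
    return sorted(centers)

spend = [1.0, 1.2, 0.8, 9.9, 10.1, 10.0]      # two obvious customer segments
print(kmeans_1d(spend, centers=[0.0, 5.0]))   # centers near 1.0 and 10.0
```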
Reinforcement Learning
• Given a sequence of states and actions with
(delayed) rewards, output a policy
– Policy is a mapping from states ➔ actions that
tells you what to do in a given state
• Examples:
– Credit assignment problem
– Game playing
– Robot in a maze

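A hedged sketch of these ideas: tabular Q-learning on a tiny 1-D maze with a delayed reward at the goal (environment and hyperparameters invented for illustration); the learned policy maps each state to an action:

```python
# Hedged sketch of reinforcement learning: tabular Q-learning on a tiny
# 1-D maze (states 0..4, reward only at the right end; setup invented here).
import random
random.seed(0)

N, ACTIONS = 5, ("left", "right")
Q = {(s, a): 0.0 for s in range(N) for a in ACTIONS}

def step(s, a):
    s2 = max(0, s - 1) if a == "left" else min(N - 1, s + 1)
    reward = 1.0 if s2 == N - 1 else 0.0     # delayed reward at the goal
    return s2, reward

alpha, gamma, eps = 0.5, 0.9, 0.3
for _ in range(300):                         # episodes
    s = 0
    while s != N - 1:
        # epsilon-greedy action selection
        a = random.choice(ACTIONS) if random.random() < eps \
            else max(ACTIONS, key=lambda act: Q[(s, act)])
        s2, r = step(s, a)
        best_next = max(Q[(s2, a2)] for a2 in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s2

# The learned policy: the greedy action in each non-terminal state
policy = {s: max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N - 1)}
print(policy)
```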
Semi-Supervised Learning
• This technique uses labeled as well as unlabeled data.
• Labeled data is available in small quantity, while unlabeled data is
  available in large amounts.
• First, an unsupervised algorithm forms groups (clusters) of the
  unlabeled data; then the existing labeled data is used to label the
  clustered unlabeled data.
• Elements closer (more similar) to each other are more likely to have
  the same output label.
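The cluster-then-label procedure above can be sketched as follows (toy 1-D data and labels invented for illustration; the "clustering" here is just a nearest-of-two-centers assignment):

```python
# Hedged sketch of semi-supervised learning: cluster unlabeled points,
# then use a few labeled points to name each cluster (toy data invented here).
unlabeled = [0.9, 1.1, 1.0, 7.9, 8.1, 8.0]
labeled = [(1.0, "cat"), (8.0, "dog")]        # small labeled set

# Step 1 (unsupervised): two crude cluster centers
centers = [min(unlabeled), max(unlabeled)]
def nearest(p):
    return min(range(len(centers)), key=lambda i: abs(p - centers[i]))

# Step 2: give each cluster the label of the labeled example that falls in it
cluster_label = {}
for x, y in labeled:
    cluster_label[nearest(x)] = y

# Step 3: propagate the cluster labels to all unlabeled points
labels = [cluster_label[nearest(p)] for p in unlabeled]
print(labels)
```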
Student example
• Supervised learning: faculty supervision at all times (data?)
• Unsupervised learning: the student has to figure out a concept
  himself (data?)
• Semi-supervised learning:
  – SL: faculty teaches some concepts in class
  – SSL: the student solves homework questions based on similar
    concepts taught by faculty in class
Common tasks
• Description
• Estimation
• Prediction
• Classification
• Clustering
• Association

Description
• Find ways to describe patterns and trends lying within data.
• For example, a pollster (a person who conducts or analyses opinion
  polls) may uncover evidence that those who have been laid off are
  less likely to support the present prime minister in the election.
• Decision trees provide an intuitive and human-friendly explanation
  of their results.
Estimation
• Estimation is similar to classification except that the target
  variable is numerical rather than categorical (divided into groups).
• For example, we might be interested in estimating the systolic blood
  pressure reading of a hospital patient, based on the patient's age,
  gender, body-mass index, and blood sodium levels.
• An estimation model can be applied to new cases.
• Methods: linear regression, neural networks
Estimation Examples
• Estimating the amount of money a randomly chosen family
of four will spend for back-to-school shopping this winter.
• Estimating the CGPA of a graduate student, based on that
student’s undergraduate CGPA.

Prediction
• Prediction is similar to classification and estimation, except
that for prediction, the results lie in the future.
– Predicting the price of a stock three months into the
future
– Predicting the percentage increase in traffic deaths next
year if the speed limit is increased
– Predicting whether a particular molecule in COVID-19 drug
discovery will lead to a profitable new drug for a pharmaceutical
company
Classification
• In classification, there is a target categorical variable, such as
  income bracket, that can be partitioned into different classes or
  categories:
  – High income,
  – Middle income, and
  – Low income.
Clustering
• Clustering refers to the grouping of records,
observations, or cases into groups of similar objects

Association
• The association task is the job of finding which attributes
“go together”
• Finding out which items in a supermarket are purchased
together and which items are never purchased together
• Apriori algorithm

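A minimal sketch of the association task (the transactions are invented for illustration): count which item pairs occur together; Apriori scales this idea up by pruning infrequent itemsets before counting larger ones:

```python
# Hedged sketch of market-basket association: count item pairs bought
# together (transactions invented for illustration).
from itertools import combinations
from collections import Counter

baskets = [{"bread", "butter", "milk"}, {"bread", "butter"},
           {"milk", "eggs"}, {"bread", "butter", "eggs"}]

pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

print(pair_counts.most_common(1))  # [(('bread', 'butter'), 3)]
```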
Identify the relevant task
• The present Indian PM's party would like to approximate how many
  seats their next opponent party will get in the coming election.
  – Estimation: estimating the number of seats (numeric target).
Identify the relevant task (cont'd)
• A political strategist is seeking groups to target for donations to
  his party in the coming elections.
  – Clustering: examine the profile of each homogeneous group derived
    from a particular state's population;
  – Association: discover interesting rules pertaining to a large
    proportion of the population.
Identify the relevant task (cont'd)
• Investigating the proportion of subscribers to a
company’s cell phone plan that respond positively to
an offer of a service upgrade.
• Predicting degradation in telecommunications
networks
• Examining the proportion of children whose parents
read to them who are themselves good readers
• Determining the proportion of cases in which a new
drug will exhibit dangerous side effects

The Agent-Environment Interface

[Figure: agent-environment interaction loop (states, actions, rewards)]
Reflex

[Spectrum of AI and machine learning models, from "low-level
intelligence" to "high-level intelligence"]
Reflex-based models
• A reflex-based model simply performs a fixed
sequence of computations on a given input.
• Common models in machine learning
• Examples: linear classifiers, deep neural networks
• Fully feed-forward (no backtracking)

Search problems
Markov decision processes
Adversarial games

[Spectrum: Reflex → States, from "low-level intelligence" to
"high-level intelligence"]
State-based models
Search problems: you control everything
Markov decision processes: against nature (outcomes involve chance)
Adversarial games: against an opponent (e.g., chess)
State-based models

[Figure: chess position, white to move]
State-based models
Model the states and the transitions between states that are triggered
by actions, as a graph G(V, E):
• States → nodes
• Transitions → edges
In state-based models, solutions are procedural.
Applications:
• Games: chess, Go, Pac-Man, StarCraft, etc.
• Robotics: motion planning
• Natural language generation: machine translation, image captioning
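A sketch of solving a search problem on such a state graph (the graph is invented for illustration): breadth-first search returns a procedural solution, i.e., a path of states:

```python
# Hedged sketch of a search problem: states -> nodes, transitions -> edges,
# solution -> a path (the graph is invented for illustration).
from collections import deque

def bfs_path(graph, start, goal):
    """Return a path with the fewest transitions from start to goal."""
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph[path[-1]]:
            if nxt not in visited:
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None

states = {"S": ["A", "B"], "A": ["G"], "B": ["A"], "G": []}
print(bfs_path(states, "S", "G"))  # ['S', 'A', 'G']
```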
Search problems
Markov decision processes Constraint satisfaction
problems
Adversarial games Bayesian networks

Reflex States Variables

“Low-level AI and Machine learning “High-level


intelligence” intelligence”
75
Sudoku
In some applications, the order in which things are done isn't
important.

Goal: put digits in the blank squares so that each row, column, and 3x3
sub-block has the digits 1-9.

The order of filling squares doesn't matter in the evaluation criteria.
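A hedged sketch of Sudoku as a constraint satisfaction problem, shrunk to a 4x4 variant (digits 1-4, 2x2 sub-blocks, puzzle invented here) so the backtracking solver stays short:

```python
# Hedged sketch of Sudoku as a CSP, on a 4x4 variant (digits 1-4,
# 2x2 sub-blocks); 0 marks a blank. Puzzle invented for illustration.
def ok(grid, r, c, v):
    """Check the row, column, and 2x2 block constraints for placing v."""
    if v in grid[r] or v in (grid[i][c] for i in range(4)):
        return False
    br, bc = 2 * (r // 2), 2 * (c // 2)
    return all(grid[br + i][bc + j] != v for i in range(2) for j in range(2))

def solve(grid):
    for r in range(4):
        for c in range(4):
            if grid[r][c] == 0:
                for v in (1, 2, 3, 4):
                    if ok(grid, r, c, v):
                        grid[r][c] = v
                        if solve(grid):
                            return True
                        grid[r][c] = 0      # backtrack
                return False                # no value fits: dead end
    return True                             # no blanks left

puzzle = [[1, 0, 3, 4],
          [0, 4, 0, 2],
          [2, 1, 0, 3],
          [4, 0, 2, 0]]
solve(puzzle)
print(puzzle)  # each row, column, and 2x2 block is a permutation of 1-4
```

Note that the solver never cares *in which order* blanks are filled; only the final constraint-satisfying assignment matters.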


Variable-based models
Constraint satisfaction problems: hard constraints
(e.g., Sudoku, scheduling)

Bayesian networks: soft dependencies (the variables are random
variables that depend on each other)
Search problems
Markov decision processes
Adversarial games
Constraint satisfaction problems
Bayesian networks

[Spectrum: Reflex → States → Variables → Logic, from "low-level
intelligence" to "high-level intelligence"]
Motivation: virtual assistant

Tell it information; ask it questions; use natural language.

It needs to:
• Digest heterogeneous information
• Reason deeply with that information
Optimization
Discrete optimization: find the best discrete object

    min_{p ∈ Paths} Cost(p)

Algorithmic tool: dynamic programming

Continuous optimization: find the best vector of real numbers

    min_{w ∈ R^d} TrainingError(w)

Algorithmic tool: gradient descent
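A sketch of the continuous case (toy data invented here): gradient descent on TrainingError(w), taken to be the mean squared error of predictions w·x:

```python
# Hedged sketch of continuous optimization by gradient descent:
# minimize TrainingError(w) = mean((w*x - y)^2) on toy data invented here.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]                 # generated by the "true" w = 2

def grad(w):                          # d/dw of mean((w*x - y)^2)
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

w, lr = 0.0, 0.05                     # start at w = 0, small step size
for _ in range(200):
    w -= lr * grad(w)                 # step downhill along the gradient

print(round(w, 4))                    # converges toward 2.0
```

The step size matters: the error here is a quadratic bowl in w, and a learning rate that is too large would make the iterates diverge instead of settling into the minimum.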
