10/23/2023

Introduction to
Machine Learning
Dr. Muhammad Amjad Iqbal
Associate Professor
University of Central Punjab, Lahore.
Amjad.iqbal@ucp.edu.pk

https://sites.google.com/a/ucp.edu.pk/iml/
Slides of Prof. Dr. Andrew Ng, Stanford

Assessment
Scale: Bad, Average, Good, Very Good
• How good are you at programming?
– Think of the “Data Structures” course.
• How good are you at mathematics?
– Think of the “Discrete Structures” course.
• How good are you at probability and statistics?

Course Logistics
• Two lectures per week

• Quizzes (15% marks)


• Assignments (15% marks)
• Project (10% marks)
• 2 exams
– Mid Term Exam (20% marks)
– Final Exam (40% marks)
• Covers the entire course

Course Logistics Cont.


• Course material will be posted on:
https://cms.ucp.edu.pk/

• What is on the website:


– Course Handbook, Lectures, Assignments

Visit the website frequently

Course Logistics Cont.


• Plagiarism
– Copying someone else’s work (partial or complete) and
submitting it as if it were one’s own
– Read course handbook to know more about plagiarism
– Zero tolerance for plagiarism
• You’ll upload assignments to MS Teams or a similar platform
– Assignments will be checked for plagiarism using Moss
or JPlag
– If plagiarism is found, all involved will get zero in that
assignment

Course Objectives
• To introduce the basic concepts of Machine
Learning.
• To make students understand the use of machine
learning approaches to solve some laboratory
problems initially and real world problems later on.
• To equip students with structures and strategies for
complex problem solving

• To learn by doing, using Python
• To excite you about the field

Reference Books
• No single textbook

• Witten, Frank and Hall. Data Mining: Practical Machine Learning Tools and Techniques, 3rd Edition
• Christopher M. Bishop. Pattern Recognition and Machine Learning. Springer
• Ethem Alpaydin. Introduction to Machine Learning, 2nd Edition
• T. Mitchell. Machine Learning. WCB/McGraw-Hill, Boston, 1997.

Reference Books
• Stuart Russell and Peter Norvig. Artificial Intelligence: A Modern Approach, 3rd Edition
• David Barber. Bayesian Reasoning and Machine Learning
• Data Mining: A Knowledge Discovery Approach. Springer
• All available in PDF at ucpshares

Motivation
• Machine Learning is one of the most exciting areas
• It’s everywhere

SPAM

www.imdb.com

www.amazon.com

Machine Learning
• Grew out of work in AI
• Aim: building intelligent machines
• What we already knew: how to program a machine to find the
shortest path from A to B (for example)
• What we did not know: how to write AI programs that do more
interesting things like web search, photo tagging, email
anti-spam, driverless cars, etc.
• Realization: let the machine learn to do it by itself
• Machine learning was developed as a new capability
for computers
• Today it touches many segments of industry and
science

ML application areas: Database mining


• Large datasets from growth of automation/web
• One of the reasons ML has become so widespread
– Web click data
• Tons of companies collect clickstream data for
mining purposes
• To understand users better with machine
learning algorithms
• A huge segment of the software industry is
currently working on it

ML application areas: Database mining

• Electronic medical records
– Trying to turn medical records into medical
knowledge, to understand disease better.
• An evaluation of machine-learning methods for
predicting pneumonia mortality
– G. F. Cooper et al., 1997
• Artificial Intelligence in Medicine, 9(2), 107-138

ML application areas: Database mining

• Computational biology
– Biologists collecting lots of data about gene
sequences, DNA sequences, etc.
– ML algorithms are giving us a much better
understanding of the human genome
• Engineering


ML application areas:
• Applications we can’t program by hand.
– Autonomous helicopter, Google driverless car
• Learns to do it by itself
– Handwriting recognition
• Postal Mail: A learning algorithm that has learned how to
read postal code in your handwriting (US mail)
– Most of Natural Language Processing (NLP)
and Computer Vision today
• Applied Machine learning

ML application areas:
• Self-customizing programs
– Amazon, IMDb, YouTube recommendations
• Understanding human learning (brain, cognition)
– Learning algorithms are being used today to
understand human learning and to
understand the brain.


Machine learning is a highly desirable skill in the IT
industry and in Computer Science research


Machine Learning definition


• Arthur Samuel (1959). Machine Learning:
Field of study that gives computers the ability to
learn without being explicitly programmed.
• Tom Mitchell (1998) Well-posed Learning Problem:
A computer program is said to learn from
experience E with respect to some task T and some
performance measure P, if its performance on T, as
measured by P, improves with experience E.


“A computer program is said to learn from experience E with
respect to some task T and some performance measure P, if
its performance on T, as measured by P, improves with
experience E.”
Suppose your email program watches which emails you do or do
not mark as spam, and based on that learns how to better filter
spam. What is the task T in this setting?

Classifying emails as spam or not spam. (This is the task T.)

Watching you label emails as spam or not spam. (This is the experience E.)

The number (or fraction) of emails correctly classified as spam/not spam. (This is the performance measure P.)

None of the above—this is not a machine learning problem.
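
To make T, E and P concrete, here is a minimal Python sketch of the spam setting. It is an illustration only: the word-counting rule, the function names and all example emails are assumptions made up for this sketch, not part of the original slides.

```python
from collections import Counter

def train(labeled_emails):
    """Experience E: count how often each word appears in spam vs. non-spam emails."""
    spam_words, ham_words = Counter(), Counter()
    for text, is_spam in labeled_emails:
        (spam_words if is_spam else ham_words).update(text.lower().split())
    return spam_words, ham_words

def classify(model, text):
    """Task T: label an email as spam (True) or not spam (False)."""
    spam_words, ham_words = model
    words = text.lower().split()
    spam_score = sum(spam_words[w] for w in words)
    ham_score = sum(ham_words[w] for w in words)
    return spam_score > ham_score

def accuracy(model, labeled_emails):
    """Performance measure P: fraction of emails classified correctly."""
    correct = sum(classify(model, text) == is_spam for text, is_spam in labeled_emails)
    return correct / len(labeled_emails)

# Hypothetical emails the user has already marked (True = spam).
experience = [
    ("win money now", True),
    ("meeting at noon", False),
    ("free prize claim now", True),
    ("project report attached", False),
]
# Hypothetical held-out emails used only to measure P.
test_emails = [
    ("claim your free prize", True),
    ("noon project meeting", False),
    ("win money", True),
    ("report attached", False),
]

p_small = accuracy(train(experience[:2]), test_emails)  # little experience
p_large = accuracy(train(experience), test_emails)      # more experience
print(p_small, p_large)
```

On this toy data, P is higher after the program has seen all four labeled emails than after seeing only two, which is what "improves with experience E" means in the definition.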

Topics
Machine learning algorithms:
- Supervised learning
- Unsupervised learning
- Reinforcement learning
- Bayesian Networks
- Hidden Markov Models

Background Topics:
- Linear Algebra
- Probability

Also talk about: Practical advice for applying learning
algorithms. Implementation in Octave

Supervised Learning
• Probably the most common type of machine
learning problem
• Let us introduce it with an example


Housing price prediction.

[Figure: scatter plot of Price (in $1000’s, 0 to 400) against Size in feet² (0 to 2500), annotated “quadratic function or second-order polynomial”; a size of 750 feet² is marked on the horizontal axis]

Supervised Learning: “right answers” given
Regression: Predict continuous valued output (price)
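
A minimal sketch of the regression idea in Python, assuming NumPy is available. The training data below is made up for illustration (the slide's actual data points are not recoverable), and fitting a second-order polynomial with `np.polyfit` is just one simple way to do it.

```python
import numpy as np

# Hypothetical training data (not from the slides):
# house size in square feet and price in $1000s.
sizes = np.array([500.0, 1000.0, 1500.0, 2000.0, 2500.0])
prices = np.array([120.0, 200.0, 260.0, 300.0, 320.0])

# Fit a second-order polynomial price = a*x^2 + b*x + c by least squares.
# x is the size in thousands of square feet, which keeps the fit well conditioned.
coeffs = np.polyfit(sizes / 1000.0, prices, deg=2)

def predict_price(size_sqft):
    """Predict the price (in $1000s) of a house with the given size."""
    return float(np.polyval(coeffs, size_sqft / 1000.0))

# The "right answers" (prices) were given for the training houses;
# the model now predicts a continuous value for an unseen size.
predicted_750 = predict_price(750.0)
print(f"Predicted price for 750 square feet: about ${predicted_750:.1f}k")
```

Because the output is a continuous number (a price), this is a regression problem.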

Another supervised learning example


Cancer (malignant, benign)

Classification: Discrete valued output (0 or 1)
With more than two classes: 0, 1, 2, 3, … (Benign, T1, T2, T3, …)

With 2 features, the data is plotted using a slightly different set of symbols

Features
- Tumor Thickness
- Uniformity of Cell Size
- Uniformity of Cell Shape
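
A minimal classification sketch in Python using two of the features above. The data values (on an assumed 1 to 10 scale) are hypothetical, and k-nearest neighbors is chosen here only because it is short; the slides do not say which classifier produced their plot.

```python
from collections import Counter
import math

# Hypothetical labeled examples:
# (tumor thickness, uniformity of cell size) -> label.
training_data = [
    ((1.0, 1.0), "benign"),
    ((2.0, 1.0), "benign"),
    ((3.0, 2.0), "benign"),
    ((7.0, 8.0), "malignant"),
    ((8.0, 7.0), "malignant"),
    ((9.0, 9.0), "malignant"),
]

def classify(features, k=3):
    """Return the majority label among the k training points closest to features."""
    by_distance = sorted(training_data, key=lambda item: math.dist(item[0], features))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

print(classify((2.0, 2.0)))
print(classify((8.0, 8.0)))
```

The output is one of a fixed set of labels rather than a number on a continuous scale, which is what makes this classification.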


What do we do if we have an infinite number of features?
The Support Vector Machine (SVM) algorithm can in principle
deal with an infinite number of features, using
a neat mathematical trick
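
The slide does not name the trick; assuming it refers to what is usually called the kernel trick, the Python sketch below (NumPy assumed) shows the core idea on a small case. A kernel function returns the inner product of two feature vectors without ever building them. The function names and the example points are made up for illustration.

```python
import numpy as np

def explicit_features(x):
    """Map a 2-D point to its degree-2 monomials: (x1^2, sqrt(2)*x1*x2, x2^2)."""
    x1, x2 = x
    return np.array([x1 * x1, np.sqrt(2.0) * x1 * x2, x2 * x2])

def polynomial_kernel(x, z):
    """Same inner product as above, computed without building the feature vectors."""
    return float(np.dot(x, z)) ** 2

def rbf_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel; its implicit feature space is infinite-dimensional."""
    diff = np.asarray(x) - np.asarray(z)
    return float(np.exp(-gamma * np.dot(diff, diff)))

x = np.array([1.0, 2.0])
z = np.array([3.0, 4.0])

explicit = float(np.dot(explicit_features(x), explicit_features(z)))
implicit = polynomial_kernel(x, z)
print(explicit, implicit)      # the two values agree
print(rbf_kernel(x, z))        # computed in finite time despite infinite features
```

For the polynomial kernel the feature vectors are small enough to write out and compare. For the RBF kernel they cannot be written out at all, yet the kernel value is still cheap to compute.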

In supervised learning:
• For every example in our data set, we are told
what the “correct answer” is.
• Data is labeled with answers

• Classification: the output is binary or one of a fixed
number of classes. Ex. something is either a
chair or not.
• Regression: the output is continuous. Ex. tomorrow’s
temperature might be 13 degrees in our
prediction.


Problem 1: You have a large inventory of identical items.
You want to predict how many of these items will sell over
the next 3 months.
Problem 2: You’d like software to examine individual
customer accounts, and for each account decide if it has
been hacked/compromised.
Classification or regression problems?
1. Treat both as classification problems.
2. Treat problem 1 as a classification problem, problem 2
as a regression problem.
3. Treat problem 1 as a regression problem, problem 2
as a classification problem.
4. Treat both as regression problems.

Supervised Learning Topics


• Regression
• Support Vector Machines
• Neural Networks
• Bayesian Learning
• K Nearest Neighbors
• Decision Trees
• etc.


Unsupervised Learning
• Data without “right answers”
• Data doesn't have any labels

• We're just told, here is a data set!


• Can you find some structure in the data?


Supervised Learning

[Figure: labeled data points plotted on axes x1 and x2]

Unsupervised Learning

[Figure: unlabeled data points plotted on axes x1 and x2, grouped by a clustering algorithm]

[Figure: DNA microarray data, with Genes on one axis and Individuals on the other]
DNA microarray data to understand genomics
Colors show the degree to which different individuals do or do not have a
specific gene.
[Source: Daphne Koller]

[Figure: the same Genes versus Individuals microarray data]

• Cluster individuals into different categories or into different types of people.
[Source: Daphne Koller]

Organize computing clusters

• Figure out which machines tend to work together
using a clustering algorithm
• Then put those machines together, to make the data
center work more efficiently.

Social network analysis

Market segmentation

Astronomical data analysis
Image credit: NASA/JPL-Caltech/E. Churchwell (Univ. of Wisconsin, Madison)

Of the following examples, which would you address
using an unsupervised learning algorithm?
(Select all that apply.)

1. Given email labeled as spam/not spam, learn a spam filter.


2. Given a set of news articles found on the web, group them
into sets of articles about the same story.
3. Given a database of customer data, automatically discover
market segments and group customers into different market
segments.
4. Given a dataset of patients diagnosed as either having
diabetes or not, learn to classify new patients as having
diabetes or not.

Unsupervised Learning
• K-means Clustering
• Apriori Algorithm
• Self-organizing Maps
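
As an illustration of the first item, here is a minimal sketch of K-means clustering in Python (NumPy assumed). The data points are hypothetical, and using the first k points as initial centroids is an assumption made only to keep the run deterministic; practical implementations usually pick initial centroids at random.

```python
import numpy as np

def k_means(points, k, n_iterations=10):
    """Lloyd's algorithm: alternate between assigning points and moving centroids."""
    # Assumption: the first k points serve as the initial centroids.
    centroids = points[:k].copy()
    for _ in range(n_iterations):
        # Assignment step: index of the nearest centroid for every point.
        distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Update step: move each centroid to the mean of its assigned points.
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    return labels, centroids

# Hypothetical unlabeled data with two visible groups in the (x1, x2) plane.
points = np.array([
    [1.0, 1.0], [5.0, 5.0], [1.5, 2.0], [1.0, 1.5],
    [5.5, 6.0], [6.0, 5.0],
])
labels, centroids = k_means(points, k=2)
print(labels)
print(centroids)
```

No labels are given to the algorithm. The cluster indices it returns are structure it found in the data on its own.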

Reinforcement learning
• Refers to problems where we don't do one-shot
decision-making
• E.g., in the supervised learning cancer prediction
problem, we have a patient. We predict whether the tumor
is malignant or benign. Later we’ll know whether we
got it right or wrong.
• In reinforcement learning problems, we usually
have to make a sequence of decisions over time.

Reinforcement learning
• Examples: autonomous helicopter, driverless car
• Cannot program by hand
• Stochastic environment: too many possibilities

• The basic idea is to define a “reward function”

• Q-learning
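
A minimal sketch of tabular Q-learning in Python on a made-up toy problem, to show how a reward function and a sequence of decisions fit together. The five-state "corridor" environment, the constants and the function names are all assumptions for illustration. The slides' own examples (helicopter, car) are far more complex and stochastic.

```python
import random

N_STATES = 5          # states 0..4 on a line; state 4 is the goal (terminal)
ACTIONS = (-1, +1)    # move left or move right
GAMMA = 0.9           # discount factor (assumed value)
ALPHA = 1.0           # learning rate; 1.0 is enough because this toy world is deterministic

def step(state, action):
    """Hypothetical environment: returns (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0   # reward function: 1 only for reaching the goal
    return next_state, reward, done

def q_learning(n_episodes=50, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action_index]
    for _ in range(n_episodes):
        state, done = 0, False
        while not done:
            # Explore with uniformly random actions. Q-learning is off-policy,
            # so it still learns the values of the greedy policy.
            a = rng.randrange(len(ACTIONS))
            next_state, reward, done = step(state, ACTIONS[a])
            best_next = 0.0 if done else max(q[next_state])
            q[state][a] += ALPHA * (reward + GAMMA * best_next - q[state][a])
            state = next_state
    return q

q_table = q_learning()
policy = ["left" if q[0] > q[1] else "right" for q in q_table[:-1]]
print(policy)
```

The program is never told "go right". It only receives the reward at the goal, and the learned Q-values end up preferring the action that leads toward it in every state.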


Probabilistic reasoning
• Turing Award (often called the Nobel Prize of computing)
awarded to Judea Pearl in 2011 for work including Bayesian networks


END
