Lecture 01 Introducing ML 13102022 031101pm
Test/Quiz: 10%
Mid-Term: 30%
Final Examination: 40%
Text Books and Reading Material
Machine Learning, Tom Mitchell, McGraw-Hill.
Pattern Recognition and Machine Learning, Christopher M. Bishop.
Machine Learning: A Probabilistic Perspective, Kevin Murphy.
The Hundred-Page Machine Learning Book, Andriy Burkov.
Machine Learning: The Art and Science of Algorithms that Make Sense of Data, Peter Flach, Cambridge University Press, 2012.
Outline
1. ML in a Nutshell
2. Representation, Evaluation, Optimization
3. Types of Learning
4. Trade-offs in Machine Learning
Machine Learning
Representation
Decision trees
Sets of rules / Logic programs
Instances
Graphical models (Bayes/Markov nets)
Neural networks
Support vector machines
Evaluation
Accuracy
Precision and recall
Squared error
Likelihood
Posterior probability
Cost / Utility
Margin
Entropy
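As a rough illustration of the first two measures in the list above, the following sketch computes accuracy, precision, and recall for a binary classifier. The labels and predictions are made-up values, and the function names are illustrative, not taken from any particular library.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def accuracy(y_true, y_pred):
    """Fraction of all predictions that are correct."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)


def precision(y_true, y_pred):
    """Of the examples predicted positive, the fraction that really are."""
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp)


def recall(y_true, y_pred):
    """Of the truly positive examples, the fraction that were found."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn)


# Made-up ground truth and predictions, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print(accuracy(y_true, y_pred))   # 0.75
print(precision(y_true, y_pred))  # 0.75
print(recall(y_true, y_pred))     # 0.75
```

Precision and recall divide by zero when there are no predicted or no actual positives; a real implementation would need to decide how to handle those cases.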
Optimization
Combinatorial optimization
E.g.: Greedy search
Convex optimization
E.g.: Gradient descent
Constrained optimization
E.g.: Linear programming
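To make the gradient descent example concrete, here is a minimal sketch that minimizes the convex function f(w) = (w - 3)^2. The function, starting point, learning rate, and step count are all assumptions chosen for illustration.

```python
def grad_f(w):
    """Derivative of f(w) = (w - 3)**2, i.e. 2 * (w - 3)."""
    return 2.0 * (w - 3.0)


def gradient_descent(grad, w0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from w0."""
    w = w0
    for _ in range(steps):
        w = w - learning_rate * grad(w)
    return w


w_min = gradient_descent(grad_f, w0=0.0)
print(w_min)  # very close to 3.0, the minimizer of f
```

Because f is convex, every step moves w closer to the single global minimum; with a learning rate that is too large the iterates would instead overshoot and diverge.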
Types of Learning
[Figure: taxonomy of learning types. Labels: Supervised, Unsupervised; K-Means, EM, Self-Organizing Maps; Linear, Nonlinear; Domain-knowledge vs. data-driven; Quantitative Prediction; Discrete Prediction; Dimensionality reduction.]
https://techvidvan.com/tutorials/reinforcement-learning/
Key Issues in Machine Learning
Modeling
How to formulate application problems as machine learning problems? How to
represent the data?
Learning Protocols (where is the data & labels coming from?)
Representation
What functions should we learn (hypothesis spaces) ?
How to map raw input to an instance space?
Any rigorous way to find these? Any general approach?
Algorithms
What are good algorithms?
How do we define success?
Generalization vs. overfitting
The computational problem
Machine Learning as a Process
- Normalization
- Transformation
- Missing Values
- Outliers
Collectively, all columns are known as the features (the feature space); the number of columns is the total dimension of the data.
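Two of the preprocessing steps listed above, missing values and normalization, can be sketched for a single feature column as follows. The column values are invented, and mean imputation and min-max scaling are only one possible choice for each step.

```python
def impute_mean(column):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]


def min_max_normalize(column):
    """Rescale values linearly so the smallest is 0.0 and the largest is 1.0."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


# Hypothetical "age" feature with one missing value.
ages = [20, None, 40, 60]
filled = impute_mean(ages)
scaled = min_max_normalize(filled)

print(filled)  # [20, 40.0, 40, 60]
print(scaled)  # [0.0, 0.5, 0.5, 1.0]
```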
ML for Images and Videos
A key attribute of the image data type is the presence of spatial features/relationships within images that need to be understood to extract insightful information from the images.
Unstructured Data: This data is usually composed of everything else, including texts, images, videos, speech/audio, and time series.
Each greyscale image is 2D data that can be represented as a matrix of pixel intensities.
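A small sketch of this idea: the nested list below stands in for a tiny greyscale image (the pixel values are made up, with 0 as black and 255 as white), and the hypothetical helper `local_mean` looks at a pixel together with its neighbours, which is the kind of spatial relationship that image models exploit.

```python
# A 3x5 greyscale "image": one row of the matrix per row of pixels.
image = [
    [0,   0, 255,   0, 0],
    [0, 255, 255, 255, 0],
    [0,   0, 255,   0, 0],
]

height = len(image)
width = len(image[0])


def local_mean(img, row, col):
    """Average intensity over the 3x3 neighbourhood around (row, col).

    Neighbourhoods are clipped at the image border.
    """
    rows = range(max(0, row - 1), min(len(img), row + 2))
    cols = range(max(0, col - 1), min(len(img[0]), col + 2))
    values = [img[r][c] for r in rows for c in cols]
    return sum(values) / len(values)


print(height, width)             # 3 5
print(image[1][2])               # 255, the centre pixel
print(local_mean(image, 1, 2))   # mean of the 9 pixels around the centre
print(local_mean(image, 0, 0))   # 63.75, a clipped 2x2 corner neighbourhood
```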
https://ailephant.com/tag/convolutional-neural-network/
ML for Audio / Time Series
This type of data has a sequence of ordered data points each having a timestamp.
The most salient feature in this data is the relationship between the different data points such as periodic
patterns, seasonal behaviors, and so on.
For example, if you consider the temperature recorded in a city over the last year and look at the changes over time,
you can easily see that winter months are colder and summer months are hotter.
This type of insight is basic but can only be observed if you look at the data points with their timestamps. Figure
2 shows an example visualization of time series data.
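The temperature example can be sketched in a few lines. The readings below are invented values, and grouping by month is just one simple way to expose a seasonal pattern that depends on the timestamps.

```python
from collections import defaultdict
from datetime import date

# Made-up (timestamp, temperature in degrees Celsius) readings.
readings = [
    (date(2022, 1, 10), 2.0),
    (date(2022, 1, 25), 4.0),
    (date(2022, 7, 5), 29.0),
    (date(2022, 7, 20), 31.0),
]


def monthly_mean(series):
    """Average the readings that fall in the same calendar month."""
    buckets = defaultdict(list)
    for day, temperature in series:
        buckets[day.month].append(temperature)
    return {month: sum(v) / len(v) for month, v in sorted(buckets.items())}


means = monthly_mean(readings)
print(means)  # {1: 3.0, 7: 30.0}
```

Without the timestamps the same four numbers would be an unordered bag of values, and the difference between January and July could not be recovered.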
ML for Heterogeneous Data
Multimodal Learning: fusion of different features / modalities
Notation: Scalars, Vectors
A scalar is a simple numerical value, like 15 or -3.25.
Variables or constants that take scalar values are denoted by an italic letter, like x or a.
A vector is an ordered list of scalar values, called attributes. We denote a vector as a bold character, for example, x or w.
Vectors can be visualized as arrows that point in some direction, as well as points in a multi-dimensional space.
Illustrations of three two-dimensional vectors, a = [2, 3], b = [-2, 5], and c = [1, 0], are given in Fig. 1.
Operations on Sets
A derived set creation operator looks like this: $S' \leftarrow \{x^2 \mid x \in S, x > 3\}$.
This notation means that we create a new set S’ by putting into it x squared such that x is in S and x is greater than 3.
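The same construction maps directly onto a Python set comprehension. The contents of S here are an arbitrary example.

```python
# An example set S; any collection of numbers would do.
S = {1, 2, 3, 4, 5}

# Derived set: x squared for every x in S with x greater than 3.
S_prime = {x ** 2 for x in S if x > 3}

print(S_prime == {16, 25})  # True
```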
Capital Pi Notation
A notation analogous to capital sigma is the capital pi notation. It denotes a product of elements in a collection
or of the attributes of a vector:
$\prod_{i=1}^{n} x_i = x_1 \cdot x_2 \cdot \ldots \cdot x_{n-1} \cdot x_n$,
where $a \cdot b$ means a multiplied by b. Writing ab without the dot has the same meaning.
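In code, the capital pi notation is a running product over the elements. The sketch below writes it as an explicit loop and compares it with the standard-library `math.prod` (available from Python 3.8); the vector values are arbitrary.

```python
import math


def product(values):
    """Multiply all elements together; the empty product is 1."""
    result = 1
    for v in values:
        result *= v
    return result


x = [2, 3, 4]
print(product(x))    # 24
print(math.prod(x))  # 24
```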
Operations on Vectors
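The dot product of two vectors w and x is the scalar $\mathbf{w} \cdot \mathbf{x} = \sum_{i=1}^{m} w^{(i)} x^{(i)}$. The two vectors must be of the same dimensionality; otherwise, the dot product is undefined.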
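A minimal sketch of the dot product, including the dimensionality check. It reuses the example vectors a = [2, 3], b = [-2, 5], and c = [1, 0] from the notation slide; raising `ValueError` on a mismatch is a design choice made for this example.

```python
def dot(u, v):
    """Sum of element-wise products of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("dot product is undefined for different dimensionalities")
    return sum(ui * vi for ui, vi in zip(u, v))


a = [2, 3]
b = [-2, 5]
c = [1, 0]

print(dot(a, b))  # 2*(-2) + 3*5 = 11
print(dot(a, c))  # 2*1 + 3*0 = 2
```

Calling `dot([1, 2], [1, 2, 3])` raises the error instead of silently ignoring the extra component.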
When the vector is on the left side of the matrix in the multiplication, it has to be transposed before we multiply
it by the matrix. The transpose of the vector x, denoted $\mathbf{x}^\top$, makes a row vector out of a column vector. Let’s
say $\mathbf{x} = \begin{bmatrix} x^{(1)} \\ x^{(2)} \end{bmatrix}$; then $\mathbf{x}^\top = [x^{(1)}, x^{(2)}]$.
In this case ($\mathbf{x}^\top \mathbf{W}$), we can only multiply the vector by the matrix if the vector has the same number of dimensions as the number of rows
in the matrix.
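The row-vector-times-matrix case can be sketched as below. The matrix is stored as a list of rows, and the values of x and W are arbitrary illustrations; the check mirrors the rule that the vector's dimensionality must equal the number of rows of the matrix.

```python
def row_vector_times_matrix(x, W):
    """Compute x^T W, where W is given as a list of rows."""
    if len(x) != len(W):
        raise ValueError("vector dimensionality must equal the number of matrix rows")
    n_cols = len(W[0])
    # Entry j of the result is the dot product of x with column j of W.
    return [sum(x[i] * W[i][j] for i in range(len(x))) for j in range(n_cols)]


x = [1, 2]          # 2 dimensions
W = [[1, 2, 3],     # 2 rows, 3 columns
     [4, 5, 6]]

print(row_vector_times_matrix(x, W))  # [9, 12, 15]
```

The result has as many entries as W has columns, so a 2-dimensional vector times a 2x3 matrix yields a 3-dimensional row vector.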
Derivative and Gradient
A derivative f’ of a function f is a function or a value that describes how fast f grows (or decreases).
If the derivative is a constant value, like 5 or -3, then the function grows (or decreases) at a constant rate at any
point x of its domain.
If the derivative f’ is a function, then the function f can grow at a different pace in different regions of its domain.
If the derivative f’ is positive at some point x, then the function f grows at this point.
If the derivative of f is negative at some x, then the function decreases at this point.
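These statements can be checked numerically. The sketch below approximates the derivative with a central difference; the step size `h` and the two example functions are assumptions chosen for illustration.

```python
def derivative(f, x, h=1e-5):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def line(x):
    """f(x) = 5x + 1 has the constant derivative 5."""
    return 5 * x + 1


def parabola(x):
    """f(x) = x^2 has the derivative 2x, which depends on x."""
    return x * x


print(derivative(line, 0.0))       # about 5: same growth rate everywhere
print(derivative(line, 10.0))      # about 5
print(derivative(parabola, 2.0))   # about 4: positive, so f grows at x = 2
print(derivative(parabola, -1.0))  # about -2: negative, so f decreases at x = -1
```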
The gradient is the generalization of the derivative for functions that take several inputs (or one input in
the form of a vector or some other complex structure). The gradient of a function is a vector of
partial derivatives.
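As a sketch of "a vector of partial derivatives", the code below estimates each partial derivative by nudging one input at a time. The example function f(x, y) = x^2 + 3y and the step size are assumptions; its exact gradient is [2x, 3].

```python
def numerical_gradient(f, point, h=1e-5):
    """Approximate the gradient of f at `point`, one partial derivative per input."""
    gradient = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h
        backward[i] -= h
        gradient.append((f(forward) - f(backward)) / (2 * h))
    return gradient


def f(v):
    """Example function of two inputs: f(x, y) = x^2 + 3y."""
    x, y = v
    return x ** 2 + 3 * y


g = numerical_gradient(f, [1.0, 2.0])
print(g)  # approximately [2.0, 3.0]
```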
A few other concepts
A random variable, usually written as an italic capital letter, like X, is a variable whose possible values are
numerical outcomes of a random phenomenon. There are two types of random variables: discrete and
continuous.
The probability distribution of a discrete random variable is described by a list of probabilities associated
with each of its possible values.
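For instance, the distribution of a fair six-sided die (an example chosen here, not taken from the slides) can be written as a mapping from each possible value to its probability. `fractions.Fraction` is used so the arithmetic stays exact.

```python
from fractions import Fraction

# Probability of each possible value of X, the outcome of a fair die.
pmf = {value: Fraction(1, 6) for value in range(1, 7)}

total = sum(pmf.values())                       # probabilities must sum to 1
expected_value = sum(v * p for v, p in pmf.items())
p_at_least_5 = sum(p for v, p in pmf.items() if v >= 5)

print(total)           # 1
print(expected_value)  # 7/2
print(p_at_least_5)    # 1/3
```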
A continuous random variable takes an infinite number of possible values in some interval. Examples include
height, weight, and time.
Max and Arg Max: the Max operator returns the highest value of a function over a set, and
Arg Max returns the element(s) of the set that maximize the function.
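The difference between the two operators is easy to see in code. In this sketch the set A and the functions f and g are arbitrary examples, and `arg_max` is a hypothetical helper that returns every maximizer so that ties are visible.

```python
def arg_max(f, candidates):
    """Return all elements of `candidates` at which f attains its maximum."""
    best = max(f(a) for a in candidates)
    return [a for a in candidates if f(a) == best]


A = [-2, -1, 0, 1, 2]


def f(a):
    return 4 - a * a


def g(a):
    return a * a


print(max(f(a) for a in A))  # 4: the highest value of f on A
print(arg_max(f, A))         # [0]: the element where that value is reached
print(max(g(a) for a in A))  # 4
print(arg_max(g, A))         # [-2, 2]: two elements tie for the maximum
```

Python's built-in `max(A, key=f)` also gives an arg max, but it returns only the first maximizer it encounters.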
Summary
Learning?
Applications of Machine Learning
Representation, Evaluation, Optimization
Types of Learning
Trade-offs in Machine Learning