
MACHINE LEARNING

PALLAVI SHUKLA
Assistant Professor
United College Of Engineering & Research, Prayagraj
UNIT 1 - INTRODUCTION
Learning, Types of Learning,
Well-defined learning problems,
Designing a Learning System,
History of ML, Introduction of Machine Learning Approaches – (Artificial Neural
Network,
Clustering, Reinforcement Learning,
Decision Tree Learning,
Bayesian networks, Support Vector Machine,
Genetic Algorithm),
Issues in Machine Learning and Data Science vs. Machine Learning;
What is Human learning?

• Human learning is all about observing things, recognizing a pattern, elaborating a theory or model that explains that pattern, and then putting that theory to the test and checking whether it matches most or all of the observations.
• Learning is, basically, a model that represents a pattern within a collection of observations.
• Without a feasible model, there is no learning.
Types of Human Learning:

• (1) somebody who is an expert in the subject teaches us directly,
• (2) we build our own notion indirectly, based on what we have learned from the expert in the past, or
• (3) we learn on our own, maybe after multiple attempts, some being unsuccessful.
Learning under Expert Guidance -
• An infant may inculcate certain traits and characteristics, learning straight
from its guardians.
• He calls his hand a ‘hand’ because that is the information he gets from his parents.
• The sky is ‘blue’ to him because that is what his parents have taught him.
• We say that the baby ‘learns’ things from his parents.
Learning guided by knowledge gained from experts -
• In all these situations, there is no direct learning. Past information, shared in some different context, is used as learning to make decisions.
Learning by Self -

• In many situations, humans are left to learn on their own.
• A lot of things need to be learned only from mistakes made in the past.
• We tend to form a checklist of things that we should do, and things that we should not do, based on our experiences.
Machine Learning:

• Machine learning is a branch of artificial intelligence (AI) and computer science that focuses on the use of data and algorithms to imitate the way that humans learn, gradually improving its accuracy.
• Machine Learning is the field of study that gives computers the capability to learn without being explicitly programmed (Arthur Samuel).
• Tom Mitchell's formal definition: ‘A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.’
Examples –
Handwriting recognition learning problem –
• Task T: Recognizing and classifying handwritten words within images
• Performance P: Percent of words correctly classified
• Training experience E: A dataset of handwritten words with given
classifications
A robot driving learning problem -
• Task T: Driving on highways using vision sensors
• Performance measure P: Average distance traveled before an error
• Training experience: A sequence of images and steering commands
recorded while observing a human driver.
A chess learning problem -
• Task T: Playing chess.
• Performance measure P: Percent of games won against opponents.
• Training experience E: Playing practice games against itself.
• A computer program which learns from experience is called a machine
learning program or simply a learning program.
• Such a program is sometimes also referred to as a learner.
How do machines learn?
The basic Machine Learning process can be divided into four parts.
1. Data Input: Past data or information is utilized as a basis for future decision-making.
2. Abstraction: The input data is represented in a broader way through the underlying algorithm.
3. Generalization: The abstracted representation is generalized to form a framework for making decisions.
4. Evaluation: Provides a feedback mechanism to measure the utility of learned knowledge and inform potential improvements (a sketch of the whole process follows).
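To make the four parts concrete, here is a minimal sketch in Python, assuming scikit-learn is available; the iris dataset and the decision tree model are illustrative choices, not part of the process definition.

```python
# A minimal sketch of the four-part process (illustrative choices only).
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)   # 1. Data input: past observations
model = DecisionTreeClassifier()
model.fit(X, y)                     # 2. Abstraction: fit (train) a model
preds = model.predict(X)            # 3. Generalization: apply it to inputs
print(accuracy_score(y, preds))     # 4. Evaluation: measure utility
# (Here evaluation reuses the training data for brevity; a held-out
# test set, discussed under Evaluation below, is the proper practice.)
```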
Machine Learning Process -
Data Storage:
• Facilities for storing and retrieving huge amounts of data are an important
component of the learning process. Humans and computers alike utilize
data storage as a foundation for advanced reasoning.
• In a human being, the data is stored in the brain, and data is retrieved
using electrochemical signals.
• Computers use hard disk drives, flash memory, random access memory,
and similar devices to store data and use cables and other technology to
retrieve data.
Abstraction:
• Abstraction is the process of extracting knowledge about stored data.
• This involves creating general concepts about the data as a whole.
• The creation of knowledge involves the application of known models and the
creation of new models.
• The process of fitting a model to a dataset is known as training.
• When the model has been trained, the data is transformed into an abstract
form that summarizes the original information.
• This work of assigning a broader meaning to stored data occurs during the
abstraction process, in which raw data comes to represent a wider, more
abstract concept or idea.
• This type of connection, say between an object and its representation, is
exemplified by the famous René Magritte painting The Treachery of Images.
• There are many different types of models. You may already be familiar
with some. Examples include:
• Mathematical equations
• Relational diagrams, such as trees and graphs
• Logical if/else rules
• Groupings of data known as clusters.

Generalization -
• The term generalization describes the process of turning the knowledge
about stored data into a form that can be utilized for future action.
• These actions are to be carried out on tasks that are similar, but not
identical, to those that have been seen before.
• In generalization, the goal is to discover those properties of the data that
will be most relevant to future tasks.
• It acts as a search through the entire set of models (that is, theories or
inferences) that could be established from the data during training.
• In generalization, the learner is tasked with limiting the patterns it discovers
to only those that will be most relevant to its future tasks.
• Normally, it is not feasible to reduce the number of patterns by examining
them one by one and ranking them by future utility.
• Instead, machine learning algorithms generally employ shortcuts that
reduce the search space more quickly.
• To this end, the algorithm will employ heuristics, which are educated
guesses about where to find the most useful inferences.
Evaluation -
• Evaluation is the last component of the learning process.
• It is the process of giving feedback to the user to measure the utility of the
learned knowledge.
• This feedback is then utilized to effect improvements in the whole learning
process.
• The final step in the learning process is to evaluate its success and to
measure the learner's performance in spite of its biases.
• The information gained in the evaluation phase can then be used to
inform additional training if needed.
• Generally, evaluation occurs after a model has been trained on an initial
training dataset.
• Then, the model is evaluated on a separate test dataset in order to judge
how well its characterization of the training data generalizes to new,
unseen cases.
• It's worth noting that it is exceedingly rare for a model to perfectly generalize to every unforeseen case; mistakes are almost always inevitable (a sketch of this train/test workflow follows).
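A sketch of this evaluation workflow, assuming scikit-learn; the dataset, model, and 75/25 split are illustrative choices.

```python
# Evaluate on a separate test set the learner never saw during training.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Test accuracy estimates how well the training-time abstraction
# generalizes to new, unseen cases; it is almost never perfect.
print("test accuracy:", accuracy_score(y_test, model.predict(X_test)))
```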
Well-posed learning problem:
• For defining a new problem, which can be solved using machine
learning, a simple framework, highlighted below, can be used.
• This framework also helps in deciding whether the problem is a right
candidate to be solved using machine learning.
• The framework involves answering three questions:
• 1. What is the problem?
• 2. Why does the problem need to be solved?
• 3. How to solve the problem?
• Step 1: What is the Problem?
• Several pieces of information should be collected to understand the problem.
• Informal description of the problem, e.g.
• I need a program that will prompt the next word as and when I type a word.
• Formalism - Use Tom Mitchell’s machine learning formalism stated above to define T, P, and E for the problem.
• For example: Task (T): Prompt the next word when I type a word.
Experience (E): A corpus of commonly used English words and phrases.
Performance (P): The number of correct words prompted considered as a
percentage (which in machine learning paradigm is known as learning accuracy).
Assumptions - Create a list of assumptions about the problem.
Similar problems - What other problems have you seen, or can you think of, that are similar to the problem you are trying to solve?
Step 2: Why does the problem need to be
solved?
Motivation
• What is the motivation for solving the problem?
• What requirement will it fulfill?
• For example, does this problem solve any long-standing business issue like
finding out potentially fraudulent transactions?
• Or is the purpose more trivial, like trying to suggest some movies for the upcoming weekend?
Step 3: How would I solve the problem?
• Try to explore how to solve the problem manually.
• Detail out step-by-step data collection, data preparation, and program
design to solve the problem.
• Collect all these details and update the previous sections of the problem
definition, especially the assumptions.
Introduction to ML -
PALLAVI SHUKLA
Assistant professor
Applications of Machine Learning-

• Application of machine learning methods to large databases is called data mining.


• In data mining, a large volume of data is processed to construct a simple model with
valuable use, for example, having high predictive accuracy.
• The following is a list of some of the typical applications of machine learning.
1. In retail business, machine learning is used to study consumer behavior.
2. In finance, banks analyze their past data to build models to use in credit applications,
fraud detection, and the stock market.
3. In manufacturing, learning models are used for optimization, control, and troubleshooting.
4. In medicine, learning programs are used for medical diagnosis.
5. In telecommunications, call patterns are analyzed for network optimization and
maximizing the quality of service.
• In science, large amounts of data in physics, astronomy, and biology can
only be analyzed fast enough by computers.
• The World Wide Web is huge; it is constantly growing and searching for
relevant information cannot be done manually.
• In artificial intelligence, it is used to teach a system to learn and adapt to changes so that the system designer need not foresee and provide solutions for all possible situations.
• It is used to find solutions to many problems in vision, speech recognition,
and robotics.
• Machine learning methods are applied in the design of computer-
controlled vehicles to steer correctly when driving on a variety of roads.
• Machine learning methods have been used to develop programs for
playing games such as chess, backgammon, and Go.
History of ML:
• 1950s
– Samuel’s checker player
– Selfridge’s Pandemonium
• 1960s:
– Neural networks: Perceptron
– Pattern recognition
– Learning in the limit theory
– Minsky and Papert prove limitations of Perceptron
• 1970s:
– Symbolic concept induction
– Winston’s arch learner
– Expert systems and the knowledge acquisition bottleneck
– Quinlan’s ID3
– Michalski’s AQ and soybean diagnosis
– Scientific discovery with BACON
– Mathematical discovery with AM
History of ML:
• 1980s:
– Advanced decision tree and rule learning
– Explanation-based Learning (EBL)
– Learning and planning and problem solving
– Utility problem
– Analogy
– Cognitive architectures
– Resurgence of neural networks (connectionism,
backpropagation)
– Valiant’s PAC Learning Theory
– Focus on experimental methodology
• 1990s
– Data mining
– Adaptive software agents and web applications
– Text learning
– Reinforcement learning (RL)
– Inductive Logic Programming (ILP)
– Ensembles: Bagging, Boosting, and Stacking
– Bayes Net learning
History of ML:
• 2000s
– Support vector machines & kernel methods
– Graphical models
– Statistical relational learning
– Transfer learning
– Sequence labeling
– Collective classification and structured outputs
– Computer Systems Applications (Compilers, Debugging, Graphics, Security)
– E-mail management
– Personalized assistants that learn
– Learning in robotics and vision
• 2010s
– Deep learning systems
– Learning for big data
– Bayesian methods
– Multi-task & lifelong learning
– Applications to vision, speech, social networks, learning to read, etc.
TYPES OF MACHINE LEARNING -
Supervised Learning -
• It involves the presence of a supervisor acting as a teacher.
• It is the task of learning a function that maps an input to an output based on example input-output pairs.
• A training set of examples with the correct responses (targets) is provided
and, based on this training set, the algorithm generalizes to respond
correctly to all possible inputs. This is also called learning from exemplars.
• A supervised learning algorithm analyzes the training data and produces
a function, which can be used for mapping new examples.
• If the shape of the object is rounded, has a depression at the top, and is red in color, then it will be labeled as ‘Apple’.
• If the shape of the object is a long curving cylinder with a green-yellow color, then it will be labeled as ‘Banana’.
Types of Supervised Learning -
Classification:
• Classification algorithms are used to predict/classify discrete values such as Male or Female, True or False, Spam or Not Spam, etc.
• A computer program is trained on the training dataset, and based on that training, it categorizes the data into different classes (a sketch follows below).
Regression:
• Regression algorithms are used to predict continuous values such as price, salary, age, etc.
• They work by finding the correlations between dependent and independent variables.
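A short sketch contrasting the two task types, assuming scikit-learn; the synthetic datasets are illustrative.

```python
from sklearn.datasets import make_classification, make_regression
from sklearn.linear_model import LogisticRegression, LinearRegression

# Classification: predict a discrete label (e.g., spam / not spam).
Xc, yc = make_classification(n_samples=200, n_features=5, random_state=0)
clf = LogisticRegression().fit(Xc, yc)
print("predicted class:", clf.predict(Xc[:1]))

# Regression: predict a continuous value (e.g., price, salary, age).
Xr, yr = make_regression(n_samples=200, n_features=5, random_state=0)
reg = LinearRegression().fit(Xr, yr)
print("predicted value:", reg.predict(Xr[:1]))
```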
Advantages of Supervised Algorithm -

• Supervised learning allows collecting data and produces data output from
previous experiences.
• Helps to optimize performance criteria with the help of experience.
• Supervised machine learning helps to solve various types of real-world
computation problems.
• It performs classification and regression tasks.
• It allows estimating or mapping the result to a new sample.
• We have complete control over choosing the number of classes we want in
the training data.
Disadvantages Of Supervised Algorithm -

• Classifying big data can be challenging.


• Training for supervised learning needs a lot of computation time, so it is time-consuming.
• Supervised learning cannot handle all complex tasks in Machine Learning.
• Computation time is vast for supervised learning.
• It requires a labeled data set.
• It requires a training process.
Introduction to
Machine Learning –
Lecture 3
PALLAVI SHUKLA
Assistant professor
Unsupervised Learning -

• It is the training of a machine using information that is neither classified nor labeled, allowing the algorithm to act on that information without guidance.
• The task of the machine is to group unsorted information according to similarities, patterns, and differences without any prior training on the data.
• It is the type of machine learning algorithm used to draw inferences from datasets consisting of input data without labeled responses.
Types of Unsupervised Learning -
Clustering -
• A clustering problem is where you want to discover the inherent groupings in the data, such as grouping customers by purchasing behavior.
• It is a method of grouping objects into clusters such that objects with the most similarities remain in one group and have few or no similarities with the objects of another group.
Association -
• It is used for finding relationships between variables in a large database.
• It determines the sets of items that occur together in the dataset. Association rules make marketing strategy more effective (a toy sketch follows).
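As a toy sketch of association analysis, the snippet below counts item pairs that occur together across made-up transactions; real systems use dedicated algorithms such as Apriori.

```python
from itertools import combinations
from collections import Counter

transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"milk", "butter"},
    {"bread", "butter"},
]

# Count how often each pair of items appears in the same transaction.
pair_counts = Counter()
for basket in transactions:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Support = fraction of transactions containing the pair.
for pair, count in pair_counts.most_common(3):
    print(pair, "support =", count / len(transactions))
```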
Advantages of Unsupervised Learning:
• It does not require training data to be labeled.
• Dimensionality reduction can be easily accomplished using unsupervised learning.
• Capable of finding previously unknown patterns in data.
• Flexibility: Unsupervised learning is flexible in that it can be applied to a wide
variety of problems, including clustering, anomaly detection, and association rule
mining.
• Exploration: Unsupervised learning allows for the exploration of data and the
discovery of novel and potentially useful patterns that may not be apparent from the
outset.
• Low cost: Unsupervised learning is often less expensive than supervised learning
because it doesn’t require labeled data, which can be time-consuming and costly to
obtain.
Disadvantages of Unsupervised Learning :
• Difficult to measure accuracy or effectiveness due to lack of predefined answers during
training.
• The results often have lesser accuracy.
• The user needs to spend time interpreting and labeling the classes that follow from the clustering.
• Lack of guidance: Unsupervised learning lacks the guidance and feedback provided by
labeled data, which can make it difficult to know whether the discovered patterns are
relevant or useful.
• Sensitivity to data quality: Unsupervised learning can be sensitive to data quality,
including missing values, outliers, and noisy data.
• Scalability: Unsupervised learning can be computationally expensive, particularly for
large datasets or complex algorithms, which can limit its scalability.
REINFORCEMENT LEARNING -
• It is the problem of getting an agent to act in the world so as to maximize
its rewards.
• A learner (the program) is not told what actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them (a sketch follows below).
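A minimal sketch of this trial-and-error idea on a two-armed bandit, assuming only NumPy; the reward probabilities and exploration rate are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
true_reward_prob = [0.3, 0.7]   # hidden from the agent
estimates = [0.0, 0.0]          # agent's value estimate per action
counts = [0, 0]
epsilon = 0.1                   # exploration rate

for step in range(1000):
    # Explore occasionally, otherwise exploit the best-looking action.
    if rng.random() < epsilon:
        action = rng.integers(2)
    else:
        action = int(np.argmax(estimates))
    reward = float(rng.random() < true_reward_prob[action])
    counts[action] += 1
    # Incremental average: the agent discovers reward by trying actions.
    estimates[action] += (reward - estimates[action]) / counts[action]

print("learned estimates:", estimates)   # approaches [0.3, 0.7]
```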
Application of Reinforcement Learning -

• 1. Robotics: Robots with pre-programmed behavior are useful in structured environments, such as the assembly line of an automobile manufacturing plant, where the task is repetitive in nature.
• 2. A master chess player makes a move. The choice is informed by planning, anticipating possible replies and counter-replies.
• 3. An adaptive controller adjusts parameters of a petroleum refinery’s
operation in real time.
Advantages of Reinforcement Learning-
• 1. Can be used to solve very complex problems that cannot be solved by conventional
techniques.
• 2. The model can correct the errors that occurred during the training process.
• 3. In RL, training data is obtained via the direct interaction of the agent with the
environment
• 4. Can handle environments that are non-deterministic, meaning that the outcomes of
actions are not always predictable. This is useful in real-world applications where the
environment may change over time or is uncertain.
• 5. Can be used to solve a wide range of problems, including those that involve decision-
making, control, and optimization.
• 6. A flexible approach that can be combined with other machine learning techniques,
such as deep learning, to improve performance.
Disadvantages of Reinforcement Learning -

• 1. It is not preferable to use it for solving simple problems.


• 2. It needs a lot of data and a lot of computation.
• 3. It is highly dependent on the quality of the reward function. If the
reward function is poorly designed, the agent may not learn the desired
behavior.
• 4. It can be difficult to debug and interpret. It is not always clear why the
agent is behaving in a certain way, which can make it difficult to diagnose
and fix problems.
Introduction to
machine learning
approaches- Lecture 4
Pallavi Shukla
Assistant professor
Computer science & engineering
Introduction to Machine Learning Approaches-

• Artificial Neural Network


• Clustering
• Reinforcement Learning
• Decision Tree Learning
• Bayesian Networks
• Support Vector Machine
• Genetic Algorithm
Artificial Neural Network(ANN):

• It is an information processing paradigm inspired by the way biological nervous systems, such as the brain, process information.
• It is composed of a large number of highly interconnected processing elements (neurons) working in unison to solve a specific problem.
• Biological neurons (also called nerve cells), or simply neurons, are the fundamental units of the brain and nervous system: the cells responsible for receiving sensory input from the external world via dendrites, processing it, and giving output through axons.
• Cell body (Soma): The body of the neuron cell contains the nucleus and carries out the biochemical transformations necessary for the life of the neuron.
• Dendrites: Each neuron has fine, hair-like tubular structures
(extensions) around it. They branch out into a tree around the cell
body. They accept incoming signals.
• Axon: It is a long, thin, tubular structure that works like a transmission
line.
• Synapse: Neurons are connected to one another in a complex spatial arrangement. When the axon reaches its final destination, it branches again, in what is called terminal arborization. At the end of the axon are highly complex and specialized structures called synapses.
• Dendrites receive input through the synapses of other neurons.
• The soma processes these incoming signals over time and converts that processed value into an output, which is sent out to other neurons through the axon and the synapses.
• The general model of an ANN is inspired by the biological neuron; this single-unit model is also called a Perceptron.
• A single layer neural network is called a Perceptron.
• It gives a single output.
• For one single observation, x1, x2, x3, ..., xn represent the various inputs (independent variables) to the network.
• Each of these inputs is multiplied by a connection weight, or synapse.
• The weights are represented as w1, w2, w3, ..., wn.
• Weight shows the strength of a particular node. b is a bias value.
• A bias value allows you to shift the activation function up or down.
• In the simplest case, these products are summed, fed to a
transfer function (activation function) to generate a result,
and this result is sent as output.
• Mathematically: x1·w1 + x2·w2 + x3·w3 + ... + xn·wn = ∑ xi·wi
• Then the activation function is applied: 𝜙(∑ xi·wi + b)
• The activation function is important for an ANN to learn and make sense of something really complicated. Its main purpose is to convert the input signal of a node in an ANN to an output signal. This output signal is used as input to the next layer in the stack (a sketch of the computation follows).
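A sketch of this computation in Python, assuming NumPy; the inputs, weights, bias, and step activation are illustrative numbers and choices.

```python
import numpy as np

def step(z):
    """A simple threshold activation: phi(z) = 1 if z >= 0 else 0."""
    return np.where(z >= 0, 1, 0)

x = np.array([0.5, -1.0, 2.0])   # inputs x1..xn
w = np.array([0.4, 0.3, -0.2])   # connection weights w1..wn
b = 0.1                          # bias shifts the activation

z = np.dot(x, w) + b             # sum(xi * wi) + b
output = step(z)                 # phi(sum(xi * wi) + b)
print(z, output)
```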
CLUSTERING -

• Clustering is the task of dividing the population or data points into a number of groups such that data points in the same group are more similar to each other than to those in other groups.
• In simple words, the aim is to segregate groups with similar traits and assign them into clusters.
Example of Clustering -

• Suppose you are the head of a rental store and wish to understand the preferences of your customers to scale up your business.
• Is it possible for you to look at the details of each customer and devise a unique business strategy for each one of them?
• Definitely not.
• But what you can do is cluster all of your customers into, say, 10 groups based on their purchasing habits and use a separate strategy for customers in each of these 10 groups. And this is what we call clustering.
Types of Clustering:

• Hard Clustering: In hard clustering, each data point either belongs to a cluster completely or not. For example, in the above scenario, each customer is put into one group out of the 10 groups.

• Soft Clustering: In soft clustering, instead of putting each data point into a separate cluster, a probability or likelihood of that data point belonging to each cluster is assigned. For example, in the above scenario, each customer is assigned a probability of being in each of the 10 clusters of the retail store (a sketch contrasting the two follows).
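A sketch contrasting hard and soft clustering, assuming scikit-learn; the synthetic data and the choice of 3 clusters are illustrative.

```python
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

X, _ = make_blobs(n_samples=150, centers=3, random_state=0)

# Hard clustering: each point is assigned to exactly one cluster.
hard_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
print("hard assignment of first point:", hard_labels[0])

# Soft clustering: each point gets a probability for every cluster.
gmm = GaussianMixture(n_components=3, random_state=0).fit(X)
print("soft assignment of first point:", gmm.predict_proba(X[:1]))
```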
Decision tree -

• It is like a tree structure that works on the principle of conditions.
• It is efficient and has strong algorithms used for predictive analysis.
• Its main components are internal nodes, branches, and terminal (leaf) nodes.
• Each internal node holds a “test” on an attribute, each branch holds the outcome of the test, and each leaf node represents a class label.
• This is among the most used algorithms in supervised learning techniques.
• It is used for both classification and regression.
• It is often termed “CART”, which stands for Classification and Regression Tree.
• Tree algorithms are always preferred due to stability and reliability.
• Branches - Division of the whole tree is called branches.
• Root Node - Represent the whole sample that is further divided
• Splitting - Division of nodes is called splitting.
• Terminal Node - Node that does not split further is called a terminal
node.
• Decision Node - A sub-node that gets further divided into different sub-nodes.
• Pruning - Removal of sub-nodes from a decision node.
• Parent and Child Node - When a node gets divided further, that node is termed the parent node, whereas the divided nodes (sub-nodes) are termed child nodes of the parent node (a sketch follows).
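A sketch of CART-style tree learning, assuming scikit-learn; the dataset and the depth limit are illustrative choices.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
# max_depth limits splitting, a simple form of pruning.
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Each internal node tests an attribute; each leaf carries a class label.
print(export_text(tree))
print("predicted class:", tree.predict(X[:1]))
```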
Advantages of the Decision Tree:

1. It is simple to understand, as it follows the same process a human follows while making any decision in real life.
2. It can be very useful for solving decision-related problems.
3. It helps to think about all the possible outcomes for a problem.
4. There is less requirement for data cleaning compared to other algorithms.
Disadvantages of the Decision Tree:

1. The decision tree contains lots of layers, which makes it complex.
2. It may have an overfitting issue, which can be resolved using the Random Forest algorithm.
3. For more class labels, the computational complexity of the decision tree may increase.
Bayesian networks -

• Bayesian networks are a type of probabilistic graphical model that


uses Bayesian inference for probability computations.
• Bayesian networks aim to model conditional dependence and
causation by representing conditional dependence by edges in a
directed graph.
• Through these relationships, one can efficiently conduct inference
on the random variables in the graph through the use of factors.
• Using the relationships specified by our Bayesian network, we can
obtain a compact, factorized representation of the joint probability
distribution by taking advantage of conditional independence.
• Bayesian network is a directed acyclic graph in which each
edge corresponds to a conditional dependency, and each
node corresponds to a unique random variable.
• Formally, if an edge (A, B) exists in the graph connecting
random variables A and B, it means that P(B|A) is a factor in
the joint probability distribution, so we must know P(B|A) for all
values of B and A in order to conduct inference.
• For example, in a network where Rain has an edge going into WetGrass, P(WetGrass|Rain) will be a factor, whose probability values would be specified in a conditional probability table at the WetGrass node (a worked sketch follows).
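A hand-rolled sketch of this Rain → WetGrass factorization; all probability numbers below are invented for illustration.

```python
P_rain = {True: 0.2, False: 0.8}                     # P(Rain)
P_wet_given_rain = {True: {True: 0.9, False: 0.1},   # P(WetGrass | Rain)
                    False: {True: 0.2, False: 0.8}}

# The joint distribution factorizes as P(Rain) * P(WetGrass | Rain).
def joint(rain, wet):
    return P_rain[rain] * P_wet_given_rain[rain][wet]

# Inference: P(Rain=True | WetGrass=True) via Bayes' rule.
numerator = joint(True, True)
evidence = joint(True, True) + joint(False, True)
print("P(Rain | WetGrass) =", numerator / evidence)  # 0.18/0.34 ~= 0.529
```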
Introduction to
machine learning
approaches(PART II)
Lecture 5
Pallavi Shukla
Assistant professor

Computer science & engineering


Support Vector Machines -

• Support Vector Machine (SVM) is a powerful machine learning algorithm used for linear or nonlinear classification, regression, and even outlier detection tasks.
• SVMs can be used for a variety of tasks, such as text
classification, image classification, spam
detection, handwriting identification, gene expression
analysis, face detection, and anomaly detection.
• SVMs are adaptable and efficient in a variety of applications
because they can manage high-dimensional data and
nonlinear relationships.
Support Vector Machines -

• In SVM, we plot each data item in the dataset in an N-dimensional space, where N is the number of features/attributes in the data.
• Next, we find the optimal hyperplane to separate the data.
• From this, you can see that, inherently, SVM performs binary classification (i.e., it chooses between two classes).
• The main objective of the SVM algorithm is to find the optimal hyperplane in an N-dimensional space that can separate the data points of different classes in the feature space.
Support Vector Machines -

• Let’s consider two independent variables x1, x2, and one dependent variable that is either a blue circle or a red circle.
• There are multiple lines (our hyperplane here is a line because we are considering only two input features x1, x2) that segregate our data points, i.e., perform a classification between red and blue circles. So how do we choose the best line, or in general the best hyperplane, that segregates our data points?
Support Vector Machine Terminology -

1. Hyperplane: The hyperplane is the decision boundary used to separate the data points of different classes in a feature space. In the case of linear classification, it is a linear equation, i.e., w·x + b = 0.
2. Support Vectors: Support vectors are the data points closest to the hyperplane, which play a critical role in deciding the hyperplane and margin.
3. Margin: The margin is the distance between the support vectors and the hyperplane. The main objective of the support vector machine algorithm is to maximize the margin. A wider margin indicates better classification performance.
Support Vector Machine Terminology -

4. Kernel: The kernel is a mathematical function used in SVM to map the original input data points into high-dimensional feature spaces, so that the hyperplane can be found easily even if the data points are not linearly separable in the original input space. Some common kernel functions are linear, polynomial, radial basis function (RBF), and sigmoid.
5. Hard Margin: The maximum-margin hyperplane or the hard margin hyperplane is a hyperplane
that properly separates the data points of different categories without any misclassifications.
6. Soft Margin: When the data is not perfectly separable or contains outliers, SVM permits a soft margin technique. Each data point has a slack variable introduced by the soft-margin SVM formulation, which softens the strict margin requirement and permits certain misclassifications or violations. It finds a compromise between increasing the margin and reducing violations (a sketch follows).
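A sketch of a soft-margin SVM with an RBF kernel, assuming scikit-learn; the synthetic data and parameter values are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=4, random_state=0)

# kernel='rbf' maps inputs to a higher-dimensional feature space;
# C trades margin width against misclassifications (soft margin):
# a smaller C allows a wider margin with more violations tolerated.
model = SVC(kernel="rbf", C=1.0).fit(X, y)

print("support vectors per class:", model.n_support_)
print("predicted class:", model.predict(X[:1]))
```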
Genetic Algorithm -

• A genetic algorithm (GA) is a heuristic search algorithm used to solve search and optimization problems. This algorithm is a subset of evolutionary algorithms, which are used in computation. Genetic algorithms employ the concepts of genetics and natural selection to provide solutions to problems.
• These algorithms have better intelligence than random search algorithms because they use historical data to take the search to the best-performing region within the solution space.
• GAs are also based on the behavior of chromosomes and their genetic
structure.
• Every chromosome plays the role of providing a possible solution.
• The fitness function helps in providing the characteristics of all individuals within the population.
• The greater the fitness value, the better the solution.
Phases of genetic algorithm :

• Initialization
• Fitness assignment
• Selection
• Reproduction
• Crossover
• Mutation
INITIALIZATION

• The genetic algorithm starts by generating an initial population.
• This initial population consists of all the probable
solutions to the given problem.
• The most popular technique for initialization is the use
of random binary strings.
Fitness assignment

• The fitness function helps in establishing the fitness of all individuals in the population.
• It assigns a fitness score to every individual, which further
determines the probability of being chosen for
reproduction.
• The higher the fitness score, the higher the chances of
being chosen for reproduction.
Selection

• In this phase, individuals are selected for the reproduction of offspring.


• The selected individuals are then arranged in pairs of two to enhance reproduction.
• These individuals pass on their genes to the next generation.
• The main objective of this phase is to establish the region with high
chances of generating the best solution to the problem (better than the
previous generation).
• The genetic algorithm uses the fitness proportionate selection technique
to ensure that useful solutions are used for recombination.
Reproduction

• This phase involves the creation of a child population.
• The algorithm employs variation operators that are
applied to the parent population.
• The two main operators in this phase include
crossover and mutation.
Crossover:

• This operator swaps the genetic information of two parents to reproduce an offspring.
• It is performed on parent pairs that are selected
randomly to generate a child population of equal
size as the parent population.
Mutation:

• This operator adds new genetic information to the new child population.
• This is achieved by flipping some bits in the chromosome.
• Mutation solves the problem of local minimum and enhances
diversification.
• For example, flipping the third bit of the string 10110 yields 10010 (a complete sketch of all the phases follows).
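A compact sketch of all the phases on a toy problem, maximizing the number of 1-bits in a binary string, assuming only the Python standard library; population size, generation count, and mutation rate are illustrative choices.

```python
import random

random.seed(0)
LENGTH, POP_SIZE, GENERATIONS = 20, 30, 40

def fitness(chrom):                 # fitness assignment
    return sum(chrom)

# Initialization: random binary strings.
population = [[random.randint(0, 1) for _ in range(LENGTH)]
              for _ in range(POP_SIZE)]

for _ in range(GENERATIONS):
    # Selection: fitness-proportionate choice of parents.
    weights = [fitness(c) + 1 for c in population]
    parents = random.choices(population, weights=weights, k=POP_SIZE)

    # Reproduction via crossover: swap genetic material at a random point.
    children = []
    for a, b in zip(parents[::2], parents[1::2]):
        point = random.randrange(1, LENGTH)
        children += [a[:point] + b[point:], b[:point] + a[point:]]

    # Mutation: occasionally flip a bit to add new genetic information.
    for child in children:
        if random.random() < 0.1:
            i = random.randrange(LENGTH)
            child[i] ^= 1

    population = children

print("best fitness:", max(fitness(c) for c in population))  # nears 20
```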
ISSUES IN MACHINE
LEARNING- Lecture 6
Pallavi Shukla
Assistant professor
Computer science & engineering
Inadequate Training Data -

• Lack of quality as well as quantity of data.


• Many data scientists claim that inadequate, noisy, and unclean data are extremely taxing for machine learning algorithms.
• For example, a simple task requires thousands of sample data, and an advanced task
such as speech or image recognition needs millions of sample data examples.
• Further, data quality is also important for the algorithms to work ideally, but the absence
of data quality is also found in Machine Learning applications.
Factors that affect data quality -

• Noisy Data- It is responsible for an inaccurate prediction that affects the decision as well as
accuracy in classification tasks.

• Incorrect data- It is also responsible for faulty programming and results obtained in machine
learning models. Hence, incorrect data may affect the accuracy of the results also.

• Generalizing output data - Sometimes generalizing output data becomes complex, which results in comparatively poor future actions.
Poor quality of data-

• Noisy data, incomplete data, inaccurate data, and unclean data lead to less accuracy in classification and low-quality results.
• Hence, data quality can also be considered a major common problem in processing machine learning algorithms.
Non-representative training data -

• To make sure our training model is generalized well or not, we have to ensure that sample
training data is representative of new cases that we need to generalize.
• The training data must cover all cases that have already occurred as well as occurring.
• Further, if we are using non-representative training data in the model, it results in less
accurate predictions.
• A machine learning model is said to be ideal if it predicts well for generalized cases and
provides accurate decisions.
• If there is too little training data, there will be sampling noise in the model, and the training set is called a non-representative training set.
• Such a model won't be accurate in its predictions; it will be biased toward one class or group.
Overfitting and Underfitting -
Overfitting –
• Overfitting is one of the most common issues faced by Machine Learning engineers and data scientists.
• Whenever a machine learning model is trained with a huge amount of data, it starts capturing the noise and inaccurate entries present in the training data set.
• This negatively affects the performance of the model.
• Let's understand with a simple example where we have a training data set with 1000 mangoes, 1000 apples, 1000 bananas, and 5000 papayas.
• There is then a considerable probability of identifying an apple as a papaya, because we have a massive amount of biased data in the training data set; hence the prediction is negatively affected.
• The main reason behind overfitting is the use of highly non-linear methods in machine learning algorithms, as they can build unrealistic data models.
• We can reduce overfitting by using simpler linear and parametric algorithms in the machine learning models.
Methods to reduce overfitting:

• Increase training data in a dataset.
• Reduce model complexity by selecting a model with fewer parameters.
• Ridge regularization and Lasso regularization (sketched below).
• Early stopping during the training phase.
• Reduce the noise.
• Reduce the number of attributes in training data.
• Constrain the model.
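A sketch of one listed remedy, ridge (L2) regularization, which constrains a flexible polynomial model, assuming scikit-learn; the data, polynomial degree, and alpha value are illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(30, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=30)

# High-degree polynomial features invite overfitting to noise...
overfit = make_pipeline(PolynomialFeatures(12), LinearRegression()).fit(X, y)
# ...while the ridge penalty constrains the model's coefficients.
regular = make_pipeline(PolynomialFeatures(12), Ridge(alpha=1.0)).fit(X, y)

# The unregularized fit typically ends up with much larger coefficients.
print("unregularized coefficient scale:", np.abs(overfit[-1].coef_).max())
print("ridge coefficient scale:", np.abs(regular[-1].coef_).max())
```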
Underfitting :

• Underfitting is just the opposite of overfitting.
• Whenever a machine learning model is trained with too little data, it yields incomplete and inaccurate predictions and destroys the accuracy of the machine learning model.
• Underfitting occurs when our model is too simple to capture the underlying structure of the data, just like an undersized pant.
• This generally happens when we have limited data in the data set and we try to build a linear model with non-linear data.
• In such scenarios, the model's rules become too simple to apply to the data set, and the model starts making wrong predictions as well.
Methods to reduce Underfitting:

• Increase model complexity.
• Remove noise from the data.
• Train on more and better features.
• Reduce the constraints.
• Increase the number of epochs to get better results.
Monitoring and maintenance -

• As we know that generalized output data is mandatory for any machine


learning model; hence, regular monitoring and maintenance become
compulsory for the same.
• Different actions produce different results and require data changes; hence, editing code and providing resources for monitoring also become necessary.
Getting bad recommendations-

• A machine learning model operates under a specific context which results in bad
recommendations and concept drift in the model.
• For example, at a specific time a customer is looking for some gadgets, but customer requirements change over time; the machine learning model, however, keeps showing the same recommendations even though the customer's expectations have changed.
• This incident is called a Data Drift.
• It generally occurs when new data is introduced or the interpretation of data changes.
However, we can overcome this by regularly updating and monitoring data according to
the expectations.
Lack of skilled resources -

• Although Machine Learning and Artificial Intelligence are continuously growing in the market, these industries are still young in comparison to others.
• The absence of skilled resources in the form of manpower is also an issue.
• Hence, we need manpower having in-depth knowledge of mathematics,
science, and technologies for developing and managing scientific
substances for machine learning.
Customer Segmentation -

• Customer segmentation is also an important issue while developing a


machine learning algorithm.
• The challenge is to identify the customers who act on the recommendations shown by the model and those who do not even check them.
• Hence, an algorithm is necessary to recognize the customer behavior and
trigger a relevant recommendation for the user based on past experience.
Process Complexity of Machine Learning

• The machine learning process is very complex, which is also another major issue faced by
machine learning engineers and data scientists.
• Machine Learning and Artificial Intelligence are very new technologies, still in an experimental phase and continuously changing over time.
• Much of the work is hit-and-trial experimentation; hence, the probability of error is higher than expected.
• Further, it also includes analyzing the data, removing data bias, training data, applying
complex mathematical calculations, etc., making the procedure more complicated and
quite tedious.
Data Bias

• Data bias is also a big challenge in Machine Learning.
• These errors exist when certain elements of the dataset are weighted more heavily or given more importance than others.
• Biased data leads to inaccurate results, skewed outcomes, and other analytical errors.
• However, we can resolve this error by determining where data is actually biased in the
dataset.
• Further, take necessary steps to reduce it.
Methods to remove Data Bias:

• Research more for customer segmentation.


• Be aware of your general use cases and potential outliers.
• Combine inputs from multiple sources to ensure data diversity.
• Include bias testing in the development process.
• Analyze data regularly and keep tracking errors to resolve them easily.
• Review the collected and annotated data.
• Use multi-pass annotation such as sentiment analysis, content moderation, and intent
recognition.
Lack of Explainability -

• The outputs cannot always be easily comprehended, as models are programmed in specific ways to deliver outputs under certain conditions.
• Hence, a lack of explainability is also found in machine learning algorithms, which reduces the credibility of the algorithms.
Slow implementations and results -

• This issue is also very commonly seen in machine learning models.


• Machine learning models can be highly efficient in producing accurate results but are time-consuming.
• Slow programming, excessive requirements, and overloaded data take more time to provide accurate results than expected.
• This needs continuous maintenance and monitoring of the model
for delivering accurate results.
Irrelevant features -

• Although machine learning models are intended to give the best possible outcome, if we feed garbage data as input, then the result will also be garbage.
• Hence, we should use relevant features in our training sample.
• A machine learning model is said to be good if the training data has a good set of features, with few to no irrelevant features.
DATA SCIENCE VS
MACHINE LEARNING
Data Science | Machine Learning
Data Science is a field about processes and systems to extract data from structured and semi-structured data. | Machine Learning is a field of study that gives computers the capability to learn without being explicitly programmed.
Needs the entire analytics universe. | A combination of machine and data science.
A branch that deals with data. | Machines utilize data science techniques to learn about the data.
Data in Data Science may or may not have evolved from a machine or mechanical process. | It uses various techniques like regression and supervised clustering.
Data Science, as a broader term, not only focuses on algorithms and statistics but also takes care of the data processing. | It is focused only on algorithms and statistics.
It is a broad term for multiple disciplines. | It fits within data science.
Covers many operations of data science: data gathering, data cleaning, data manipulation, etc. | It is of three types: supervised learning, unsupervised learning, and reinforcement learning.
Example: Netflix uses Data Science technology. | Example: Facebook uses Machine Learning technology.
