Unit I Notes - Machine Learning Techniques

Machine Learning Techniques (Dr. A.P.J. Abdul Kalam Technical University)


Machine Learning Techniques (KCS 055)

UNIT-I

INTRODUCTION – Learning, Types of Learning, Well defined learning problems, Designing a Learning
System, History of ML, Introduction of Machine Learning Approaches – (Artificial Neural Network,
Clustering, Reinforcement Learning, Decision Tree Learning, Bayesian networks, Support Vector
Machine, Genetic Algorithm), Issues in Machine Learning and Data Science Vs Machine Learning;

Learning: Definition
A computer program is said to learn from experience E with respect to some class of tasks T and
performance measure P, if its performance at tasks T, as measured by P, improves with experience E.
Examples
i) Handwriting recognition learning problem
• Task T: Recognizing and classifying handwritten words within images
• Performance measure P: Percent of words correctly classified
• Training experience E: A dataset of handwritten words with given classifications
ii) A robot driving learning problem
• Task T: Driving on highways using vision sensors
• Performance measure P: Average distance traveled before an error
• Training experience E: A sequence of images and steering commands recorded while observing a human
driver
iii) A chess learning problem
• Task T: Playing chess
• Performance measure P: Percent of games won against opponents
• Training experience E: Playing practice games against itself

Definition
A computer program which learns from experience is called a machine learning program or
simply a learning program. Such a program is sometimes also referred to as a learner.

History of machine learning

Machine learning was first conceived from the mathematical modeling of neural networks. A paper by
logician Walter Pitts and neuroscientist Warren McCulloch, published in 1943, attempted to
mathematically map out thought processes and decision making in human cognition.

In 1950, Alan Turing proposed the Turing Test, which became the litmus test for deciding whether a
machine was "intelligent" or "unintelligent." To be deemed intelligent, a machine had to be able to
convince a human being that it, the machine, was also a human being.

AI and machine learning algorithms aren't new. The field of AI dates back to the 1950s. Arthur
Lee Samuel, an IBM researcher, developed one of the earliest machine learning programs, a
self-learning program for playing checkers. In fact, he coined the term machine learning. His approach
to machine learning was explained in a paper published in the IBM Journal of Research and
Development in 1959. Over the decades, AI techniques have been widely used as a method of
improving the performance of underlying code. In the last few years, with the focus on distributed
computing models and cheaper compute and storage, there has been a surge of interest in AI and
machine learning that has led to a huge amount of money being invested in startup software
companies.

1-MLT/KCS055/CSE/IT/CSDS/AITM/DR ABHAY SHUKLA

What Is Machine Learning?


Machine learning is programming computers to optimize a performance criterion using example data or
past experience. We have a model defined up to some parameters, and learning is the execution
of a computer program to optimize the parameters of the model using the training data or past
experience. The model may be predictive to make predictions in the future, or descriptive to gain
knowledge from data, or both.
Arthur Samuel, an early American leader in the field of computer gaming and artificial
intelligence, coined the term "Machine Learning" in 1959 while at IBM. He defined machine learning as
"the field of study that gives computers the ability to learn without being explicitly programmed."
However, there is no universally accepted definition for machine learning. Different authors define the
term differently.
Components of Learning
The learning process, whether by a human or a machine, can be divided into four components, namely,
data storage, abstraction, generalization and evaluation. The following figure illustrates the various
components and the steps involved in the learning process.

1. Data storage
Facilities for storing and retrieving huge amounts of data are an important component of the learning
process. Humans and computers alike utilize data storage as a foundation for advanced
reasoning.
• In a human being, the data is stored in the brain and data is retrieved using electrochemical signals.
• Computers use hard disk drives, flash memory, random access memory and similar devices to store
data and use cables and other technology to retrieve data.
2. Abstraction
The second component of the learning process is known as abstraction. Abstraction is the process of
extracting knowledge about stored data. This involves creating general concepts about the data as a
whole. The creation of knowledge involves application of known models and creation of new models.
The process of fitting a model to a dataset is known as training. When the model has been trained, the
data is transformed into an abstract form that summarizes the original information.
3. Generalization
The third component of the learning process is known as generalization. The term generalization
describes the process of turning the knowledge about stored data into a form that can be utilized for
future action. These actions are to be carried out on tasks that are similar, but not identical, to those
that have been seen before. In generalization, the goal is to discover those properties of the data that
will be most relevant to future tasks.
4. Evaluation
Evaluation is the last component of the learning process. It is the process of giving feedback to the user
to measure the utility of the learned knowledge. This feedback is then utilized to effect improvements in
the whole learning process.
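
The four components above can be sketched on a toy problem. The following is a minimal illustration, not a prescribed method: the data values are made up, and the choice of a straight-line model fitted by least squares is an assumption made only to keep the example short.

```python
# Data storage: a small in-memory dataset of (x, y) pairs (made-up values).
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0), (5.0, 10.0), (6.0, 12.0)]
train, test = data[:4], data[4:]


def fit_line(points):
    """Abstraction: summarize the points as (slope, intercept) by least squares."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Generalization: apply the learned summary to an input not seen in training."""
    slope, intercept = model
    return slope * x + intercept


def mean_squared_error(model, points):
    """Evaluation: measure how far the predictions are from the known answers."""
    return sum((predict(model, x) - y) ** 2 for x, y in points) / len(points)


model = fit_line(train)                        # training = fitting the model
test_error = mean_squared_error(model, test)   # feedback on held-out data
print(model, test_error)
```

Here the trained model (two numbers) is the abstract form that summarizes the stored data, and the error on the held-out points is the feedback that evaluation provides.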

Applications of Machine learning

1. Image Recognition:

Image recognition is one of the most common applications of machine learning. It is used to identify
objects, persons, places and so on in digital images. A popular use case of image recognition and face
detection is automatic friend tagging suggestion: Facebook provides a feature that suggests friend
tags automatically. Whenever we upload a photo with our Facebook friends, we automatically get
a tagging suggestion with their names, and the technology behind this is machine learning's face
detection and recognition algorithm.

2. Speech Recognition

While using Google, we get an option to "Search by voice." This comes under speech recognition, which is a
popular application of machine learning. Speech recognition is the process of converting voice instructions
into text, and it is also known as "speech to text" or "computer speech recognition." At present,
machine learning algorithms are widely used in speech recognition applications. Google
Assistant, Siri, Cortana, and Alexa use speech recognition technology to follow voice
instructions.

3. Traffic prediction:

If we want to visit a new place, we take the help of Google Maps, which shows us the correct path with the
shortest route and predicts the traffic conditions. It predicts whether traffic
is clear, slow-moving, or heavily congested in two ways:

o Real-time location of the vehicle from the Google Maps app and sensors
o Average time taken on past days at the same time of day.

Everyone who uses Google Maps is helping to make the app better. It takes information from the
user and sends it back to its database to improve performance.

4. Product recommendations:

Machine learning is widely used by various e-commerce and entertainment companies, such
as Amazon and Netflix, for product recommendation to the user. Whenever we search for some
product on Amazon, we start getting advertisements for the same product while surfing the internet
in the same browser, and this is because of machine learning. Google understands the user's
interests using various machine learning algorithms and suggests products accordingly.
Similarly, when we use Netflix, we find recommendations for series, movies, etc.,
and this is also done with the help of machine learning.

5. Self-driving cars:

One of the most exciting applications of machine learning is self-driving cars. Machine learning plays a
significant role in self-driving cars. Tesla, a well-known car manufacturer, is working on
self-driving cars. It uses machine learning to train the car's models to detect people and
objects while driving.

6. Email Spam and Malware Filtering:

Whenever we receive a new email, it is filtered automatically as important, normal, or spam.
Important mail arrives in our inbox marked with the important symbol, and spam emails go to our spam
box. The technology behind this is machine learning. Below are some spam filters used by Gmail:

o Content Filter
o Header filter
o General blacklists filter
o Rules-based filters
o Permission filters

Some machine learning algorithms such as Multi-Layer Perceptron, Decision tree, and Naïve Bayes
classifier are used for email spam filtering and malware detection.

7. Virtual Personal Assistant:

We have various virtual personal assistants such as Google Assistant, Alexa, Cortana and Siri. As the name
suggests, they help us find information using our voice instructions. These assistants can help us
in various ways just by voice instruction, such as playing music, calling someone, opening an email,
or scheduling an appointment. Machine learning algorithms are an
important part of these virtual assistants. The assistants record our voice instructions, send them to a server in the cloud,
decode them using ML algorithms and act accordingly.

8. Online Fraud Detection:

Machine learning makes our online transactions safe and secure by detecting fraudulent transactions.
Whenever we perform an online transaction, there are various ways in which a fraudulent
transaction can take place, such as fake accounts, fake IDs, and theft of money in the middle of a
transaction. To detect this, a feed-forward neural network helps us by checking whether a transaction is
genuine or fraudulent. For each genuine transaction, the output is converted into
some hash values, and these values become the input for the next round. Each genuine transaction
follows a specific pattern, which changes for a fraudulent transaction; the network detects this change and
makes our online transactions more secure.

9. Stock Market trading:

Machine learning is widely used in stock market trading. In the stock market, there is always a risk of ups
and downs in share prices, so a long short-term memory (LSTM) neural network is used for
the prediction of stock market trends.

10. Medical Diagnosis:

In medical science, machine learning is used for disease diagnosis. With this, medical technology is
growing very fast and is able to build 3D models that can predict the exact position of lesions in the brain.
This helps in finding brain tumors and other brain-related diseases easily.

11. Automatic Language Translation:

Nowadays, if we visit a new place and do not know the language, it is not a problem at all, as
machine learning helps us by converting text into a language we know. Google's GNMT
(Google Neural Machine Translation) provides this feature. It is a neural machine translation system that
translates text into our familiar language, and this is called automatic translation.

Well posed learning problems


A computer program is said to learn from experience E with respect to some task T and some performance
measure P, if its performance on T, as measured by P, improves with experience E.
Any problem can be classified as a well-posed learning problem if it has three traits:
• Task
• Performance Measure
• Experience

Role of Well Posed Learning Problem in Machine Learning:

1. A (machine learning) problem is well-posed if a solution to it exists, if that solution is unique,
and if that solution depends on the data / experience but is not sensitive to (reasonably small)
changes in the data / experience.
2. Learning to recognize spoken words: successful speech recognition systems employ
machine learning in some form.
3. Learning to drive an autonomous vehicle: machine learning methods are used to train computer-controlled
vehicles to steer correctly when driving on a variety of roads.
4. Learning to classify new astronomical structures: machine learning methods are applied to a variety of large
databases to learn regularities present in the data.
5. Learning to play world-class chess: the program learns by playing practice games against itself.

Activity in Class: Please define the three terms E, T and P in the following problems:

1) A checkers learning problem
2) A robot driving learning problem
3) A face recognition learning problem

Important Component of Learning

Learning Models

Machine learning is concerned with using the right features to build the right models that
achieve the right tasks. For a given problem, the collection of all possible outcomes represents the
sample space or instance space. Learning models are divided into three basic categories, according to
how they treat the instance space:
• Logical models, which use a logical expression.
• Geometric models, which use the geometry of the instance space.
• Probabilistic models, which use probability to classify the instance space.
Models can also be distinguished as grouping models or grading models.

Designing a Learning System


For any learning system, we must know three elements: T (Task), P (Performance Measure),
and E (Training Experience). At a high level, the learning process looks as follows.

The learning process starts with task T, performance measure P and training experience E, and the objective
is to find an unknown target function. The target function is the exact knowledge to be learned from
the training experience, and it is unknown.

Design of a learning system

When we want to design a learning system that follows the learning process, we need to consider a few
design choices. The design choices will be to decide the following key components:
1. Type of training experience
2. Choosing the Target Function
3. Choosing a representation for the Target Function
4. Choosing an approximation algorithm for the Target Function
5. The final Design

We will look at the checkers learning problem and apply the above design choices. For
a checkers learning problem, the three elements will be:
1. Task T: To play checkers
2. Performance measure P: Percent of games won in the tournament
3. Training experience E: A set of games played against itself
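
Design choices 3 and 4 can be sketched as follows. The notes do not fix a representation or an algorithm, so this is an assumption: the target function is represented as a weighted sum of board features, and the weights are adjusted with the least mean squares (LMS) rule, as is common in textbook treatments of the checkers problem. The two features and the training value are hypothetical.

```python
def evaluate(weights, features):
    """Approximate target function: w0 + w1*x1 + ... + wn*xn for one board."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))


def lms_update(weights, features, target, learning_rate=0.1):
    """One LMS step: move each weight in proportion to the prediction error."""
    error = target - evaluate(weights, features)
    updated = [weights[0] + learning_rate * error]
    updated += [w + learning_rate * error * x
                for w, x in zip(weights[1:], features)]
    return updated


# Hypothetical board description: (share of own pieces, share of opponent pieces).
features = (0.5, 0.25)
target = 1.0              # assumed training value for this board position
weights = [0.0, 0.0, 0.0]

error_before = abs(target - evaluate(weights, features))
for _ in range(20):
    weights = lms_update(weights, features, target)
error_after = abs(target - evaluate(weights, features))
print(error_before, error_after)
```

Each update shrinks the gap between the estimated and the training value for this board, which is the sense in which the approximation algorithm "learns" the target function.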

Key Terminology

Classifier: A method that receives a new input as an unlabeled instance of an observation or feature
and identifies a category or class to which it belongs. Many commonly used classifiers employ statistical
inference (probability measure) to categorize the best label for a given instance.

Confusion matrix (aka error matrix): A matrix that summarizes the performance of a classification
algorithm. It compares the predicted classification against the actual
classification in terms of true positive, false positive, true negative and false negative counts.

Accuracy (or its complement, the error rate): The rate of correct (or incorrect) predictions made by the model over
a dataset. Accuracy is usually estimated by using an independent test set that was not used at
any time during the learning process. More complex accuracy estimation techniques, such as
cross-validation and bootstrapping, are commonly used, especially with datasets containing a small number of
instances.
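
As a small illustration of the confusion matrix and accuracy defined above, the sketch below counts the four cells for a binary classifier and derives accuracy and error rate from them. The label lists are made up, and treating the label 1 as the "positive" class is an assumption.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn}


def accuracy(counts):
    """Share of predictions that were correct."""
    return (counts["tp"] + counts["tn"]) / sum(counts.values())


actual = [1, 0, 1, 1, 0, 0, 1, 0]      # made-up true labels
predicted = [1, 0, 0, 1, 0, 1, 1, 0]   # made-up model outputs

counts = confusion_counts(actual, predicted)
acc = accuracy(counts)
error_rate = 1 - acc
print(counts, acc, error_rate)
```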
Cost: A measurement of the performance (or accuracy) of a model that predicts (or evaluates) the
outcome for an established result; in other words, a quantity that measures the deviation between
predicted and actual values (or class labels). An optimization function attempts to minimize the cost
function.

Cross-validation: A verification technique that evaluates the generalization ability of a model on an
independent dataset. It sets aside a dataset that is used during the training phase to test the
trained model for overfitting. Cross-validation can also be used to evaluate the performance of
various prediction functions. In k-fold cross-validation, the training dataset is arbitrarily partitioned
into k mutually exclusive subsamples (or folds) of equal size. The model is trained k times,
where each iteration uses one of the k subsamples for testing (cross-validating) and the
remaining k-1 subsamples for training the model. The k results of cross-validation
are averaged to produce a single estimate of the accuracy.
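
The k-fold procedure can be sketched in a few lines. This is a minimal illustration under stated assumptions: folds are taken as contiguous blocks rather than an arbitrary (shuffled) partition, the number of samples is assumed to be divisible by k, and the "model" is a hypothetical majority-class predictor chosen only to keep the example short.

```python
from collections import Counter


def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous, equal-sized folds."""
    fold_size = n_samples // k   # assumes n_samples is divisible by k
    indices = list(range(n_samples))
    for i in range(k):
        test = indices[i * fold_size:(i + 1) * fold_size]
        train = indices[:i * fold_size] + indices[(i + 1) * fold_size:]
        yield train, test


def fit_majority(xs, ys):
    """Toy learner: remember the most frequent label in the training fold."""
    return Counter(ys).most_common(1)[0][0]


def score_accuracy(model, xs, ys):
    """Share of test labels equal to the label the toy model always predicts."""
    return sum(1 for y in ys if y == model) / len(ys)


def cross_validate(xs, ys, k, fit, score):
    """Train k times, test on the held-out fold each time, average the results."""
    scores = []
    for train, test in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        scores.append(score(model, [xs[i] for i in test], [ys[i] for i in test]))
    return sum(scores) / k


xs = [0.1, 0.4, 0.3, 0.9, 0.2, 0.5]   # made-up inputs
ys = [0, 0, 0, 1, 0, 0]               # made-up labels
estimate = cross_validate(xs, ys, 3, fit_majority, score_accuracy)
print(estimate)
```

Any other pair of `fit` and `score` functions with the same signatures could be passed in, which is how cross-validation is used to compare different prediction functions.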
Data mining: The process of knowledge discovery or pattern detection in a large dataset. The methods
involved in data mining aid in extracting the accurate data and transforming it to a known structure for
further evaluation.
Dataset: A collection of data that conform to a schema with no ordering requirements. In a
typical dataset, each column represents a feature and each row represents a member of the
dataset.
Dimension: A set of attributes that defines a property. The primary functions of dimension are filtering,
classification, and grouping.
Induction algorithm: An algorithm that uses the training dataset to generate a model that
generalizes beyond the training dataset.
Instance: An object characterized by feature vectors from which the model is either trained for
generalization or used for prediction.
Knowledge discovery: The process of abstracting knowledge from structured or unstructured sources
to serve as the basis for further exploration. Such knowledge is collectively represented as a schema
and can be condensed in the form of a model or models to which queries can be made for
statistical prediction, evaluation, and further knowledge discovery.
Model: A structure that summarizes a dataset for description or prediction. Each model can be
tuned to the specific requirements of an application. Applications in big data have large datasets with
many predictors and features that are too complex for a simple parametric model to extract
useful information. The learning process synthesizes the parameters and the structures of a
model from a given dataset.

Online Analytical Processing (OLAP): An approach for resolving multidimensional analytical queries.
Such queries index into the data with two or more attributes (or dimensions). OLAP encompasses a
broad class of business intelligence data and is usually synonymous with multidimensional OLAP
(MOLAP). OLAP engines facilitate the exploration of multidimensional data interactively from
several perspectives, thereby allowing for complex analytical and ad hoc queries with a rapid
execution time.

Schema: A high-level specification of a dataset’s attributes and properties.

Feature vector: An n-dimensional numerical vector of explanatory variables representing an instance
of some object that facilitates processing and statistical analysis. Feature vectors are often weighted
to construct a predictor function that is used to evaluate the quality or fitness of the prediction. The
dimensionality of a feature vector can be reduced by various dimensionality reduction techniques, such
as principal component analysis (PCA), multi-linear subspace reduction, iso-maps, and latent
semantic analysis (LSA). The vector space associated with these vectors is often called the feature
space.

Types of Learning
In general, machine learning algorithms can be classified into three types.
• Supervised learning
• Unsupervised learning
• Reinforcement learning

Supervised learning

Supervised learning is a learning mechanism that infers the underlying relationship between the
observed data (also called input data) and a target variable (a dependent variable or label) that
is subject to prediction. The learning task uses the labeled training data (training examples) to
synthesize the model function that attempts to generalize the underlying relationship between the
feature vectors (input) and the supervisory signals (output). The feature vectors influence the
direction and magnitude of change in order to improve the overall performance of the function
model. The training data comprise observed input (feature) vectors and a desired output value
(also called the supervisory signal or class label).

High-level flow of supervised learning

Supervised learning is classified into two categories of algorithms:

• Classification: A classification problem is when the output variable is a category, such as "red" or
"blue", "disease" or "no disease".
• Regression: A regression problem is when the output variable is a real value, such as "dollars" or
"weight".
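
The difference between the two categories shows up in the type of the output. The sketch below uses a k-nearest-neighbour rule for both: the classifier returns the most common category among the neighbours, while the regressor returns the mean of their real values. All data values are made up, and the single numeric input feature is an assumption made for brevity.

```python
from collections import Counter


def k_nearest(train, query, k):
    """Return the k labeled examples whose input is closest to the query."""
    return sorted(train, key=lambda pair: abs(pair[0] - query))[:k]


def knn_classify(train, query, k=3):
    """Classification: the output is a category (majority vote of neighbours)."""
    labels = [label for _, label in k_nearest(train, query, k)]
    return Counter(labels).most_common(1)[0][0]


def knn_regress(train, query, k=3):
    """Regression: the output is a real value (mean of the neighbours' values)."""
    values = [value for _, value in k_nearest(train, query, k)]
    return sum(values) / len(values)


# Made-up labeled data: measurement -> category.
labelled = [(1.0, "no disease"), (1.5, "no disease"), (2.0, "no disease"),
            (4.0, "disease"), (4.5, "disease"), (5.0, "disease")]
# Made-up labeled data: height in cm -> weight in kg.
measured = [(150.0, 50.0), (160.0, 58.0), (170.0, 66.0), (180.0, 74.0)]

category = knn_classify(labelled, 4.2)
weight = knn_regress(measured, 165.0, k=2)
print(category, weight)
```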

Supervised learning deals with, or learns from, "labeled" data. This implies that some data is already
tagged with the correct answer.
Types:-
• Regression
• Logistic Regression
• Classification
• Naive Bayes Classifiers
• K-NN (k nearest neighbors)
• Decision Trees
• Support Vector Machine
Advantages:-
• Supervised learning allows us to collect data and produce outputs based on previous experience.
• It helps to optimize performance criteria with the help of experience.
• Supervised machine learning helps to solve various types of real-world computation problems.
Disadvantages:-
• Classifying big data can be challenging.
• Training for supervised learning needs a lot of computation time.

Unsupervised learning

Unsupervised learning algorithms are designed to discover hidden structures in unlabeled
datasets, in which the desired output is unknown. This mechanism has found many uses in the
areas of data compression, outlier detection, classification, human learning, and so on. The
general approach to learning involves training through probabilistic data models. Two popular examples
of unsupervised learning are clustering and dimensionality reduction.

Types of Unsupervised Learning Algorithm:


The unsupervised learning algorithm can be further categorized into two types of problems:

o Clustering: Clustering is a method of grouping objects into clusters such that objects with
the most similarities remain in one group and have few or no similarities with the objects of another
group. Cluster analysis finds the commonalities between the data objects and categorizes them
according to the presence and absence of those commonalities.
o Association: Association rule learning is an unsupervised learning method used for finding
relationships between variables in a large database. It determines the sets of items that
occur together in the dataset. Association rules make marketing strategy more effective. For
example, people who buy item X (say, bread) also tend to purchase item Y (butter or jam). A
typical example of association rule learning is Market Basket Analysis.
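
The bread-and-butter example can be made concrete with the two standard measures used to rate an association rule: support (how often an itemset occurs) and confidence (how often the rule holds when its left-hand side occurs). These measures are not defined in the notes, and the transactions below are made up for illustration.

```python
# Made-up market baskets; each transaction is the set of items bought together.
transactions = [
    {"bread", "butter"},
    {"bread", "jam"},
    {"bread", "butter", "milk"},
    {"milk"},
    {"bread", "butter", "jam"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in the itemset."""
    return sum(1 for basket in baskets if itemset <= basket) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Of the baskets containing the antecedent, the share that also hold the consequent."""
    return support(antecedent | consequent, baskets) / support(antecedent, baskets)


bread_support = support({"bread"}, transactions)
rule_confidence = confidence({"bread"}, {"butter"}, transactions)
print(bread_support, rule_confidence)
```

In this toy data, bread appears in four of the five baskets, and three of those four also contain butter, so the rule "bread implies butter" has a confidence of about 0.75.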

Reinforcement Learning
o Reinforcement Learning is a feedback-based machine learning technique in which an agent
learns to behave in an environment by performing actions and seeing the results of those actions.
For each good action, the agent gets positive feedback, and for each bad action, the agent gets
negative feedback or a penalty.
o In Reinforcement Learning, the agent learns automatically using feedback, without any labeled
data, unlike supervised learning.
o Since there is no labeled data, the agent is bound to learn from its experience only.
o RL solves a specific type of problem where decision making is sequential and the goal is
long-term, such as game-playing, robotics, etc.

Terms used in Reinforcement Learning


o Agent: An entity that can perceive/explore the environment and act upon it.
o Environment: The situation in which an agent is present or by which it is surrounded. In RL, we
assume a stochastic environment, which means it is random in nature.
o Action: Actions are the moves taken by an agent within the environment.
o State: The situation returned by the environment after each action taken by the agent.
o Reward: Feedback returned to the agent from the environment to evaluate the action of the
agent.
o Policy: The strategy applied by the agent to choose the next action based on the current state.
o Value: The expected long-term return, discounted by the discount factor, as opposed to the
short-term reward.
o Q-value: Mostly similar to the value, but it takes one additional parameter, the current
action (a).
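
The terms above can be tied together on a toy problem. The sketch below is an illustration under several assumptions: the environment is a hypothetical four-cell corridor, it is deterministic (a simplification of the stochastic case mentioned above), and the Q-values are computed by repeated Bellman backups over all state-action pairs (Q-value iteration with a known environment) rather than by a sample-based learner.

```python
GAMMA = 0.9                   # discount factor
N_STATES = 4                  # states 0..3; state 3 is the goal (terminal)
ACTIONS = ("left", "right")


def step(state, action):
    """Environment: return (next_state, reward) for one move in the corridor."""
    if action == "left":
        next_state = max(0, state - 1)
    else:
        next_state = min(N_STATES - 1, state + 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


# Q-value table: expected discounted return of taking action a in state s.
q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

for _ in range(50):
    for s in range(N_STATES - 1):          # the terminal state needs no update
        for a in ACTIONS:
            s2, r = step(s, a)
            future = 0.0 if s2 == N_STATES - 1 else max(q[(s2, b)] for b in ACTIONS)
            q[(s, a)] = r + GAMMA * future

# Policy: in each state, pick the action with the highest Q-value.
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)}
# Value: the best Q-value available in each state.
value = {s: max(q[(s, a)] for a in ACTIONS) for s in range(N_STATES - 1)}
print(policy, value)
```

The value of a state shrinks by the discount factor for every extra step needed to reach the reward, which is why the long-term value differs from the immediate reward.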

Advantages
Reinforcement learning can be used to solve complex problems that are difficult to solve with conventional
techniques. The learning model is very similar to the way human beings learn, which lets it keep
improving with experience.
Disadvantages
Too much reinforcement learning can lead to an overload of states, which can diminish the results.
It is also not preferable for solving simple problems. The curse of dimensionality limits reinforcement
learning for real physical systems.

ANN (Artificial Neural Network)


Proposed in the 1940s as a simplified model of the elementary computing unit in the human
cortex, artificial neural networks (ANNs) have since been an active research area. The term
"artificial neural network" refers to a biologically inspired sub-field of artificial intelligence modeled
after the brain. An artificial neural network is a computational network modeled on the biological
neural networks that make up the structure of the human brain.

Artificial Neural Network primarily consists of three layers:

Input Layer:
As the name suggests, it accepts inputs in several different formats provided by the programmer.
Hidden Layer:
The hidden layer lies between the input and output layers. It performs all the calculations to find
hidden features and patterns.
Output Layer:
The input goes through a series of transformations in the hidden layer, which finally results in the
output that is conveyed using this layer.
The artificial neural network takes the inputs, computes their weighted sum and adds a
bias. This computation is represented in the form of a transfer function.
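
A minimal sketch of this computation for a single neuron and a small three-layer network follows. The weights, biases and inputs are made up, and squashing the weighted sum with a sigmoid is an assumption; the notes do not name a particular function.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias, transfer=sigmoid):
    """Weighted sum of the inputs plus a bias, passed through a transfer function."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return transfer(z)


def layer(inputs, weight_rows, biases, transfer=sigmoid):
    """A layer is a list of neurons that all see the same inputs."""
    return [neuron(inputs, w, b, transfer) for w, b in zip(weight_rows, biases)]


x = [1.0, 2.0]                                                    # input layer
hidden = layer(x, [[0.5, -0.25], [1.0, 1.0]], [0.0, -3.0])       # hidden layer
output = neuron(hidden, [1.0, -1.0], 0.0)                         # output layer
print(hidden, output)
```

With these made-up numbers every weighted sum happens to be zero, so each neuron outputs exactly 0.5; changing any weight or bias moves the outputs away from that midpoint.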

Advantages of Artificial Neural Network (ANN)


Parallel processing capability:
Artificial neural networks have the numerical strength to perform more than one task simultaneously.
Storing data on the entire network:
Unlike traditional programming, where data is stored in a database, information is stored across the
whole network. The disappearance of a few pieces of data in one place doesn't prevent the network from working.
Capability to work with incomplete knowledge:
After training, an ANN may produce output even with incomplete data. The loss of
performance depends on the significance of the missing data.
Having a memory distribution:
For an ANN to be able to adapt, it is important to choose the examples and to train the network
towards the desired output by showing these examples to the network. The success of the
network is directly proportional to the chosen instances, and if the event cannot be shown to the network in
all its aspects, the network can produce false output.
Having fault tolerance:
Corruption of one or more cells of an ANN does not prevent it from generating output, and this feature
makes the network fault-tolerant.

Disadvantages of Artificial Neural Network


Assurance of proper network structure:
There is no particular guideline for determining the structure of artificial neural networks. The
appropriate network structure is found through experience and trial and error.
Unrecognized behavior of the network:
This is the most significant issue with ANNs. When an ANN produces a solution, it does not provide insight
into why and how it was reached. This decreases trust in the network.
Hardware dependence:
Artificial neural networks need processors with parallel processing power, as per their structure.
Their realization therefore depends on suitable hardware.
Difficulty of showing the issue to the network:
ANNs work with numerical data. Problems must be converted into numerical values before being
introduced to an ANN. The representation chosen here directly impacts the
performance of the network, and it relies on the user's abilities.
The duration of the network is unknown:
Training stops when the error is reduced to a specific value, and this value does not guarantee optimum results.

Clustering
Clustering is a way of grouping data points into different clusters, each consisting of similar data points.
Objects with possible similarities remain in one group that has few or no similarities with another group.

Types of Clustering Methods

Clustering methods are broadly divided into hard clustering (each data point belongs to only one group)
and soft clustering (a data point can also belong to another group). Various other
approaches to clustering also exist. Below are the main clustering methods used in machine learning:

1. Partitioning Clustering
2. Density-Based Clustering
3. Distribution Model-Based Clustering
4. Hierarchical Clustering
5. Fuzzy Clustering

The clustering technique can be widely used in various tasks. Some most common uses of this technique are:

o Market Segmentation
o Statistical data analysis
o Social network analysis
o Image segmentation
o Anomaly detection, etc.
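
As an illustration of partitioning (hard) clustering, the sketch below runs k-means on one-dimensional data. k-means is one common partitioning method, chosen here as an example; the data points, the starting centroids and the fixed number of iterations are all assumptions made for the illustration.

```python
def k_means_1d(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of the points assigned to it."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters


points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]   # made-up data with two obvious groups
centroids, clusters = k_means_1d(points, [1.0, 12.0])
print(centroids, clusters)
```

Each point ends up in exactly one cluster, which is what makes this a hard clustering; a soft method would instead give each point a degree of membership in every cluster.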

Decision Tree Classification Algorithm

Decision Tree is a supervised learning technique that can be used for both classification and regression
problems, but it is mostly preferred for solving classification problems. It is a tree-structured classifier,
where internal nodes represent the features of a dataset, branches represent the decision
rules and each leaf node represents the outcome. In a decision tree, there are two types of nodes:
the Decision Node and the Leaf Node. Decision nodes are used to make a decision and have multiple
branches, whereas Leaf nodes are the output of those decisions and do not contain any further
branches.

Decision Tree Terminologies


Root Node: The root node is where the decision tree starts. It represents the entire dataset, which is
then divided into two or more homogeneous sets.

Leaf Node: A leaf node is a final output node; the tree cannot be split further once a leaf node is
reached.
Splitting: Splitting is the process of dividing a decision node or the root node into sub-nodes
according to the given conditions.
Branch/Sub Tree: A subsection of the tree formed by splitting a node.
Pruning: Pruning is the process of removing unwanted branches from the tree.
Parent/Child node: A node that is divided into sub-nodes is called the parent of those sub-nodes, and
the sub-nodes are its child nodes. The root node is a parent node with no parent of its own.
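
To make the terms concrete, the sketch below splits the rows held by a root node into child nodes, one
per value of a feature, and checks whether each child is homogeneous (all rows share one label). The
helper names split and is_homogeneous and the tiny dataset are hypothetical; practical decision tree
learners also score candidate splits before choosing one, which is not shown here.

```python
from collections import defaultdict


def split(rows, feature):
    """Divide the rows at a node into sub-nodes, one per value of `feature`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[feature]].append(row)
    return dict(groups)


def is_homogeneous(rows, label="play"):
    """A sub-node is homogeneous when all of its rows share one label."""
    return len({row[label] for row in rows}) <= 1


# Hypothetical dataset held by the root (parent) node.
root = [
    {"outlook": "sunny", "play": "yes"},
    {"outlook": "rainy", "play": "no"},
    {"outlook": "sunny", "play": "yes"},
]

children = split(root, "outlook")  # child nodes keyed by feature value
for value, rows in children.items():
    print(value, len(rows), is_homogeneous(rows))
```

Here the root is not homogeneous, but both children are, so each child can become a leaf node.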

Differences between Classification and Clustering


1. Classification is used for supervised learning whereas clustering is used for unsupervised learning.
2. The process of classifying the input instances based on their corresponding class labels is known as
classification whereas grouping the instances based on their similarity without the help of class
labels is known as clustering.
3. Because classification has class labels, it needs training and testing datasets to verify the
model, whereas clustering does not need labelled training and testing datasets.
4. Classification is more complex than clustering because the classification phase has many levels,
whereas clustering only groups the instances.
5. Examples of classification algorithms are logistic regression, the Naive Bayes classifier, support
vector machines, etc. Examples of clustering algorithms are k-means, fuzzy c-means, Gaussian
mixture (EM) clustering, etc.

Bayesian Belief Networks


A Bayesian Belief Network is a graphical representation of the probabilistic relationships among a
set of random variables. It does not require all attributes to be independent of one another; instead,
each attribute is conditionally independent of its non-descendants given its parents. The joint
probability is therefore built from conditional terms of the form P(attribute | parent), i.e. the
probability of an attribute given its parent attributes.
Real-world applications are probabilistic in nature, and a Bayesian network is used to represent the
relationships between multiple events. It can be used in various tasks, including prediction, anomaly
detection, diagnostics, automated insight, reasoning, time series prediction, and decision making under
uncertainty.

A Bayesian network can be used to build models from data and expert opinion, and it consists of
two parts:

o Directed Acyclic Graph
o Table of conditional probabilities

The generalized form of a Bayesian network that represents and solves decision problems under uncertain
knowledge is known as an influence diagram.

A Bayesian network graph is made up of nodes and arcs (directed links), where:

o Each node corresponds to a random variable, which can be continuous or discrete.
o Arcs, or directed arrows, represent the causal relationships or conditional probabilities between
random variables. Each directed link connects a pair of nodes in the graph and indicates that one
node directly influences the other. If there is no directed link between two nodes, neither node
directly influences the other.
   o In the above diagram, A, B, C, and D are random variables represented by the nodes of
   the network graph.
   o Node B is connected to node A by a directed arrow, so node A is called the parent of
   node B.
   o Node C is independent of node A.

The Bayesian network has mainly two components:

o Causal Component
o Actual numbers

Each node in the Bayesian network has a conditional probability distribution P(Xi | Parents(Xi)), which
determines the effect of the parents on that node.
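
As a small worked example of these conditional distributions, the sketch below uses a two-node network
A -> B with Boolean variables. The joint probability is the product of each node's conditional
probability given its parent, P(A, B) = P(A) * P(B | A). The network and all of the numbers are made up
for illustration and are not taken from these notes.

```python
# Hypothetical two-node network A -> B with Boolean variables.
# Each entry stores P(node = True | parent value); the numbers are made up.
p_a = 0.3                              # P(A = True); A has no parent
p_b_given_a = {True: 0.9, False: 0.2}  # P(B = True | A)


def joint(a, b):
    """P(A = a, B = b) = P(A = a) * P(B = b | A = a)."""
    pa = p_a if a else 1 - p_a
    pb = p_b_given_a[a] if b else 1 - p_b_given_a[a]
    return pa * pb


# Marginal P(B = True): sum the joint over both values of A.
p_b = joint(True, True) + joint(False, True)
print(round(p_b, 2))  # 0.41
```

The conditional probability table for B has one row per value of its parent A, which is the "actual
numbers" part of the network, while the arrow from A to B is the structural part.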

Support Vector Machine Algorithm

Support Vector Machine (SVM) is one of the most popular supervised learning algorithms and is used for
classification as well as regression problems, although in machine learning it is used primarily for
classification. The goal of the SVM algorithm is to find the best line or decision boundary that
separates n-dimensional space into classes, so that a new data point can be placed in the correct
category. This best decision boundary is called a hyperplane. SVM chooses the extreme points/vectors
that help in creating the hyperplane. These extreme cases are called support vectors, which is why the
algorithm is called a Support Vector Machine. Consider the diagram below, in which two different
categories are separated by a decision boundary or hyperplane:

The following are important concepts in SVM:


• Support Vectors: The data points that are closest to the hyperplane are called support vectors.
The separating line is defined with the help of these data points.
• Hyperplane: As shown in the above diagram, it is the decision plane or space that separates sets
of objects belonging to different classes.
• Margin: The gap between the two lines through the closest data points of different classes. It can
be calculated as the perpendicular distance from the line to the support vectors. A large margin is
considered a good margin and a small margin a bad one.
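
The perpendicular distance mentioned under Margin can be computed directly. For a hyperplane written
as w.x + b = 0, the distance from a point x is |w.x + b| / ||w||. The sketch below applies this
formula to a hypothetical two-dimensional boundary; the weights, bias, and the point treated as a
support vector are assumed values, not the result of training an SVM.

```python
import math


def distance_to_hyperplane(w, b, x):
    """Perpendicular distance from point x to the hyperplane w.x + b = 0."""
    dot = sum(wi * xi for wi, xi in zip(w, x))
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(dot + b) / norm


# Hypothetical 2-D decision boundary: 3*x1 + 4*x2 - 12 = 0
w, b = [3.0, 4.0], -12.0
support_vector = [4.0, 5.0]  # assumed to be the closest point of one class

print(distance_to_hyperplane(w, b, support_vector))  # 4.0
```

The sign of w.x + b tells which side of the hyperplane a point lies on, and so which class it is
assigned to; the absolute value divided by ||w|| gives how far it is from the boundary.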

Advantages of SVM:
• Effective in high-dimensional cases
• It is memory efficient because the decision function uses only a subset of the training points,
called support vectors
• Different kernel functions can be specified for the decision function, and it is possible to specify
custom kernels

Axis Institute of Technology & Management, Kanpur
Department of Computer Science & Engineering
Form No. Acad-006A (w.e.f. January 2020)

Session: 2022-23    Semester: V    Section:
Course Code: KCS055    Course Name: Machine Learning Techniques

Assignment 1

Question No. | Course Outcome, Bloom's Level | Question
1 | CO1, Remember | What is a "Well-posed Learning Problem"? Explain with an example.
2 | CO1, Remember | Describe the final design of the checkers learning problem.
3 | CO1, Remember | Explain the "Concept Learning" task with an example.
4 | CO1, Remember | Discuss the historical progress of Machine Learning. What is the concept of clustering in ML?
5 | CO1, Remember | Find the maximally general hypothesis and the maximally specific hypothesis for the training examples given in the table below using the candidate elimination algorithm.

Training examples for Question 5:

Sky   | Temp | Humidity | Wind   | Water | Forecast | Sport
Sunny | Warm | Normal   | Strong | Warm  | Same     | Yes
Sunny | Warm | High     | Strong | Warm  | Same     | Yes
Rainy | Cold | High     | Strong | Warm  | Change   | No
Sunny | Warm | High     | Strong | Cool  | Change   | Yes
