
BASICS OF MACHINE LEARNING

What is Machine Learning?


Machine learning is a branch of artificial intelligence (AI) focused on building applications
that learn from data and improve their accuracy over time without being programmed to do
so.
In machine learning, algorithms are 'trained' to find patterns and features in massive
amounts of data to make decisions and predictions based on new data. The better the
algorithm, the more accurate the decisions and predictions will become as it processes
more data.
Today, examples of machine learning are all around us. Digital assistants search the web and
play music in response to our voice commands. Websites recommend products and movies
and songs based on what we bought, watched, or listened to before. Robots vacuum our
floors while we do . . . something better with our time.

Machine Learning – Traditional AI


The journey of AI began in the 1950s, when computing power was a fraction of what it is today. Early AI made predictions from data in much the same way a statistician makes predictions with a calculator. Thus, early AI development was based mainly on statistical techniques.

Statistical Techniques
The development of today's AI applications started with the age-old traditional statistical techniques. You may have used straight-line interpolation in school to predict a future value. There are several other such statistical techniques that have been successfully applied in developing so-called AI programs. We say "so-called" because the AI programs that we have today are much more complex and use techniques far beyond the statistical techniques used by the early AI programs.
Some examples of statistical techniques that were used for developing AI applications in those days, and are still in practice, are listed here:

 Classification: Classification is a part of supervised learning (learning with labelled data) through which data inputs can be easily separated into categories. In machine learning, there can be binary classifiers with only two outcomes (e.g., spam, non-spam) or multi-class classifiers (e.g., types of books, animal species, etc.).
 Decision trees: Decision trees use classified data to make recommendations based
on a set of decision rules. For example, a decision tree that recommends betting on
a particular horse to win, place, or show could use data about the horse (e.g., age,
winning percentage, pedigree) and apply rules to those factors to recommend an
action or decision.
 Clustering: Think of clusters as groups. Clustering focuses on identifying groups of
similar records and labelling the records according to the group to which they
belong. This is done without prior knowledge about the groups and their
characteristics.
 Regressions: Regressions model relationships and correlations between different types of data. Regression is a type of supervised machine learning algorithm where we can label the inputs and outputs. Linear regression produces outputs that are continuous variables (any value within a range), such as pricing data, while logistic regression is used when the dependent variable is categorical and the labelled variables are precisely defined.
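For instance, the straight-line prediction mentioned above can be done in a few lines of NumPy. This is only a rough sketch; the yearly sales figures are made-up illustration data:

```python
# Straight-line regression sketch: fit sales = a * year + b and extrapolate.
import numpy as np

years = np.array([2018, 2019, 2020, 2021, 2022])
sales = np.array([110.0, 125.0, 138.0, 152.0, 167.0])   # made-up yearly sales

a, b = np.polyfit(years, sales, deg=1)    # least-squares straight-line fit

predicted_2023 = a * 2023 + b             # predict a future value
print(f"Predicted 2023 sales: {predicted_2023:.1f}")
```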

Here we have listed only some primary techniques that are enough to get you started on AI without overwhelming you with the vastness of the field. If you are developing AI applications based on limited data, you would be using these statistical techniques.
However, today data is abundant. To analyse the huge volumes of data that we now possess, statistical techniques are of limited help, as they have limitations of their own. More advanced methods, such as deep learning, have therefore been developed to solve many complex problems.

Types of Machine Learning

– Supervised: Supervised learning is typically the task of learning a function that maps an input to an output based on sample input-output pairs. It uses labelled training data and a collection of training examples to infer a function. Supervised learning is carried out when certain goals are to be accomplished from a certain set of inputs, i.e., it is a task-driven approach. The most common supervised tasks are "classification", which separates the data, and "regression", which fits the data. For instance, predicting the class label or sentiment of a piece of text, like a tweet or a product review, i.e., text classification, is an example of supervised learning.

– Unsupervised: Unsupervised learning analyses unlabelled datasets without the need for human intervention, i.e., it is a data-driven process. It is widely used for extracting generative features, identifying meaningful trends and structures, grouping results, and for exploratory purposes. The most common unsupervised learning tasks are clustering, density estimation, feature learning, dimensionality reduction, finding association rules, anomaly detection, etc.

– Semi-supervised: Semi-supervised learning can be defined as a hybridization of the above-mentioned supervised and unsupervised methods, as it operates on both labelled and unlabelled data. Thus, it falls between learning "without supervision" and learning "with supervision". In the real world, labelled data can be rare in several contexts while unlabelled data are plentiful, which is where semi-supervised learning is useful. The goal of a semi-supervised learning model is to produce better predictions than a model trained on the labelled data alone. Some application areas where semi-supervised learning is used include machine translation, fraud detection, data labelling and text classification.
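As a rough illustration, and under the assumption that scikit-learn is available, the sketch below wraps a logistic-regression model in scikit-learn's self-training classifier: it is first fitted on a handful of labelled points and then pseudo-labels the unlabelled ones (marked with -1). All data values are made up:

```python
# Semi-supervised sketch: a few labelled points plus unlabelled points (label -1).
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

X = [[1, 1], [1, 2], [8, 8], [9, 8],    # labelled examples
     [2, 1], [2, 2], [8, 9], [9, 9]]    # unlabelled examples
y = [0, 0, 1, 1, -1, -1, -1, -1]        # -1 marks "no label available"

model = SelfTrainingClassifier(LogisticRegression()).fit(X, y)
print(model.predict([[1.5, 1.5], [8.5, 8.5]]))   # expected: [0 1]
```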

– Reinforcement: Reinforcement learning is a type of machine learning algorithm that enables software agents and machines to automatically evaluate the optimal behaviour in a particular context or environment in order to improve their efficiency, i.e., it is an environment-driven approach. This type of learning is based on reward or penalty, and its goal is to use insights obtained from interacting with the environment to take actions that increase the reward or minimize the risk. It is a powerful tool for training AI models that can help increase automation or optimize the operational efficiency of sophisticated systems such as robotics, autonomous driving, manufacturing, and supply chain logistics; however, it is not preferable for solving basic or straightforward problems.

– Deep learning: Deep learning is a subset of machine learning (all deep learning is machine learning, but not all machine learning is deep learning). Deep learning algorithms define an artificial neural network that is designed to learn the way the human brain learns. Deep learning models require large amounts of data that pass through multiple layers of calculations, applying weights and biases in each successive layer to continually adjust and improve the outcomes.

– Deep reinforcement learning: Deep reinforcement learning is a category of machine learning and artificial intelligence in which intelligent machines can learn from their actions, similar to the way humans learn from experience. Inherent in this type of machine learning is that an agent is rewarded or penalised based on its actions. Actions that get it to the target outcome are rewarded (reinforced). The "deep" portion refers to the multiple (deep) layers of artificial neural networks that replicate the structure of the human brain.

Supervised Learning

Supervised learning is the type of machine learning in which machines are trained using well "labelled" training data, and on the basis of that data, machines predict the output. The supervised learning model has a set of input variables (x) and an output variable (y). An algorithm identifies the mapping function between the input and output variables. The relationship is y = f(x).

The learning is monitored or supervised in the sense that we already know the output, and the algorithm is corrected each time to optimise its results. The algorithm is trained over the data set and amended until it achieves an acceptable level of performance.
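As a rough sketch of this fit-then-predict loop (assuming scikit-learn is available; the numbers are made up and roughly follow y = 2x), the model below infers the mapping f from known (x, y) pairs and then predicts y for an unseen input:

```python
# Supervised learning sketch: learn the mapping y = f(x) from labelled pairs.
from sklearn.linear_model import LinearRegression

X = [[1], [2], [3], [4], [5]]        # input variable x
y = [1.9, 4.1, 6.2, 7.9, 10.1]       # output variable y (roughly y = 2x)

model = LinearRegression().fit(X, y)  # the algorithm infers f from the examples
print(model.predict([[6]]))           # predict the output for a new, unseen input
```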

We can group the supervised learning problems as:

1. Regression problems – Used to predict future values; the model is trained with historical data. E.g., predicting the future price of a product.
2. Classification problems – Various labels train the algorithm to identify items within a specific category. E.g., disease or no disease, apple or orange, beer or wine.

Algorithms for supervised learning


There are several algorithms available for supervised learning. Some of the widely used supervised learning algorithms are listed below:

k-Nearest Neighbours
Decision Trees
Naive Bayes
Logistic Regression
Support Vector Machines

k-Nearest Neighbours
k-Nearest Neighbours (kNN) is a supervised machine learning algorithm that classifies a new data point into the target class based on the features of its neighbouring data points. kNN is a statistical technique that can be used for solving classification and regression problems. Let us discuss the case of classifying an unknown object using kNN. Consider the distribution of objects as shown in the image given below.

The diagram shows three types of objects, marked in red, blue, and green colours. When
you run the kNN classifier on the above dataset, the boundaries for each type of object
will be marked as shown below:
Now, consider a new unknown object that you want to classify as red, green, or blue. This is
depicted in the figure below.

As you can see visually, the unknown data point belongs to the class of blue objects. Mathematically, this can be concluded by measuring the distance of this unknown point from every other point in the data set. When you do so, you will find that most of its neighbours are blue; the average distance to the red and green objects is greater than the average distance to the blue objects. Thus, this unknown object can be classified as belonging to the blue class.
The kNN algorithm can also be used for regression problems, and it is available ready to use in most ML libraries.
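Here is a minimal sketch of the red/green/blue scenario described above, assuming scikit-learn is available; the 2-D points and labels are made up for illustration:

```python
# kNN classification sketch: classify an unknown point by a majority vote
# among its k nearest neighbours.
from sklearn.neighbors import KNeighborsClassifier

X = [[1, 1], [1, 2], [2, 1],      # "red" region
     [8, 8], [8, 9], [9, 8],      # "green" region
     [1, 8], [2, 9], [1, 9]]      # "blue" region
y = ["red", "red", "red",
     "green", "green", "green",
     "blue", "blue", "blue"]

knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
print(knn.predict([[2, 8]]))      # expected: ['blue']
```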
Decision Trees

 Decision Tree is a Supervised learning technique that can be used for both
classification and Regression problems, but mostly it is preferred for solving
Classification problems.
 It is a tree-structured classifier, where internal nodes represent the features of a
dataset, branches represent the decision rules, and each leaf node represents the
outcome.
 In a decision tree, there are two kinds of nodes: decision nodes and leaf nodes. Decision nodes are used to make a decision and have multiple branches, whereas leaf nodes are the outputs of those decisions and do not contain any further branches.
 It is a graphical representation for getting all the possible solutions to a problem/decision based on given conditions.
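As a rough illustration (assuming scikit-learn; the horse-racing style features age and winning percentage, and the win/place/show labels, are invented for this sketch), the code below fits a small tree and prints the decision rules it learned:

```python
# Decision tree sketch: learn if-then rules from labelled examples.
from sklearn.tree import DecisionTreeClassifier, export_text

X = [[3, 0.70], [4, 0.65], [5, 0.50],        # [age, winning percentage]
     [7, 0.20], [8, 0.15], [9, 0.10]]
y = ["win", "win", "place", "show", "show", "show"]

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
print(export_text(tree, feature_names=["age", "win_pct"]))  # the learned rules
print(tree.predict([[4, 0.60]]))                            # expected: ['win']
```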

Naïve Bayes
Naive Bayes is used for creating classifiers. Suppose you want to sort out (classify) fruits of different kinds from a fruit basket. You may use features such as the colour, size and shape of a fruit; for example, any fruit that is red in colour, is round and is about 10 cm in diameter may be considered an apple. To train the model, you would use these features and test the probability that a given feature matches the desired constraints. The probabilities of the different features are then combined to arrive at the probability that a given fruit is an apple. Naive Bayes generally requires only a small amount of training data for classification.
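A minimal sketch of this fruit example, assuming scikit-learn; the feature values (redness, roundness, diameter in cm) and the two fruit classes are made up for illustration:

```python
# Naive Bayes sketch: combine per-feature probabilities into a class prediction.
from sklearn.naive_bayes import GaussianNB

X = [[0.90, 0.90, 9.5], [0.80, 0.95, 10.2], [0.85, 0.90, 9.8],    # apples
     [0.20, 0.30, 20.0], [0.10, 0.40, 22.0], [0.15, 0.35, 18.5]]  # watermelons
y = ["apple", "apple", "apple", "watermelon", "watermelon", "watermelon"]

nb = GaussianNB().fit(X, y)
print(nb.predict([[0.88, 0.92, 10.0]]))        # likely 'apple'
print(nb.predict_proba([[0.88, 0.92, 10.0]]))  # combined class probabilities
```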
Logistic Regression
 It is used for predicting a categorical dependent variable from a given set of independent variables.
 Logistic regression predicts the output of a categorical dependent variable. Therefore, the outcome must be a categorical or discrete value. It can be either yes or no, 0 or 1, true or false, etc., but instead of giving an exact value of 0 or 1, it gives probabilistic values which lie between 0 and 1.
 Logistic regression typically uses the logistic (sigmoid) function to estimate these probabilities.
 The assumption of a linear relationship between the independent variables and the log-odds of the outcome is considered a major drawback of logistic regression. Despite its name, logistic regression is used for classification rather than for regression problems.

Look at the following diagram. It shows the distribution of data points in the XY plane.

From the diagram, we can visually inspect the separation of the red dots from the green dots. You may draw a boundary line to separate out these dots. Now, to classify a new data point, you just need to determine on which side of the line the point lies.
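A minimal sketch of fitting such a boundary with logistic regression, assuming scikit-learn; the red and green points are made up for illustration:

```python
# Logistic regression sketch: learn a linear boundary between two classes
# and report probabilities between 0 and 1 for new points.
from sklearn.linear_model import LogisticRegression

X = [[1, 1], [1, 2], [2, 1], [2, 2],      # "red" points
     [6, 6], [6, 7], [7, 6], [7, 7]]      # "green" points
y = ["red", "red", "red", "red", "green", "green", "green", "green"]

clf = LogisticRegression().fit(X, y)
print(clf.predict([[2, 3]]))        # which side of the boundary? expected: ['red']
print(clf.predict_proba([[4, 4]]))  # probabilities between 0 and 1
```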
Support Vector Machine Algorithm
Consider the below diagram, in which there are two different categories that are classified using a decision boundary or hyperplane.

 The goal of the SVM algorithm is to create the best line or decision boundary that
can segregate n-dimensional space into classes so that we can easily put the new
data point in the correct category in the future. This best decision boundary is called
a hyperplane.
 SVM chooses the extreme points/vectors that help in creating the hyperplane. These extreme cases are called support vectors, and hence the algorithm is termed a Support Vector Machine.
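A minimal sketch with a linear SVM, assuming scikit-learn; the two made-up categories below are well separated, so the support vectors are easy to see:

```python
# SVM sketch: find the hyperplane (here, a line) that best separates two classes.
from sklearn.svm import SVC

X = [[1, 2], [2, 1], [2, 3], [3, 2],      # category A
     [7, 8], [8, 7], [8, 9], [9, 8]]      # category B
y = ["A", "A", "A", "A", "B", "B", "B", "B"]

svm = SVC(kernel="linear").fit(X, y)
print(svm.support_vectors_)     # the extreme points that define the hyperplane
print(svm.predict([[3, 3]]))    # expected: ['A']
```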

Unsupervised Learning

This approach is the one where the output is unknown, and we have only the input variables at hand. The algorithm learns by itself and discovers interesting structure in the data. The goal is to decipher the underlying distribution of the data to gain more knowledge about it.

o Clustering: Clustering is a method of grouping objects into clusters such that objects with the most similarities remain in one group and have few or no similarities with the objects of another group. Cluster analysis finds the commonalities between the data objects and categorizes them as per the presence and absence of those commonalities.
o Association: An association rule is an unsupervised learning method which is used for finding relationships between variables in a large database. It determines the sets of items that occur together in the dataset. Association rules make marketing strategies more effective; for example, people who buy item X (say, bread) also tend to purchase item Y (butter or jam).

Algorithms for Unsupervised Learning


K-Means Algorithm

K-Means Clustering is an unsupervised learning algorithm which groups an unlabelled dataset into different clusters. Here K defines the number of pre-defined clusters that need to be created in the process: if K=2, there will be two clusters, if K=3 there will be three clusters, and so on.
It allows us to cluster the data into different groups and is a convenient way to discover the categories of groups in an unlabelled dataset on its own, without the need for any training.

The algorithm takes the unlabelled dataset as input, divides it into K clusters, and repeats the process until it finds the best clusters. The value of K should be predetermined in this algorithm.
The k-means clustering algorithm mainly performs two tasks:

o Determines the best value for K center points or centroids by an iterative process.
o Assigns each data point to its closest k-center. The data points which are near to a particular k-center form a cluster.

Hence each cluster contains data points with some commonalities and is distant from the other clusters.

The below diagram explains the working of the K-means Clustering Algorithm:
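In code, a minimal sketch of the same idea, assuming scikit-learn; the six 2-D points are made up and fall into two obvious groups:

```python
# k-Means sketch: group unlabelled points into K clusters around K centroids.
from sklearn.cluster import KMeans

X = [[1, 2], [1, 4], [2, 3],       # one natural group
     [9, 8], [8, 9], [9, 9]]       # another natural group

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)              # cluster assignment of each point
print(kmeans.cluster_centers_)     # the K centroids found iteratively
print(kmeans.predict([[0, 0], [10, 10]]))  # assign new points to the nearest centroid
```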

Cluster Identification
Cluster identification tells an algorithm, "Here's some data. Now group similar things together and tell me about those groups." The key difference from classification is that in classification you know what you are looking for, while that is not the case in clustering.
Clustering is sometimes called unsupervised classification because it produces the same kind of result as classification does, but without having predefined classes.
Supervised Learning vs. Unsupervised Learning

o Supervised learning algorithms are trained using labelled data, whereas unsupervised learning algorithms are trained using unlabelled data.
o In supervised learning, input data is provided to the model along with the output; in unsupervised learning, only input data is provided to the model.
o The goal of supervised learning is to train the model so that it can predict the output when it is given new data; the goal of unsupervised learning is to find the hidden patterns and useful insights in the unknown dataset.
o Supervised learning can be categorized into classification and regression problems; unsupervised learning can be classified into clustering and association problems.
o Supervised learning can be used for those cases where we know the inputs as well as the corresponding outputs; unsupervised learning can be used for those cases where we have only input data and no corresponding output data.
o A supervised learning model produces an accurate result; an unsupervised learning model may give a less accurate result compared to supervised learning.
o Supervised learning includes algorithms such as Linear Regression, Logistic Regression, Support Vector Machines, multi-class classification, Decision Trees, Naive Bayes, etc.; unsupervised learning includes algorithms such as k-Means clustering and the Apriori algorithm.
Artificial Neural Networks
The idea of artificial neural networks was derived from the neural networks in the human brain. The human brain is complex. By carefully studying the brain, scientists and engineers came up with an architecture that could fit into our digital world of binary computers. One such typical architecture is shown in the diagram below.

There is an input layer, which has many sensors to collect data from the outside world. On the right-hand side, we have an output layer that gives us the result predicted by the network. In between these two there are several hidden layers. Each additional layer adds further complexity to training the network but provides better results in most situations.
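To make the input / hidden / output structure concrete, here is a minimal sketch using the Keras API (this assumes TensorFlow is installed; the layer sizes, the random training data and the binary target are illustrative choices):

```python
# Feedforward network sketch: input layer, two hidden layers, one output layer.
import numpy as np
from tensorflow import keras

model = keras.Sequential([
    keras.Input(shape=(4,)),                      # input layer: 4 features
    keras.layers.Dense(8, activation="relu"),     # hidden layer
    keras.layers.Dense(8, activation="relu"),     # another hidden layer
    keras.layers.Dense(1, activation="sigmoid"),  # output layer: one prediction
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Train briefly on a small random dataset just to show the mechanics; the
# weights and biases in each layer are adjusted during fit().
X = np.random.rand(200, 4)
y = (X.sum(axis=1) > 2).astype(int)
model.fit(X, y, epochs=5, verbose=0)
print(model.predict(X[:3], verbose=0))
```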
Several types of architectures have been designed over the years, which we will discuss now.
ANN Architecture
The diagram below shows several ANN architectures developed over the years that are in practice today.

Each architecture is developed for a specific type of application. Thus, when you use a neural network for your machine learning application, you will have to use either one of the existing architectures or design your own. The architecture that you finally decide upon depends on your application's needs. There is no single guideline that tells you to use a specific network architecture.

Deep Learning
Deep learning is a subset of machine learning. Deep learning algorithms define an
artificial neural network that is designed to learn the way the human brain learns.
Deep learning models require large amounts of data that pass through multiple
layers of calculations, applying weights and biases in each successive layer to
continually adjust and improve the outcomes.

Deep learning models can be supervised, semi-supervised or unsupervised, and reinforcement learning models can also be deep learning models. Certain types of deep learning models, including convolutional neural networks (CNNs) and recurrent neural networks (RNNs), are driving progress in areas such as computer vision, natural language processing (including speech recognition), and self-driving cars.

Artificial Neural Network and Deep Learning


 Deep learning is part of a wider family of artificial neural networks (ANN)-based
machine learning approaches with representation learning. Deep learning provides a
computational architecture by combining several processing layers, such as input,
hidden, and output layers, to learn from data.

 Most deep learning methods use neural network architectures, which is why deep
learning models are often referred to as deep neural networks.

 The term “deep” usually refers to the number of hidden layers in the neural
network. Traditional neural networks only contain 2-3 hidden layers, while deep
networks can have as many as 150.

 Deep learning models are trained by using large sets of labelled data and neural
network architectures that learn features directly from the data without the need for
manual feature extraction.
Disadvantages

Some of the important points that you need to consider before using deep learning
are listed below:
Black Box approach
Duration of Development
Amount of Data
Computationally Expensive

Black Box approach


An ANN is like a black box: you give it a certain input and it provides you a specific output. The following diagram shows one such application, where you feed an animal image to a neural network and it tells you that the image is of a dog.

This is called a black-box approach because you do not know why the network came up with a certain result; you do not know how the network concluded that it is a dog. Now consider a banking application where the bank wants to decide the creditworthiness of a client. The network will provide an answer to this question, but will you be able to justify it to the client? Banks need to explain to their customers why a loan was not sanctioned.
Duration of development
The process of training a neural network is depicted in the diagram below-

You first define the problem that you want to solve, create a specification for it, decide on the input features, design a network, deploy it, and test the output. If the output is not as expected, take this as feedback and restructure your network. This is an iterative process and may require several iterations until the network is fully trained to produce the desired outputs.

Amount of data
Deep learning networks usually require a huge amount of data for training, while traditional machine learning algorithms can be used with great success with just a few thousand data points. Fortunately, data abundance is growing at 40% per year and CPU processing power is growing at 20% per year, as seen in the diagram given below:

Computationally Expensive
Training a neural network requires several times more computational power than running traditional algorithms; successful training of deep neural networks may require several weeks of training time.
In contrast, traditional machine learning algorithms take only a few minutes or hours to train. The amount of computational power needed to train a deep neural network also depends heavily on the size of your data and on how deep and complex the network is.
Having had an overview of what machine learning is, along with its capabilities, limitations, and applications, let us now dive deeper into learning machine learning.

Reinforcement Learning
Reinforcement learning aims at using observations gathered from interaction with the environment to take actions that would maximize the reward or minimize the risk. The reinforcement learning algorithm (called the agent) continuously learns from the environment in an iterative fashion. In the process, the agent learns from its experiences of the environment until it has explored the full range of possible states.

Reinforcement Learning is a type of Machine Learning, and thereby also a branch of Artificial Intelligence. It allows machines and software agents to automatically determine the ideal behaviour within a specific context in order to maximize their performance. Simple reward feedback is required for the agent to learn its behaviour; this is known as the reinforcement signal. A minimal sketch of this loop follows the two types listed below.
Types of Reinforcement: There are two types of reinforcement:

1. Positive – Positive reinforcement occurs when an event, occurring because of a particular behaviour, increases the strength and frequency of that behaviour. In other words, it has a positive effect on the behaviour.

2. Negative – Negative reinforcement is defined as the strengthening of a behaviour because a negative condition is stopped or avoided.
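As a rough illustration of this reward-driven loop, here is a tabular Q-learning sketch on a made-up five-cell corridor where the agent is positively reinforced for reaching the goal cell; the environment and every number in it are illustrative assumptions:

```python
# Tabular Q-learning sketch: the agent is rewarded for reaching the goal and
# gradually learns to prefer the actions that lead to it.
import random

N_STATES, GOAL = 5, 4
ACTIONS = [-1, +1]                       # move left or move right
Q = [[0.0, 0.0] for _ in range(N_STATES)]
alpha, gamma, epsilon = 0.5, 0.9, 0.1    # learning rate, discount, exploration

def choose_action(state):
    if random.random() < epsilon:        # explore occasionally
        return random.randrange(len(ACTIONS))
    best = max(Q[state])                 # otherwise act greedily (random tie-break)
    return random.choice([a for a, q in enumerate(Q[state]) if q == best])

for episode in range(200):
    state = 0
    while state != GOAL:
        a = choose_action(state)
        next_state = min(max(state + ACTIONS[a], 0), N_STATES - 1)
        reward = 1.0 if next_state == GOAL else 0.0          # reinforcement signal
        # Move Q(s, a) towards the reward plus the discounted best future value.
        Q[state][a] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][a])
        state = next_state

print([[round(q, 2) for q in row] for row in Q])  # "move right" ends up preferred
```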

Applications of Machine Learning

Facial recognition/Image recognition

One of the most common applications of machine learning is facial recognition; the simplest example of this application is the iPhone X. There are a lot of use cases for facial recognition, mostly for security purposes, such as identifying criminals, searching for missing individuals, and aiding forensic investigations. Intelligent marketing, diagnosing diseases, and tracking attendance in schools are some other uses.

Automatic Speech Recognition

Abbreviated as ASR, automatic speech recognition is used to convert speech into digital text. Its applications include authenticating users based on their voice and performing tasks based on human voice inputs. Speech patterns and vocabulary are fed into the system to train the model. Presently, ASR systems find a wide variety of applications in the following domains:

 Medical Assistance
 Industrial Robotics
 Forensic and Law enforcement
 Defence & Aviation
 Telecommunications Industry
 Home Automation and Security Access Control
 I.T. and Consumer Electronics
Financial Services

Machine learning has many use cases in financial services. Machine learning algorithms prove to be excellent at detecting fraud by monitoring the activities of each user and assessing whether an attempted activity is typical of that user or not.
Financial monitoring to detect money laundering activities is also a critical security use case of machine learning.

Machine Learning also helps in making better trading decisions with the help of algorithms
that can analyse thousands of data sources simultaneously. Credit scoring and underwriting
are some of the other applications.
The most common applications in our day-to-day activities are virtual personal assistants such as Siri and Alexa.

Marketing and Sales

Machine Learning is improving lead scoring algorithms by including various parameters such
as website visits, emails opened, downloads, and clicks to score each lead. It also helps
businesses to improve their dynamic pricing models by using regression techniques to make
predictions.

Sentiment analysis is another essential application, used to gauge consumer response to a specific product or a marketing initiative. Machine learning for computer vision helps brands identify their products in images and videos online; brands also use computer vision to measure mentions of their products in images that lack any accompanying text. Chatbots are also becoming more responsive and intelligent with the help of machine learning.

Healthcare

A vital application of Machine Learning is in the diagnosis of diseases and ailments, which
are otherwise difficult to diagnose. Radiotherapy is also becoming better with Machine
Learning taking over.
Early-stage drug discovery is another crucial application which involves technologies such as
precision medicine and next-generation sequencing. Clinical trials cost a lot of time and
money to complete and deliver results. Applying Machine Learning based predictive
analytics could improve on these factors and give better results.

Machine Learning technologies are also critical to make outbreak predictions. Scientists
around the world are using these technologies to predict epidemic outbreaks.
