Statistical Techniques
Today’s AI applications grew out of traditional statistical techniques. You have probably
used straight-line interpolation in school to predict a future value. Several other
statistical techniques of this kind are applied successfully in developing so-called AI
programs. We say “so-called” because the AI programs that we have today are much more
complex and use techniques far beyond the statistical techniques used by the early AI
programs.
Some of the statistical techniques that were used to develop AI applications in those days
are still in practice today. Only the primary techniques are covered here, enough to get you
started on AI without overwhelming you with the vastness of the field. If you are developing
AI applications based on limited data, you would be using these statistical techniques.
However, data is abundant today. Statistical techniques are of limited help in analyzing
data at this scale, as they have limitations of their own. More advanced methods such as
deep learning have therefore been developed to solve many complex problems.
– Supervised: Supervised learning is the machine learning task of learning a function that
maps an input to an output based on sample input-output pairs. It uses labelled training
data and a collection of training examples to infer a function. Supervised learning is
carried out when certain goals are to be accomplished from a certain set of inputs, i.e., a
task-driven approach. The most common supervised tasks are “classification”, which
separates the data, and “regression”, which fits the data. For instance, predicting the
class label or sentiment of a piece of text, like a tweet or a product review, i.e., text
classification, is an example of supervised learning.
– Unsupervised: Unsupervised learning analyses unlabelled datasets without the need for
human intervention, i.e., a data-driven process. It is widely used for extracting generative
features, identifying meaningful trends and structures, finding groupings in results, and
exploratory purposes. The most common unsupervised learning tasks are clustering, density
estimation, feature learning, dimensionality reduction, finding association rules, anomaly
detection, etc.
– Deep learning: Deep learning is a subset of machine learning (all deep learning is machine
learning, but not all machine learning is deep learning). Deep learning algorithms define an
artificial neural network that is designed to learn the way the human brain learns. Deep
learning models require large amounts of data that pass through multiple layers of
calculations. Weights and biases are applied in each successive layer and are continually
adjusted during training to improve the outcomes.
Supervised Learning
Supervised learning is a type of machine learning in which machines are trained using
well "labelled" training data and, on the basis of that data, predict the output. The
supervised learning model has a set of input variables (x) and an output variable (y). An
algorithm identifies the mapping function between the input and output variables. The
relationship is y = f(x).
The learning is monitored or supervised in the sense that we already know the output, and
the algorithm is corrected each time to optimise its results. The algorithm is trained over
the data set and amended until it achieves an acceptable level of performance.
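As a minimal sketch of learning a mapping y = f(x) from labelled pairs, the snippet below fits a straight line by ordinary least squares. The data values and the function name fit_line are illustrative assumptions, not part of the original notes, and a straight line is only one possible choice for f.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept to labelled pairs by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical labelled training data: inputs x with known outputs y.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)


def f(x):
    """The learned mapping from input to output."""
    return slope * x + intercept


prediction = f(5.0)  # prediction for an input that was not in the training data
```

Because the known outputs are available, the fit can be checked against them; this is the "supervision" that the text describes.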
k-Nearest Neighbours
Decision Trees
Naive Bayes
Logistic Regression
Support Vector Machines
k-Nearest Neighbours
k-Nearest Neighbours is a supervised machine learning algorithm that classifies a new
data point into a target class based on the features of its neighbouring data points.
k-Nearest Neighbours, simply called kNN, is a statistical technique that can be used for
solving classification and regression problems. Let us discuss the case of classifying an
unknown object using kNN. Consider the distribution of objects as shown in the image given
below:
The diagram shows three types of objects, marked in red, blue, and green colours. When
you run the kNN classifier on the above dataset, the boundaries for each type of object
will be marked as shown below:
Now, consider a new unknown object that you want to classify as red, green, or blue. This is
depicted in the figure below.
Visually, you can see that the unknown data point belongs to the class of blue objects.
Mathematically, this can be concluded by measuring the distance from the unknown point to
every other point in the data set and looking at the k closest ones. When you do so, you
will find that most of its nearest neighbours are blue. The average distance to red and
green objects would also be greater than the average distance to blue objects. Thus, the
unknown object can be classified as belonging to the blue class.
The kNN algorithm can also be used for regression problems, where the prediction is typically
the average of the neighbours' values. Ready-to-use kNN implementations are available in most
ML libraries.
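The distance-and-vote procedure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the coordinates are made up, Euclidean distance is used, k is set to 3, and the function name knn_classify is hypothetical.

```python
import math
from collections import Counter


def knn_classify(training_points, query, k=3):
    """Return the majority label among the k training points closest to query."""
    by_distance = sorted(
        training_points, key=lambda item: math.dist(item[0], query)
    )
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical labelled objects: (coordinates, colour).
training_points = [
    ((1.0, 1.0), "red"),
    ((1.5, 2.0), "red"),
    ((5.0, 5.0), "blue"),
    ((5.5, 4.5), "blue"),
    ((6.0, 5.5), "blue"),
    ((9.0, 1.0), "green"),
    ((8.5, 1.5), "green"),
]

# The unknown object lies among the blue points, so its 3 nearest neighbours are blue.
label = knn_classify(training_points, (5.2, 5.1), k=3)
```

Sorting every point by distance is fine for small data sets; library implementations usually use spatial index structures to avoid comparing against every point.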
Decision Trees
A decision tree is a supervised learning technique that can be used for both
classification and regression problems, but it is mostly preferred for solving
classification problems.
It is a tree-structured classifier, where internal nodes represent the features of a
dataset, branches represent the decision rules, and each leaf node represents the
outcome.
In a decision tree, there are two types of nodes: the decision node and the leaf
node. Decision nodes are used to make a decision and have multiple branches,
whereas leaf nodes are the output of those decisions and do not contain any further
branches.
It is a graphical representation for getting all the possible solutions to a
problem/decision based on given conditions.
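To make the roles of decision nodes, branches, and leaf nodes concrete, here is a small hand-built tree and a function that walks it. The tree, its feature names, and its outcomes are hypothetical; the sketch shows only how a finished tree produces a prediction, not how a tree is learned from data.

```python
# Decision nodes are dicts with a feature to test and one branch per value.
# Leaf nodes are plain strings holding the outcome.
tree = {
    "feature": "outlook",
    "branches": {
        "sunny": {
            "feature": "humidity",
            "branches": {"high": "stay in", "normal": "play"},
        },
        "overcast": "play",
        "rainy": {
            "feature": "windy",
            "branches": {"yes": "stay in", "no": "play"},
        },
    },
}


def predict(node, sample):
    """Follow the branch matching the sample's feature value until a leaf is reached."""
    while isinstance(node, dict):
        node = node["branches"][sample[node["feature"]]]
    return node


decision = predict(tree, {"outlook": "sunny", "humidity": "high"})
```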
Naïve Bayes
Naive Bayes is used for creating classifiers. Suppose you want to sort (classify) fruits
of different kinds from a fruit basket. You may use features such as the colour, size, and
shape of a fruit. For example, any fruit that is red, round, and about 10 cm in diameter
may be considered an apple. To train the model, you would use these features and estimate
the probability that a given feature matches the desired constraints. The probabilities of
the different features are then combined to arrive at a probability that a given fruit is
an apple. Naive Bayes generally requires only a small amount of training data for
classification.
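The idea of combining per-feature probabilities can be sketched as follows. This is a minimal categorical Naive Bayes with add-one (Laplace) smoothing; the fruit samples, feature names, and function names are illustrative assumptions, and the smoothing is an addition not mentioned in the notes.

```python
from collections import Counter, defaultdict


def train(samples):
    """Count classes and per-class feature values from (features, label) pairs."""
    class_counts = Counter(label for _, label in samples)
    feature_counts = defaultdict(Counter)  # (label, feature name) -> value counts
    feature_values = defaultdict(set)      # feature name -> distinct values seen
    for features, label in samples:
        for name, value in features.items():
            feature_counts[(label, name)][value] += 1
            feature_values[name].add(value)
    return class_counts, feature_counts, feature_values


def class_scores(model, features):
    """Score each class as prior times the product of smoothed feature likelihoods."""
    class_counts, feature_counts, feature_values = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior probability of the class
        for name, value in features.items():
            seen = feature_counts[(label, name)][value]
            # Add-one smoothing keeps unseen values from zeroing the product.
            score *= (seen + 1) / (count + len(feature_values[name]))
        scores[label] = score
    return scores


# Hypothetical training basket.
samples = [
    ({"colour": "red", "shape": "round"}, "apple"),
    ({"colour": "red", "shape": "round"}, "apple"),
    ({"colour": "green", "shape": "round"}, "apple"),
    ({"colour": "yellow", "shape": "long"}, "banana"),
    ({"colour": "yellow", "shape": "long"}, "banana"),
    ({"colour": "green", "shape": "long"}, "banana"),
]

model = train(samples)
scores = class_scores(model, {"colour": "red", "shape": "round"})
best = max(scores, key=scores.get)
```

The "naive" part is the multiplication step: each feature is treated as independent of the others given the class.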
Logistic Regression
Logistic regression is used for predicting a categorical dependent variable from a given
set of independent variables.
The outcome must therefore be a categorical or discrete value, such as yes or no, 0 or 1,
true or false. Instead of giving the exact value 0 or 1, however, it gives probabilistic
values that lie between 0 and 1.
Logistic regression typically uses a logistic function to estimate the probabilities.
The assumption of a linear relationship between the independent variables and the log-odds
of the outcome is considered a major drawback of logistic regression. Despite its name,
it is used for classification rather than for predicting continuous values.
From the diagram, we can visually inspect the separation of red dots from green dots.
You may draw a boundary line to separate out these dots. Now, to classify a new
data point, you will just need to determine on which side of the line the point lies.
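The following sketch shows how a logistic function turns a linear score into a probability, and how thresholding that probability decides which side of the boundary a point falls on. The weights and bias are hypothetical values standing in for a trained model, and the 0.5 threshold is a common convention assumed here.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(weights, bias, features):
    """Probability of the positive class for one data point."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


# Hypothetical parameters of an already trained model.
weights = [0.8, -0.4]
bias = -1.0

probability = predict_probability(weights, bias, [3.0, 1.0])
label = 1 if probability >= 0.5 else 0  # assumed decision threshold of 0.5
```

The boundary line in the text corresponds to the points where the linear score z is zero, which is where the probability is exactly 0.5.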
Support Vector Machine Algorithm
Consider the diagram below, in which two different categories are separated by a decision
boundary or hyperplane.
The goal of the SVM algorithm is to create the best line or decision boundary that
can segregate n-dimensional space into classes, so that a new data point can easily be
put in the correct category in the future. This best decision boundary, the one with the
largest margin between the classes, is called a hyperplane.
SVM chooses the extreme points/vectors that help in creating the hyperplane. These
extreme cases are called support vectors, and hence the algorithm is termed the
Support Vector Machine.
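As a rough illustration, the snippet below uses a hyperplane that is assumed to be already trained and shows how a linear SVM would use it: the sign of w·x + b gives the class, and the margin width is 2 / ||w||. The values of w and b and the sample points are hypothetical; finding them is the optimisation problem that an SVM library solves, which is not shown here.

```python
import math


def decision_value(w, b, x):
    """Signed score of x relative to the hyperplane w.x + b = 0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(w, b, x):
    """Assign class +1 or -1 depending on the side of the hyperplane."""
    return 1 if decision_value(w, b, x) >= 0 else -1


def margin_width(w):
    """Distance between the two margin boundaries w.x + b = +1 and w.x + b = -1."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))


# Hypothetical hyperplane parameters (not learned in this sketch).
w = [1.0, 1.0]
b = -6.0

side_a = classify(w, b, [5.0, 5.0])
side_b = classify(w, b, [1.0, 2.0])
width = margin_width(w)

# A support vector lies exactly on a margin boundary, where |w.x + b| equals 1.
on_margin = decision_value(w, b, [3.5, 3.5])
```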
Unsupervised Learning
In this approach the output is unknown, and we have only the input variables at hand.
The algorithm learns by itself and discovers interesting structure in the data.
The goal is to decipher the underlying distribution in the data to gain more knowledge about
the data.
o Clustering: Clustering is a method of grouping objects into clusters such that the
objects with the most similarities remain in one group and have few or no similarities
with the objects of another group. Cluster analysis finds the commonalities between
the data objects and categorizes them according to the presence or absence of those
commonalities.
o Association: Association rule learning is an unsupervised learning method used for
finding relationships between variables in a large database. It determines the sets of
items that occur together in the dataset. Association rules make marketing strategies
more effective. For example, people who buy item X (say, bread) also tend to purchase
item Y (butter or jam).
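The bread-and-butter example can be quantified with two standard association-rule measures, support and confidence. The transactions below are invented for illustration, and the function names are hypothetical.

```python
# Hypothetical shopping baskets, one set of items per transaction.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "eggs"},
    {"bread", "butter", "jam"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Of the transactions containing the antecedent, the fraction that also contain the consequent."""
    return support(antecedent | consequent, transactions) / support(
        antecedent, transactions
    )


bread_butter_support = support({"bread", "butter"}, transactions)
rule_confidence = confidence({"bread"}, {"butter"}, transactions)
```

Here three of the five baskets contain both bread and butter, and three of the four baskets with bread also contain butter, so the rule "bread implies butter" has a confidence of about 0.75.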
The k-means clustering algorithm takes the unlabelled dataset as input, divides the dataset
into k clusters, and repeats the process until it finds the best clusters. The value of k
should be predetermined in this algorithm.
The k-means clustering algorithm mainly performs two tasks:
o Determines the best positions for the k centre points, or centroids, by an iterative process.
o Assigns each data point to its closest centroid. The data points that are nearest to
a particular centroid form a cluster.
Hence each cluster contains data points with some commonalities and is separated from the
other clusters.
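The two tasks above, assigning points to the nearest centroid and then moving each centroid, can be sketched as a short loop. This is a minimal version under stated assumptions: the initial centroids are simply the first k points (real implementations usually choose them more carefully), distance is Euclidean, and the sample points are made up.

```python
import math


def k_means(points, k, iterations=100):
    """Cluster points into k groups; returns (centroids, clusters)."""
    centroids = [points[i] for i in range(k)]  # naive initialisation (assumption)
    for _ in range(iterations):
        # Task 2: assign each point to its closest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Task 1: move each centroid to the mean of its assigned points.
        new_centroids = []
        for i, cluster in enumerate(clusters):
            if cluster:
                new_centroids.append(
                    tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
                )
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments are stable, so the process has converged
        centroids = new_centroids
    return centroids, clusters


# Hypothetical unlabelled data with two visible groups.
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
centroids, clusters = k_means(points, k=2)
```

The loop stops as soon as the centroids no longer move, which is the sense in which the algorithm "repeats the process until it finds the best clusters".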
The below diagram explains the working of the K-means Clustering Algorithm:
Cluster Identification
Cluster identification tells an algorithm, “Here’s some data. Now group similar things
together and tell me about those groups.” The key difference from classification is that
in classification you know what you are looking for, whereas in clustering you do not.
Clustering is sometimes called unsupervised classification because it produces the
same kind of result as classification does, but without having predefined classes.
Supervised Learning | Unsupervised Learning
The goal of supervised learning is to train the model so that it can predict the output when it is given new data. | The goal of unsupervised learning is to find the hidden patterns and useful insights from the unknown dataset.
Supervised learning can be used for those cases where we know the input as well as corresponding outputs. | Unsupervised learning can be used for those cases where we have only input data and no corresponding output data.
Supervised learning model produces an accurate result. | Unsupervised learning model may give less accurate result as compared to supervised learning.
An artificial neural network (ANN) has an input layer with many sensors that collect data
from the outside world. On the right-hand side, there is an output layer that gives us the
result predicted by the network. Between these two lie several hidden layers. Each
additional layer adds further complexity to training the network but provides better
results in most situations. Several types of architectures have been designed, which we
will discuss now.
ANN Architecture
The diagram below shows several ANN architectures that have been developed over time and
are in practice today.
Each architecture is developed for a specific type of application. Thus, when you use a
neural network for your machine learning application, you will have to either use one
of the existing architectures or design your own. The type of architecture that you finally
decide upon depends on your application needs. There is no single guideline that tells
you to use a specific network architecture.
Deep Learning
Deep learning is a subset of machine learning. Deep learning algorithms define an
artificial neural network that is designed to learn the way the human brain learns.
Deep learning models require large amounts of data that pass through multiple
layers of calculations, applying weights and biases in each successive layer to
continually adjust and improve the outcomes.
Most deep learning methods use neural network architectures, which is why deep
learning models are often referred to as deep neural networks.
The term “deep” usually refers to the number of hidden layers in the neural
network. Traditional neural networks only contain 2-3 hidden layers, while deep
networks can have as many as 150.
Deep learning models are trained by using large sets of labelled data and neural
network architectures that learn features directly from the data without the need for
manual feature extraction.
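To illustrate how data passes through successive layers with weights and biases, the sketch below runs a forward pass through a tiny network with one hidden layer. The layer sizes, weight values, and the choice of tanh as activation are assumptions made for illustration. Training, which is the step that adjusts the weights and biases, is not shown.

```python
import math


def dense_layer(inputs, weights, biases):
    """One layer: each neuron takes a weighted sum of the inputs plus a bias, then tanh."""
    return [
        math.tanh(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Pass the inputs through every layer in order and return the final activations."""
    activations = inputs
    for weights, biases in layers:
        activations = dense_layer(activations, weights, biases)
    return activations


# Hypothetical network: 2 inputs -> 3 hidden neurons -> 1 output.
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),  # hidden layer
    ([[0.7, -0.5, 0.2]], [0.05]),                                # output layer
]

output = forward([1.0, 2.0], layers)
```

A deep network follows the same pattern with many more entries in the layers list.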
Disadvantages
Some of the important points that you need to consider before using deep learning
are listed below:
Black Box approach
Duration of Development
Amount of Data
Computationally Expensive
Black Box approach
This is called a black-box approach because you do not know why the network came up
with a certain result. If a network labels an image as a dog, for example, you do not know
how it reached that conclusion. Now consider a banking application where the bank wants to
decide the creditworthiness of a client. The network will provide an answer to this
question. However, will you be able to justify it to the client? Banks need to explain to
their customers why a loan was not sanctioned.
Duration of development
The process of training a neural network is depicted in the diagram below:
You first define the problem that you want to solve, create a specification for it, decide
on the input features, design a network, deploy it, and test the output. If the output is
not as expected, take this as feedback to restructure your network. This is an iterative
process and may require several iterations until the network is fully trained to
produce the desired outputs.
Amount of data
Deep learning networks usually require a huge amount of data for training, while
traditional machine learning algorithms can be used with great success even with
just a few thousand data points. Fortunately, the amount of available data is growing at
40% per year and CPU processing power is growing at 20% per year, as seen in the diagram
given below:
Computationally Expensive
Training a neural network requires several times more computational power than running
traditional algorithms. Successful training of deep neural networks may require several
weeks of training time.
In contrast, traditional machine learning algorithms take only a few minutes or hours to
train. The amount of computational power needed for training a deep neural network also
depends heavily on the size of your data and on how deep and complex the network is.
After having an overview of what Machine Learning is, its capabilities, limitations, and
applications, let us now dive into learning “Machine Learning”.
Reinforcement Learning
Reinforcement learning aims at using observations gathered from interaction with the
environment to take actions that maximize the reward or minimize the risk. The
reinforcement learning algorithm (called the agent) continuously learns from the
environment in an iterative fashion. In the process, the agent learns from its experiences.
Reinforcement learning is a type of machine learning, and thereby also a branch of artificial
intelligence. It allows machines and software agents to automatically determine the ideal
behaviour within a specific context in order to maximize their performance. Simple reward
feedback is required for the agent to learn its behaviour; this is known as the
reinforcement signal.
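The loop of acting, receiving a reward, and updating behaviour can be sketched with a very simple agent. The example uses an epsilon-greedy strategy on a two-action environment, which is one basic technique chosen here for illustration and is not named in the notes. The reward probabilities, epsilon, step count, and seed are all assumptions.

```python
import random


def run_agent(reward_probabilities, steps=2000, epsilon=0.1, seed=0):
    """Learn the value of each action from reward feedback alone."""
    rng = random.Random(seed)
    n_actions = len(reward_probabilities)
    estimates = [0.0] * n_actions  # the agent's current estimate of each action's reward
    counts = [0] * n_actions
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)  # explore: try a random action
        else:
            action = max(range(n_actions), key=lambda a: estimates[a])  # exploit
        # The environment returns the reinforcement signal: 1 for success, 0 otherwise.
        reward = 1.0 if rng.random() < reward_probabilities[action] else 0.0
        counts[action] += 1
        # Move the running average for this action towards the observed reward.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts


# Hypothetical environment: action 1 pays off far more often than action 0.
estimates, counts = run_agent([0.2, 0.8])
preferred_action = max(range(len(estimates)), key=lambda a: estimates[a])
```

The agent is never told which action is better. It discovers this from the rewards, and over time it chooses the better action more often.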
Types of Reinforcement: There are two types of Reinforcement:
1. Positive –
Positive reinforcement occurs when an event that follows a particular behaviour
increases the strength and the frequency of that behaviour. In other words, it has a
positive effect on behaviour.
2. Negative –
Negative reinforcement is the strengthening of a behaviour because a negative
condition is stopped or avoided.
One of the most common applications of machine learning is facial recognition, and a
familiar example of this application is the iPhone X. There are many use cases for facial
recognition, mostly for security purposes such as identifying criminals, searching for
missing individuals, and aiding forensic investigations. Intelligent marketing, diagnosing
diseases, and tracking attendance in schools are some other uses.
Abbreviated as ASR, automatic speech recognition is used to convert speech into digital
text. Its applications lie in authenticating users based on their voice and performing tasks
based on the human voice inputs. Speech patterns and vocabulary are fed into the system
to train the model. Presently ASR systems find a wide variety of applications in the following
domains:
Medical Assistance
Industrial Robotics
Forensic and Law enforcement
Defence & Aviation
Telecommunications Industry
Home Automation and Security Access Control
I.T. and Consumer Electronics
Financial Services
Machine learning has many use cases in financial services. Machine learning algorithms
prove to be excellent at detecting fraud by monitoring the activities of each user and
assessing whether an attempted activity is typical of that user.
Financial monitoring to detect money laundering activities is also a critical security use case
of machine learning.
Machine Learning also helps in making better trading decisions with the help of algorithms
that can analyse thousands of data sources simultaneously. Credit scoring and underwriting
are some of the other applications.
The most common application in our day-to-day activities is the virtual personal assistants
like Siri and Alexa.
Machine Learning is improving lead scoring algorithms by including various parameters such
as website visits, emails opened, downloads, and clicks to score each lead. It also helps
businesses to improve their dynamic pricing models by using regression techniques to make
predictions.
Healthcare
A vital application of Machine Learning is the diagnosis of diseases and ailments that
are otherwise difficult to diagnose. Radiotherapy is also improving with the help of
Machine Learning.
Early-stage drug discovery is another crucial application which involves technologies such as
precision medicine and next-generation sequencing. Clinical trials cost a lot of time and
money to complete and deliver results. Applying Machine Learning based predictive
analytics could improve on these factors and give better results.
Machine Learning technologies are also critical for making outbreak predictions. Scientists
around the world are using these technologies to predict epidemic outbreaks.