
UNIT- 1

Machine Learning-
Machine = A Computer/Program/Algorithm/Device
Learning = Getting knowledge/Understanding
Machine Learning = Computer Learning/Algorithm Learning

Definitions:
• “Machine Learning is the process in which a computer program acquires the knowledge to do a task by itself, without being explicitly programmed.”
• “Machine learning is a subset of Artificial Intelligence (AI) in which computer software applications become
more accurate without being directly (explicitly) programmed.”
• “Machine learning enables a machine to automatically learn from data, improve performance from experience
and predict things without being explicitly programmed.” —Arthur Samuel
• A computer program is said to “learn” from experience (E) with respect to some class of tasks (T) and
performance measure (P), if its performance at tasks in T, as measured by P, improves with experience (E). —Tom
Mitchell
• In the field of computer science, machine learning is defined as a technique in which computer programs
automatically improve their performance through past experience.

In simple words, when we feed training data to a machine learning algorithm, the algorithm is improved/tuned
to that data. When test data (similar to the training data) is then applied, the tuned algorithm provides better
results. In other words, a machine learning algorithm learns from training data and then applies the learned
knowledge to testing data; thus the machine becomes more intelligent. The improved algorithm is called a
learned model. Learned models are used in real-life problems such as business, e-commerce websites, social
media platforms, stock market prediction, robotic car driving etc.

Artificial Intelligence (AI):-


Artificial intelligence is the branch of computer science in which machines perform tasks that require
human intelligence, e.g., regression, classification and clustering.
Data Science:-
Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to
extract knowledge from data.
Deep Learning:-
Deep learning is a subset of machine learning which is based on Artificial Neural Networks (ANN) with three or
more layers.

Relation between Artificial Intelligence (AI), Machine Learning (ML) and Deep Learning (DL)
Difference Between Machine Learning and Deep Learning :-

Differentiate between data science and machine learning:-

NOTE-
“Training data set is used to build a machine learning model.
Test data set is used to evaluate (test) the ML model.”
Difference between Artificial Intelligence (AI) and Machine Learning (ML):-

 TRADITIONAL PROGRAMMING MODEL AND MACHINE LEARNING MODEL

 ESSENTIAL MATH FOR ML AND AI OR BASIC KNOWLEDGE REQUIRED


FOR A MACHINE LEARNING ENGINEER
Machine learning is a very good, high-paying career, but it requires knowledge of multiple domains,
as listed below:
1. Basic Algebra: Variables, coefficients and functions; linear equations (such as y = b + w1x1 + w2x2);
logarithms and logarithmic equations (such as y = ln(1 + e^x));
the sigmoid function.
2. Linear Algebra:
Tensor and Tensor rank
Matrix multiplication

3. Trigonometry:
tanh (used as an activation function in ANNs).

4. Probability and Statistics: Bayes’ theorem, basic probability definitions, conditional probability,
mean, median, mode, standard deviation, outliers, decision trees, histograms.
5. Calculus (Optional for advanced topics): Concept of a derivative, Maxima and minima of a
function, Gradient or slope, Partial derivative, Chain rule (used in backpropagation rule in
ANN).
6. Boolean Algebra:
AND logic (Conjunction)
OR logic (Disjunction)
NOT logic (Negation)

7. Python Programming (Basic level only):


Defining and calling functions
For loop
if/else conditional blocks
String formatting
Basic data types (int, float, bool, str)

8. Basic knowledge of Programming Skills:


Study and Analysis of Algorithms Design
Data Analysis

9. Information Theory (Entropy, Information Gain): The entropy and information gain are
explained in chapter five in detail.

VARIOUS METHODS TO EVALUATE AN ML MODEL


The performance measures of a machine learning model or deep learning model are given
as follows:
1.17.1 For Classification Problems
1. Confusion matrix
2. Classification accuracy
3. Classification report
(i) Precision (ii) Recall or sensitivity
(iii) Specificity (iv) F1 score
4. Precision-Recall (PR) curve
5. Receiver Operating Characteristics (ROC) curve
6. PR vs ROC curve
1.17.2 For Regression Problems
1. Mean Absolute Error (MAE)
2. Mean Square Error (MSE)
3. R Squared (R2) metric
1. Confusion Matrix: A confusion matrix is the first and easiest method to measure the
performance of a classification problem. The confusion matrix is used when the
output can be two or more classes. It is a table with two dimensions,
“Actual” and “Predicted” values.
Some common terms used in confusion matrix are given below.
(i) True Positive (TP): In this case, the predicted class and the actual class are both positive
(1) or true.
(ii) True Negative (TN): In this case, the predicted class and the actual class are both
negative (0) or false.
(iii) False Positive (FP): In this case, the predicted class is true (1), but the actual class is
false (0).
(iv) False Negative (FN): In this case, the predicted class is false (0), but the actual class is
true (1).
                    Predicted Positive      Predicted Negative
Actual Positive     True Positive (TP)      False Negative (FN)
Actual Negative     False Positive (FP)     True Negative (TN)
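The four terms above can be made concrete with a short sketch that counts them from paired actual/predicted labels. The label encoding (1 = positive, 0 = negative) and the example labels are assumptions for illustration.

```python
# Count TP, TN, FP, FN from actual and predicted labels (1 = positive, 0 = negative).
def confusion_counts(actual, predicted):
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    return tp, tn, fp, fn

actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]
print(confusion_counts(actual, predicted))  # (3, 3, 1, 1)
```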

ISSUES RELATED WITH MACHINE LEARNING ARE :-

1. Data quality :
a. It is essential to have good quality data to produce quality ML algorithms and models.
b. To get high-quality data, we must implement data evaluation, integration, exploration, and governance
techniques prior to developing ML models.
c. Accuracy of ML is driven by the quality of the data.
2. Transparency :
a. It is difficult to make definitive statements on how well a model is going to generalize in new environments.
3. Manpower :
a. Manpower means having data and being able to use it without introducing bias into the model.
b. There should be enough skill sets in the organization for software development and data collection.
4. Other :
a. The most common issue with ML is people using it where it does not belong.
b. Every time there is some new innovation in ML, we see overzealous engineers trying to use it where it’s not
really necessary.
c. This used to happen a lot with deep learning and neural networks.
d. Traceability and reproduction of results are two main issues.

Some Important Issues in ML :-


1. What algorithms exist for learning general target functions?
2. With sufficient training data, when will the algorithm converge?
3. Which algorithm performs best for which types of problems?
4. How much training data is sufficient?
5. When and how can the prior knowledge of the learner guide the generalization process?
6. What is the best strategy for choosing a useful training experience?
7. What specific function should the system attempt to learn?
8. How can the learner automatically change its representation to improve performance?

 STEPS IN A MACHINE LEARNING PROCESS OR STEPS FOR MAKING ML MODEL :-


There are seven steps for making a machine learning model as listed below.
1. Gathering data/data collection
2. Preparing data/data preparation
3. Choosing a model (Algorithm)
4. Training of algorithm (Model training)
5. Evaluation (Model testing)
6. Hyperparameter tuning (Model improvement)
7. Prediction. (Use model in real life)
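The seven steps can be walked through in miniature. The sketch below uses a tiny invented 1-D dataset and a simple “midpoint of the class means” classifier as a stand-in for a real algorithm; a real project would use a library such as scikit-learn.

```python
# Toy illustration of the seven steps using a 1-D "nearest mean" classifier.
# The data and the model are invented for illustration only.

# 1-2. Gather and prepare data as (feature, label) pairs, split into train/test.
train = [(1.0, 0), (2.0, 0), (3.0, 0), (6.0, 1), (7.0, 1), (8.0, 1)]
test  = [(4.0, 0), (9.0, 1)]

# 3-4. Choose and train a model: the decision threshold is the midpoint of the
# two class means (the 1-D version of a nearest-centroid classifier).
mean0 = sum(x for x, y in train if y == 0) / sum(1 for _, y in train if y == 0)
mean1 = sum(x for x, y in train if y == 1) / sum(1 for _, y in train if y == 1)
threshold = (mean0 + mean1) / 2

def predict(x):
    return 1 if x > threshold else 0

# 5. Evaluate on the held-out test set.
accuracy = sum(predict(x) == y for x, y in test) / len(test)
print(threshold, accuracy)  # 4.5 1.0

# 6-7. Hyperparameter tuning would adjust model settings; prediction then
# applies the trained model to new, unseen inputs, e.g. predict(2.2) -> 0
```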

 APPLICATIONS OF MACHINE LEARNING :-

1. Image recognition :
a. Image recognition is the process of identifying and detecting an object or a feature in a digital image or video.
b. This is used in many applications like systems for factory automation, toll booth monitoring, and security
surveillance.
2. Speech recognition :
a. Speech Recognition (SR) is the translation of spoken words into text.
b. It is also known as Automatic Speech Recognition (ASR), computer speech recognition, or Speech To Text
(STT).
c. In speech recognition, a software application recognizes spoken words.
3. Medical diagnosis :
a. ML provides methods, techniques, and tools that can help in solving diagnostic and prognostic problems in a
variety of medical domains.
b. It is being used for the analysis of the importance of clinical parameters and their combinations for prognosis.
4. Statistical arbitrage :
a. In finance, statistical arbitrage refers to automated trading strategies that are typically short-term and
involve a large number of securities.
b. In such strategies, the user tries to implement a trading algorithm for a set of securities on the basis of
quantities such as historical correlations and general economic variables.
5. Learning associations :
Learning association is the process of discovering relations between variables in large databases.
6. Extraction :
a. Information Extraction (IE) is another application of machine learning.
b. It is the process of extracting structured information from unstructured data.

Advantages and Disadvantages of machine learning :-

Advantages of machine learning are :


1. Easily identifies trends and patterns :
a. Machine learning can review large volumes of data and discover specific trends and patterns that would not be
apparent to humans.
b. For an e-commerce website like Flipkart, it serves to understand the browsing behaviours and purchase
histories of its users to help cater to the right products, deals, and reminders relevant to them.
c. It uses the results to reveal relevant advertisements to them.
2. No human intervention needed (automation) :
a. Machine learning does not require continuous human effort, i.e., no human intervention is needed once the model is trained.
3. Continuous improvement :
a. As ML algorithms gain experience, they keep improving in accuracy and efficiency.
b. As the amount of data keeps growing, algorithms learn to make accurate predictions faster.
4. Handling multi-dimensional and multi-variety data :
a. Machine learning algorithms are good at handling data that are multi-dimensional and multi-variety, and they
can do this in dynamic or uncertain environments.
Disadvantages of machine learning are :
1. Data acquisition :
a. Machine learning requires massive data sets to train on, and these should be inclusive/unbiased, and of good
quality.
2. Time and resources :
a. ML needs enough time to let the algorithms learn and develop enough to fulfill their purpose with a
considerable amount of accuracy and relevancy.
b. It also needs massive resources to function.
3. Interpretation of results :
a. To accurately interpret the results generated by the algorithms, we must carefully choose the algorithms for our
purpose.
4. High error-susceptibility :
a. Machine learning is autonomous but highly susceptible to errors.
b. It takes time to recognize the source of the issue, and even longer to correct it.

 COMPONENTS OF A LEARNING PROCESS


The learning process of a human or a machine can be divided into four components as follows:
1. Data storage: Data is stored in hard disks and memory.
2. Abstraction: The process of extracting knowledge from stored data is called abstraction.
3. Generalization: Utilizing knowledge into useful action.
4. Evaluation: Feedback to improve performance.

 OBJECTIVE/GOAL OF MACHINE LEARNING :-


The primary purpose of machine learning is to discover patterns in data. After knowing the patterns the task is to
make predictions to solve the business problems.

 STEPS OF DEVELOPING OR DESIGNING A LEARNING SYSTEM


The steps of designing a “Checkers Game” learning system are given below:
1. Choosing Training Experience
2. Choosing Target Function
3. Choosing Representation for Target Function
4. Choosing Function Approximation Algorithm
5. Final Design
Step 1: Choose the Training Experience The first step is to choose the training data (experience) which will be
fed to the machine learning algorithm. The training data decides the failure or success of the model. Therefore,
training data should be chosen wisely.
Step 2: Choose the Target Function In this step, we determine what type of knowledge will be learned by the
machine, and how the performance of the program will be checked after training.
For example, in developing a checkers game learning program, we should know the legal moves from any board
state. After knowing all legal moves from all states, the next step is to choose the best move among them.
This target function is called “ChooseBestMove”:
ChooseBestMove : B → M
where, B = the set of all board states
M = the best possible move (resulting board state). Now choose the next target function as:
V : B → R
where, V = a target function that maps any legal board state in B to a real number
R = the set of real numbers.
Step 3: Choose a Representation for Target Function After knowing best possible move at each stage, the next
step is to represent the target function. The target function can be represented using a linear equation, a
hierarchical graph, tabular form etc.
For example, in checkers game learning, the objective function can be represented as:
V̂(b) = w0 + w1x1 + w2x2 + w3x3 + … + w6x6 …(1.2)
where, w0 – w6 = weights (numerical coefficients) of the learning algorithm
x1 – x6 = attributes of the black and red pieces placed on the checkers board.
Step 4: Choose a Function Approximation Algorithm After deciding the target function, the next step is how to
teach this target to the machine. In order to learn the target function, we require a set of training examples. Each
example specifies a board state (b) and a training value Vtrain(b).
Step 5: Final Design The final model is designed at last, when system observes a lot of training examples and
decisions.
For example, the final design of checkers play game is given below.
1. Task (T) = Play checkers game.
2. Performance measure (P) = Percentage (%) of game won.
3. Training experience (E) = Play game against itself.
4. Target function (V) = B → R = maps a board state to a real number R.
5. Target function representation: V̂(b) = w0 + w1x1 + w2x2 + w3x3 + … + w6x6 …(1.3)

The final design of checker game learning is explained as below:


• Performance System is the module which solves the given performance task (such as playing checkers) by
using the learned target function.
• The Critic takes the history (or trace) of each step of the game as input. The critic outputs training examples of
the target function.
• The Generalizer takes the training examples as input and produces a Hypothesis, which is an estimate of the
target function.
• The Experiment Generator takes the current hypothesis as input and outputs a new problem (i.e., a board
state) to explore.
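The linear target function representation V̂(b) = w0 + w1x1 + … + w6x6 can be sketched in code. The weights and the board-feature values below are invented for illustration; they are not from the text.

```python
# Evaluate the linear target function V_hat(b) = w0 + w1*x1 + ... + w6*x6.
def v_hat(weights, features):
    # weights = [w0, w1, ..., w6]; features = [x1, ..., x6] for one board state
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))

weights  = [0.5, 1.0, -1.0, 2.0, -2.0, 0.5, -0.5]   # w0 .. w6 (illustrative)
features = [12, 12, 0, 0, 1, 1]                     # x1 .. x6 (illustrative)
print(v_hat(weights, features))  # 0.5
```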

 WELL-DEFINED LEARNING PROBLEMS:-


A well-defined learning problem is easy to define and model. There are mainly three features (attributes)
of a well-defined learning problem:
• Class of Task (T)
• Performance Measure (P)
• Training Experience (E)

The following are the examples of some well-defined learning problems:-


1. Checkers game learning problem.
2. Handwriting recognition learning problem.
3. Robot car driving learning problem.
4. Spoken words recognition learning problem.
5. New astronomical structure learning problem.

Checkers Game Learning Problem


1. Class of Task (T) = Playing Checker (How to play)
2. Performance Measure (P) = % of game won against opponent
3. Training Experience (E) = Playing practice games against itself.

Handwriting Recognition Learning Problem


For example, identify a handwritten character “6” using machine learning and image processing:
• Task (T) = Recognising and classifying handwritten words from given images.
• Performance Measure (P) = Percentage (%) of words correctly classified.
• Training Experience (E) = A database of images of handwritten words.

Robot Car Driving Learning Problem


• Task (T) = Driving on a 4-lane highway using vision sensors.
• Performance Measure (P) = Average distance travelled before an error.
• Training Experience (E) = A sequence of images and steering commands recorded while observing a human
driver.

 TYPES OF MACHINE LEARNING :-


Generally, there are three types of machine learning.
1. Supervised learning
2. Unsupervised learning
3. Reinforcement learning

1. Supervised Learning:-
In supervised learning, the models are trained using labelled data. The model needs to find the mapping
function (f) between input data (x) and output data (y), i.e., y = f(x) …(1.1). Supervised learning needs
supervision to train the model; it is similar to a student learning in the presence of a teacher in the classroom.
Example: Let us consider a basket which contains some fruits, i.e., apple, banana, mango, grapes etc. The
task of the ML model is to identify the fruits and classify them accordingly. To identify the type of each fruit in
supervised learning, we provide the shape, size, colour and taste of each fruit to the model. This is called
“model training” or “training of the model.” Once training is completed, we test the model by giving it a new
set of fruits to identify. The model will then identify the fruits.
OPERATION OF SUPERVISED LEARNING
In supervised learning, models are trained using a labelled dataset. The model learns
about each type of data. Once the training step is completed, the model is tested to
check whether it predicts the correct output.
Step 1. Training Step
Let us consider a dataset of different object shapes, including triangles, squares,
polygons etc. Our first step is to train the model for each object shape:
If a shape has three equal sides, it is labelled a “Triangle”.
If a shape has four equal sides, it is labelled a “Square”.
If a shape has six equal sides, it is labelled a “Hexagon”.

Step 2. Testing Step


Our model is now trained to identify new shapes (which are similar to the training
data). To test the learnt ML model, we apply one shape (say, a triangle) at the input
and see whether the model predicts the correct output (i.e., Output = Triangle).
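The labelling rules from the training step can be written as a tiny rule-based classifier. Using the number of equal sides as the only feature is an illustrative assumption.

```python
# Rule-based shape classifier following the three labelling rules above.
def classify_shape(equal_sides):
    if equal_sides == 3:
        return "Triangle"
    if equal_sides == 4:
        return "Square"
    if equal_sides == 6:
        return "Hexagon"
    return "Unknown"   # a shape the model was never trained on

# Testing step: apply a triangle (three equal sides) and check the output.
print(classify_shape(3))  # Triangle
```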
• Detailed Steps Involved in Supervised Learning
Step 1. First determine type of training dataset.
Step 2. Collect the labelled training dataset.
Step 3. Split the dataset into two parts as training data and test data.
Step 4. Determine the features of the training dataset. These features should describe
the data completely enough for the model to predict the output.
Step 5. Find a basic standard algorithm which is to be trained e.g., a decision tree
algorithm, support vector machine (SVM) algorithm etc.
Step 6. (Training Step) Apply the training data to the standard algorithm. The training
data consists of inputs labelled (paired) or tagged with correct outputs.
Step 7. (Testing Step) Check (evaluate) the accuracy of the trained model by applying
the testing data. If the model predicts the correct output, then our ML model is
accurate and ML process is completed successfully.
Step 8. If the predicted output of the model is not accurate, repeat the training of
Step 6 until the model's accuracy improves.
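Steps 1 to 8 can be sketched with a one-level decision tree (a “decision stump”) standing in for the full decision tree algorithm named in Step 5. The dataset and the stump are invented for illustration; real work would use scikit-learn's DecisionTreeClassifier.

```python
# A decision stump: a one-level decision tree on a single numeric feature.
def fit_stump(rows):
    """Pick the threshold with the highest training accuracy (Step 6)."""
    best_t, best_acc = None, -1.0
    for t, _ in rows:                      # candidate thresholds come from the data
        acc = sum((x > t) == bool(y) for x, y in rows) / len(rows)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

# Steps 1-3: labelled data, split into training and test sets.
training = [(0.5, 0), (1.5, 0), (2.5, 0), (4.5, 1), (5.5, 1), (6.5, 1)]
testing  = [(2.0, 0), (5.0, 1)]

# Steps 5-6: choose the algorithm and train it on the training data.
threshold = fit_stump(training)

# Step 7: evaluate the trained model on the held-out test data.
test_acc = sum((x > threshold) == bool(y) for x, y in testing) / len(testing)
print(threshold, test_acc)  # 2.5 1.0
```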

TYPES OF SUPERVISED LEARNING ALGORITHM

Supervised learning can be divided into the following two types: classification and
regression.
CLASSIFICATION (DISCRETE SENSE)
In classification problems, the output data is in the form of categories (categorical data).
Classification algorithms are therefore used when the output falls into two or more
classes, e.g., Yes/No, Male/Female, True/False etc.
The task of a classification algorithm is to map the input (X) to a discrete output (Y)
variable.

The most common classification algorithms are given below.


1. Random Forest Algorithm
2. Decision Tree Algorithm
3. Support Vector Machines (SVMs) Classifiers
4. Artificial Neural Networks (ANNs) Classifiers
5. Naive Bayes Classifiers
Real-Life Classification Model Problems or Application of Supervised Learning to
Solve Classification Model Problems
The machine learning classification models are used to solve a wide range of business
problems as given below:
1. To classify whether an incoming email is spam or not (a binary classification problem)
2. To classify different types of fruits
3. To classify different types of documents
4. To classify different types of images
5. To classify different types of customers and their behaviours

2. Unsupervised Learning :- In supervised learning, both input data (x) and output data (y) are provided to
train the model, but in unsupervised learning only input data is provided. In unsupervised learning, the models
are trained using unlabelled data, i.e., only input data (x) is given. The goal of unsupervised learning is to find
the hidden patterns in the input data. Unsupervised learning methods are suitable when the output variables
(i.e., the labels) are not provided.
For example, the clustering tasks of customer segmentation, product recommendation, and friend suggestions
on social media platforms.

• Type of Unsupervised Learning:


There are two types of unsupervised learning:
1. Clustering
2. Association

CLUSTER ANALYSIS (CLUSTERING TECHNIQUE)


In the clustering technique, unlabelled data is grouped based on similarities such as shape,
size, colour, price etc. Clustering is the process of creating groups in data, such as
customers, products, documents, employees etc.

TYPES OF CLUSTERING
1. Centroid-based (partitioning) clustering
2. Density based clustering (model based)
3. Hierarchical clustering (connectivity based)
4. K-means clustering algorithm
Clustering is the task of partitioning the data set into groups, called as clusters.
The goal is to split the data in such a way that points within a single cluster are very
similar to each other.
3.5.1 Centroid Based Clustering (Partitioning)
In this type of clustering, the data is divided into non-hierarchical groups. It is also
called partitioning clustering. The most common example of centroid-based clustering is
the K-means clustering algorithm.
In K-means clustering, the data set is divided into a set of K groups. Centroid-based
algorithms are efficient but sensitive to initial conditions and outliers.

Fig. 3.3. Clusters of data in centroid based clustering

3.5.2 Density based Clustering


In this clustering, areas of high data density are grouped into clusters. Thus
arbitrarily shaped clusters are formed, as long as dense regions can be connected. These
algorithms face difficulty when the data set has varying densities or high dimensionality.
Fig. 3.4. Density based clustering

3.5.3 Hierarchical Clustering


In this clustering, the data set is divided into clusters that form a tree-like structure,
called a dendrogram. Any number of clusters can be selected by cutting the tree at the
correct level. The most common example of hierarchical clustering is the agglomerative
algorithm.

Fig. 3.5. Hierarchical clustering

3.5.4 K-means Clustering


The K-means clustering algorithm is the simplest and most commonly used clustering
algorithm. It finds cluster centres that represent certain regions of the data. The
algorithm starts with some randomly selected data points, called centroids, which serve
as the starting points of the clusters. Iterative (repeated) calculations are then used
to optimize the positions of the centroids.
The K-means algorithm stops optimizing the clusters in two cases:
1. The centroids have stabilized.
2. The defined number of iterations has been reached.

Fig. Data before K-means (left) and after K-means (right)

3.5.4.1 Working of K-means Algorithm


Step 1. First, specify the number of clusters (K) to be generated by the algorithm.
Step 2. Next, randomly select K data points and assign each data point to a cluster.
In simple words, classify the data based on the number of data points.
Step 3. Now, compute the cluster centroids.
Step 4. Next, keep iterating the following until the optimal centroids are found, i.e.,
until the assignment of data points to clusters no longer changes:
 First, compute the Within-Cluster Sum of Squared distances (WCSS).
 Assign each data point to the cluster whose centroid is closest.
 Finally, compute the centroid of each cluster by taking the average of all
data points in that cluster.
3.5.4.2 Steps for K-means Clustering Algorithm
Step 1. Select the number of clusters (K) to be generated by the algorithm.
Step 2. Select K centroids, one for each cluster.
Step 3. Calculate the Euclidean (or Manhattan) distance from each point to each
centroid, and assign each point to its nearest centroid. Thus K groups are created.
Step 4. Now compute the new centroid of each group.
Step 5. Reassign all data points based on the new centroids, and repeat Step 4 until
the positions of the centroids no longer change.
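The steps above can be sketched as a minimal 1-D K-means implementation. The data points and the fixed (rather than random) initial centroids are assumptions chosen to keep the run reproducible.

```python
# A minimal 1-D K-means sketch following the steps above.
def kmeans(points, centroids, iterations=10):
    for _ in range(iterations):
        # Assign each point to the nearest centroid (Euclidean distance in 1-D).
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its cluster's points.
        new = [sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if new == centroids:          # centroids have stabilized -> stop
            break
        centroids = new
    return centroids

print(kmeans([1.0, 2.0, 3.0, 10.0, 11.0, 12.0], [0.0, 5.0]))  # [2.0, 11.0]
```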

APPLICATIONS OF CLUSTERING TECHNIQUES IN MACHINE LEARNING

1. Identification of cancer cells: Clustering algorithms are widely used
for the identification of cancer cells. They divide cancerous and non-cancerous
data into different groups.
2. In internet search engines: Search engines like Google and Yahoo work on the
basis of clustering techniques. The search results shown are those closest to the
search query.
3. Customer segmentation: Clustering is used in market research to segment
customers based on their choices and preferences.
4. In biology: Clustering is used to classify and group different species of plants
and animals using image recognition techniques.
3. Reinforcement Learning
“Reinforcement learning is a type of machine learning method in which desired behaviour is
rewarded and undesired behaviour is punished.”
In this learning, a software or hardware agent interacts with the external environment by taking
actions. Depending upon the actions, a reward or penalty is given to the agent. The agent learns
from its rewards and penalties and improves its next actions. In this way, the machine learns to
improve itself with experience.

Some Important Terms in Reinforcement Learning


• Environment = Physical world
• Agent = A hardware (Robot) or software (Algorithm)
• State = Present situation
• Reward = Feedback from environment
• Policy = A method to map the agent's states to actions
• Value function = Expected future rewards

Reinforcement Learning Algorithm


Step 1: Prepare an agent with initial strategies.
Step 2: Observe environment and current state.
Step 3: Select optimal policy for current state and take suitable actions.
Step 4: The agent gets a reward/penalty for the action performed in Step 3.
Step 5: Now, update the initial strategies depending upon the received rewards/penalties.
Step 6: Repeat Steps 2 to 5 until the agent has learned completely/optimally.
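The agent loop above can be sketched with tabular Q-learning, a standard reinforcement learning algorithm (the text does not name a specific one). The environment (four states on a line, reward +1 for reaching the goal) and all hyperparameters are invented for illustration.

```python
# Tiny tabular Q-learning sketch of the agent loop above.
import random

N_STATES, GOAL = 4, 3
ACTIONS = [-1, +1]                       # 0 = move left, 1 = move right
Q = [[0.0, 0.0] for _ in range(N_STATES)]
alpha, gamma, epsilon = 0.5, 0.9, 0.3    # learning rate, discount, exploration

random.seed(0)
for _ in range(200):                     # episodes
    s = 0                                # Steps 1-2: initial strategy, observe state
    while s != GOAL:
        # Step 3: pick an action (mostly greedy, sometimes exploratory)
        if random.random() < epsilon:
            a = random.randrange(2)
        else:
            a = 0 if Q[s][0] >= Q[s][1] else 1
        s2 = min(max(s + ACTIONS[a], 0), N_STATES - 1)
        r = 1.0 if s2 == GOAL else 0.0   # Step 4: reward for reaching the goal
        # Step 5: update the strategy from the received reward/penalty
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2                           # Step 6: repeat until the episode ends

policy = [0 if Q[s][0] >= Q[s][1] else 1 for s in range(N_STATES - 1)]
print(policy)  # learned greedy policy; 1 = move right (toward the goal)
```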

Real Life Examples of Reinforcement Learning


• Trajectory optimization
• Robotic movement control
• Dynamic path control
• Controller optimization
• Autonomous self-driving cars (lane changing, car parking etc.)
• Industrial automation
• Trading and finance
• Natural Language Processing (NLP)
• Healthcare (chronic diseases)
• News recommendation
• Gaming applications
• Marketing and advertisements

 PERFORMANCE MEASUREMENT OF A MACHINE LEARNING ALGORITHM


Confusion Matrix in Machine Learning

The confusion matrix is a table used to determine the performance of classification models on a given
set of test data. It can only be computed if the true values for the test data are known. The matrix itself is
easy to understand, but the related terminology may be confusing. Since it shows the errors in the model's
performance in the form of a matrix, it is also known as an error matrix. Some features of the confusion
matrix are given below:
o For 2 prediction classes, the matrix is a 2×2 table; for 3 classes, a 3×3 table;
and so on.
o The matrix has two dimensions, predicted values and actual values, along with
the total number of predictions.
o Predicted values are the values predicted by the model; actual values are the true
values of the given observations.
o It looks like the table below:

                  Predicted: Yes          Predicted: No
Actual: Yes       True Positive (TP)      False Negative (FN)
Actual: No        False Positive (FP)     True Negative (TN)

The above table has the following cases:

o True Negative (TN): The model predicted No, and the actual value was also No.
o True Positive (TP): The model predicted Yes, and the actual value was also Yes.
o False Negative (FN): The model predicted No, but the actual value was Yes. This is also called
a Type-II error.
o False Positive (FP): The model predicted Yes, but the actual value was No. This is also called a Type-
I error.

Need for Confusion Matrix in Machine learning


o It evaluates the performance of classification models when they make predictions on test data,
and tells how good our classification model is.
o It tells not only the errors made by the classifier but also the type of error, i.e., whether it is
a Type-I or Type-II error.
o With the help of the confusion matrix, we can calculate different parameters for the model,
such as accuracy, precision, etc.

Example: We can understand the confusion matrix using an example.

Suppose we are trying to create a model that predicts whether or not a person has a disease. The
confusion matrix for this (with the cell values derived from the totals listed below) is:

                  Actual: Yes   Actual: No
Predicted: Yes    TP = 24       FP = 8
Predicted: No     FN = 3        TN = 65

From the above example, we can conclude that:


o The table is given for a two-class classifier with two predictions, "Yes" and "No". Here,
Yes means that the patient has the disease, and No means that the patient does not have it.
o The classifier made a total of 100 predictions. Out of these, 89 are correct predictions
and 11 are incorrect.
o The model predicted "Yes" 32 times and "No" 68 times, whereas the actual "Yes"
occurred 27 times and the actual "No" 73 times.

Calculations using Confusion Matrix:


We can perform various calculations for the model, such as the model's accuracy, using this matrix. These
calculations are given below:

o Classification Accuracy: It is one of the most important parameters for classification
problems. It defines how often the model predicts the correct output, calculated as the
ratio of the number of correct predictions made by the classifier to the total number of
predictions made. The formula is:
Accuracy = (TP + TN) / (TP + TN + FP + FN)

o Misclassification rate: It is also termed the error rate, and it defines how often the model gives
wrong predictions, calculated as the number of incorrect predictions divided by the total number
of predictions made by the classifier. The formula is:
Error rate = (FP + FN) / (TP + TN + FP + FN)

o Precision: Out of all the cases the model predicted as positive, how many were actually
positive. It can be calculated using the formula:
Precision = TP / (TP + FP)

o Recall: Out of all the actual positive cases, how many our model predicted correctly. The
recall should be as high as possible:
Recall = TP / (TP + FN)

o F-measure: If one model has low precision and high recall (or vice versa), it is difficult to
compare the models. For this purpose, we use the F-score, which evaluates recall and
precision at the same time. The F-score is maximum when recall equals precision. It can be
calculated using the formula:
F1 = 2 × (Precision × Recall) / (Precision + Recall)
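The five measures can be computed together from one confusion matrix. The counts below are illustrative, chosen to be consistent with the disease example's stated totals.

```python
# Compute accuracy, error rate, precision, recall and F1 from one confusion matrix.
TP, FP, FN, TN = 24, 8, 3, 65   # illustrative counts
total = TP + TN + FP + FN

accuracy   = (TP + TN) / total
error_rate = (FP + FN) / total
precision  = TP / (TP + FP)
recall     = TP / (TP + FN)
f1         = 2 * precision * recall / (precision + recall)

print(accuracy, error_rate, precision, round(recall, 3), round(f1, 3))
# 0.89 0.11 0.75 0.889 0.814
```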

Other important terms used in Confusion Matrix:

o Null Error rate: It defines how often our model would be incorrect if it always predicted the
majority class. As per the accuracy paradox, it is said that "the best classifier has a higher error rate
than the null error rate."
o ROC Curve: The ROC is a graph displaying a classifier's performance for all possible thresholds.
The graph is plotted between the true positive rate (on the Y-axis) and the false Positive rate (on
the x-axis).
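An ROC curve can be sketched by sweeping the decision threshold over a toy set of classifier scores and recording (FPR, TPR) at each threshold; the labels and scores below are invented for illustration.

```python
# Compute ROC points by sweeping the threshold over all distinct scores.
def roc_points(labels, scores):
    pos = sum(labels)                 # number of actual positives
    neg = len(labels) - pos           # number of actual negatives
    points = []
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= t)
        points.append((fp / neg, tp / pos))   # (false positive rate, true positive rate)
    return points

labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(roc_points(labels, scores))
```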

History of Machine Learning


A few decades ago (about 40-50 years), machine learning was science fiction, but today it is part of our
daily life. Machine learning makes our day-to-day life easier, from self-driving cars to Amazon's virtual
assistant "Alexa". The idea behind machine learning, however, is old and has a long history. Below are some
milestones that have occurred in the history of machine learning:

The early history of Machine Learning (Pre-1940):

o 1834: Charles Babbage, the father of the computer, conceived a device that could be
programmed with punch cards. Although the machine was never built, all modern computers
rely on its logical structure.
o 1936: Alan Turing published a theory of how a machine could determine and execute a set of
instructions.

The era of stored program computers:

o 1943: A neural network was first modeled with an electrical circuit. In 1950, scientists began
applying this idea and analyzing how human neurons might work.
o 1945: ENIAC, the first electronic general-purpose computer, was completed; it had to be
programmed manually. Stored-program computers such as EDSAC (1949) and EDVAC (1951)
followed.

Computing machinery and intelligence:

o 1950: Alan Turing published the seminal paper "Computing Machinery and Intelligence" on
the topic of artificial intelligence, asking, "Can machines think?"

Machine intelligence in Games:

o 1952: Arthur Samuel, a pioneer of machine learning, created a program that enabled an IBM
computer to play checkers. The more it played, the better it performed.
o 1959: The term "Machine Learning" was first coined by Arthur Samuel.

The first "AI" winter:

o The period from 1974 to 1980 was a tough time for AI and ML researchers; it came to be
called the first AI winter.
o During this period, machine translation failed to deliver on its promises, public interest in AI
declined, and government funding for research was cut.

Machine Learning from theory to reality

o 1959: The first neural network was applied to a real-world problem: removing echoes over
phone lines using an adaptive filter.
o 1985: Terry Sejnowski and Charles Rosenberg created NETtalk, a neural network that taught
itself how to correctly pronounce 20,000 words in one week.
o 1997: IBM's Deep Blue defeated world chess champion Garry Kasparov, becoming the first
computer to beat a reigning human world chess champion.

Machine Learning in the 21st century

o 2006: Computer scientist Geoffrey Hinton rebranded neural network research as "deep
learning," which has since become one of the most prominent technologies in the field.
o 2012: Google created a deep neural network that learned to recognize images of humans and
cats in YouTube videos.
o 2014: The chatbot "Eugene Goostman" was claimed to have passed the Turing Test after
convincing 33% of human judges that it was not a machine.
o 2014: DeepFace, a deep neural network created by Facebook, was claimed to recognize a
person with the same precision as a human.
o 2016: AlphaGo beat Lee Sedol, one of the world's top-ranked players, at the game of Go. In
2017 it beat the world's number one player, Ke Jie.
o 2017: Alphabet's Jigsaw team built an intelligent system that learned to detect online trolling
by reading millions of comments from different websites.

Machine Learning at present:

Machine learning research has now advanced greatly, and it is present everywhere around us: in
self-driving cars, Amazon Alexa, chatbots, recommender systems, and many more applications.
It includes supervised, unsupervised, and reinforcement learning, using clustering, classification,
decision tree, SVM, and other algorithms.

Modern machine learning models can be used for making various predictions, including weather
prediction, disease prediction, stock market analysis, etc.
