Minor Project Seminar Report
On
Biometric Traits Evaluation
Submitted To: Mr. Loveleen Kumar, Assistant Professor
Submitted By: Harshit Kapoor (16EGJCS073), Kapil Kokcha (16EGJCS088), Lakshya Sharma (16EGJCS100)
CANDIDATE DECLARATION
We hereby declare that the work presented in the minor project entitled “Biometric Traits Evaluation”, in partial fulfillment of the requirements for the award of the degree of Bachelor of Technology in Computer Science and Engineering, submitted to the Department of Computer Science & Engineering, Global Institute of Technology, Rajasthan Technical University, is a record of our own investigations carried out under the guidance of Shri Loveleen Kumar, Department of Computer Science and Engineering, Global Institute of Technology.
We have not submitted the matter presented in this project anywhere else for the award of any other degree.
Harshit Kapoor (16EGJCS073)
Kapil Kokcha (16EGJCS088)
Lakshya Sharma (16EGJCS100)
Global Institute of Technology, Jaipur
Counter Signed By
Mr. Loveleen Kumar
Assistant Professor
Department of Computer Science and Engineering
Global Institute of Technology, Jaipur
ACKNOWLEDGEMENT
It is our pleasure to be indebted to the various people who directly or indirectly contributed to the development of this work and who influenced our thinking, behavior, and actions during the course of its completion. This formal piece of acknowledgement is an attempt to express our gratitude towards the people who helped us in the successful completion of this project.
We would like to express our deepest gratitude to Mr. Loveleen Kumar, our project guide, for his guidance, precious time, and necessary advice. He was always there with his competent support and valuable suggestions throughout the development phase of the project.
We would also like to thank the supporting staff of the Department of CSE (GIT) and our parents, who supported us at every step, guided us, inspired us, and provided us all facilities so that we could achieve our goals.
Harshit Kapoor (16EGJCS073)
Kapil Kokcha (16EGJCS088)
Lakshya Sharma (16EGJCS100)
IV Year, VII Semester
Computer Science & Engineering
Global Institute of Technology, Jaipur
ABSTRACT
“Biometric Traits Evaluation” was built with the objective of studying and identifying the best algorithm for the growing Industry 4.0, helping users work efficiently and accurately. The project is built around four different algorithms (one inbuilt, three studied) and brings together various technologies such as machine learning and Python. Users are also provided with supporting services to help them choose an algorithm for their work: they can compare the algorithms according to their needs, either on an already available data set or on their own personal data set. The system is built using different machine learning algorithms, which makes it highly reliable. The algorithms are compared on the basis of accuracy and various other factors. The approach is not limited to faces; it can also work on other biometrics such as fingerprints or the eye retina. In all, it can be concluded that this comparison of different algorithms can save users considerable time, money, and effort, while also providing a sense of security and satisfaction.
The Machine Learning field evolved from the broad area of Artificial Intelligence, which aims to mimic the intelligent abilities of humans by means of machines. In the field of Machine Learning one considers the crucial question of how to make machines able to “learn”. Learning in this context is understood as inductive inference, where one observes examples that represent incomplete information about some “statistical phenomenon”. In unsupervised learning one typically tries to uncover hidden regularities (e.g. clusters) or to detect anomalies in the data (for instance some unusual machine function or a network intrusion). In supervised learning, there is a label associated with each example. The label is supposed to be the answer to a question about the example. If the label is discrete, then the task is called a classification problem; otherwise, for real-valued labels, we speak of a regression problem. Based on these examples (including the labels), one is particularly interested in predicting the answer for other cases before they are explicitly observed. Hence, learning is not only a question of remembering but also of generalization to unseen cases.
Machine Learning research has been extremely active in the last few years. The result is a large number of very accurate and efficient algorithms that are quite easy to use for a practitioner. It seems worthwhile and almost mandatory for (computer) scientists and engineers to learn how and where Machine Learning can help to automate tasks or provide predictions where humans have difficulty comprehending large amounts of data. The long list of examples where Machine Learning techniques have been successfully applied includes: text classification and categorization (for example spam filtering), network intrusion detection, bioinformatics (e.g. cancer tissue classification, gene finding), monitoring of electric appliances, optimization of hard disk caching strategies, high-energy physics particle classification, recognition of handwriting, natural scene analysis, and so forth. Some other algorithms include regression algorithms (e.g. ridge regression, regression trees), unsupervised learning algorithms (such as clustering, principal component analysis), reinforcement learning, online learning algorithms, and model-selection problems. Some of these techniques extend the applicability of Machine Learning algorithms considerably and would each require an introduction of their own.
TABLE OF CONTENTS

ACKNOWLEDGEMENT
CANDIDATE DECLARATION
ABSTRACT
LIST OF CONTENTS
LIST OF FIGURES
CHAPTER 1: INTRODUCTION
1.1 Differences between Data Mining, Machine Learning and Deep Learning
1.2 Need of Machine Learning
1.3 Goals of Machine Learning
1.4 Why the Goals of ML are Important or Desirable
1.5 Uses of Machine Learning
CHAPTER 2: MACHINE LEARNING WORKFLOW
2.1 Data Processing
2.2 Training Sets Creation
2.3 Machine Learning Algorithms Testing, Evaluation and Selection
2.4 Deployment and A/B Testing
CHAPTER 3: APPLICATIONS OF MACHINE LEARNING
3.1 Uses of Machine Learning
3.2 Prominent Sectors Using ML
CHAPTER 4: METHODS OF CLASSIFICATION
4.1 Common Techniques in Data Classification
CHAPTER 5: CLASSIFIERS AND MODELS
5.1 Decision Tree Based Methods
5.2 Linear Regression Based Methods
5.3 Neural Network
5.4 Bayesian Network
5.5 Support Vector Machine
5.6 Nearest Neighbor
CHAPTER 6: PROBLEM STATEMENT
CHAPTER 7: IMPLEMENTATION
CHAPTER 8: CONCLUSION
REFERENCES
LIST OF FIGURES
CHAPTER I
INTRODUCTION
Because of new computing technologies, machine learning today is not like machine
learning of the past. It was born from pattern recognition and the theory that
computers can learn without being programmed to perform specific tasks; researchers
interested in artificial intelligence wanted to see if computers could learn from data.
The iterative aspect of machine learning is important because as models are exposed
to new data, they are able to independently adapt. They learn from previous
computations to produce reliable, repeatable decisions and results. It’s a science that’s
not new – but one that has gained fresh momentum.
While many machine learning algorithms have been around for a long time, the ability
to automatically apply complex mathematical calculations to big data – over and over,
faster and faster – is a recent development. Here are a few widely publicized examples
of machine learning applications you may be familiar with:
• The heavily hyped, self-driving Google car? The essence of machine learning.
• Online recommendation offers such as those from Amazon and Netflix? Machine
learning applications for everyday life.
• Knowing what customers are saying about you on Twitter? Machine learning
combined with linguistic rule creation.
• Fraud detection? One of the more obvious, important uses in our world today.
Machine Learning is the idea of learning from examples and experience, without being explicitly programmed. Instead of writing code, you feed data to a generic algorithm, and it builds its own logic based on the data given.
Although all of these methods have the same goal – to extract insights, patterns and
relationships that can be used to make decisions – they have different approaches and
abilities.
Data Mining
Machine Learning
The main difference with machine learning is that just like statistical models, the goal
is to understand the structure of the data – fit theoretical distributions to the data that
are well understood. So, with statistical models there is a theory behind the model that
is mathematically proven, but this requires that data meets certain strong assumptions
too. Machine learning has developed based on the ability to use computers to probe
the data for structure, even if we do not have a theory of what that structure looks like.
The test for a machine learning model is a validation error on new data, not a
theoretical test that proves a null hypothesis. Because machine learning often uses an
iterative approach to learn from data, the learning can be easily automated. Passes are
run through the data until a robust pattern is found.
Deep Learning
Deep learning combines advances in computing power and special types of neural
networks to learn complicated patterns in large amounts of data. Deep learning
techniques are currently state of the art for identifying objects in images and words in
sounds. Researchers are now looking to apply these successes in pattern recognition to
more complex tasks such as automatic language translation, medical diagnoses and
numerous other important social and business problems.
So machine learning was developed as a new capability for computers. Today machine learning is present in so many segments of technology that we do not even realize it while using it.
The purpose of ML, in simple words, is to understand the nature of (human and other forms of) learning, and to build learning capability into computers. To be more precise, there are three facets of the goals of ML.
• To make computers smarter, more intelligent. The more direct goal in this aspect is to develop systems (programs) for specific practical learning tasks in application domains.
• To develop computational models of the human learning process and implement computer simulations.
Machine learning has several practical uses that drive the kind of real business results – such as time and money savings – that have the potential to dramatically affect the future of your organization. Research and surveys in particular show tremendous impact occurring within the customer care industry, where machine learning is allowing people to get things done more quickly and efficiently.
Machine learning has been used for image, video, and text recognition, as well as serving as the power behind recommendation engines. Today, it is being used to strengthen cyber security, ensure public safety, and improve medical outcomes. It can also help improve customer service and make cars safer.
To better understand the uses of machine learning, consider some of the instances where machine learning is applied: the self-driving Google car, cyber fraud detection, and online recommendation engines – like friend suggestions on Facebook, Netflix showcasing the movies and shows you might like, and “more items to consider” and “get yourself a little something” on Amazon – are all examples of applied machine learning. All these examples clearly show the essential role machine learning has started to take in today's data-rich world. Machines can help in filtering the useful pieces of information that drive major developments, and we are already seeing how this technology is being applied in a wide range of industries.
CHAPTER II
MACHINE LEARNING WORKFLOW
Machine learning uses algorithms that learn from data to help make better decisions; however, it is not always obvious which machine learning algorithm will be best for a particular problem. Luckily, statistics such as variable importance and model evaluation tools can help us decide which machine learning techniques to use.
• Supervised machine learning algorithms can apply what has been learned in the
past to new data using labeled examples to predict future events. Starting from the
analysis of a known training dataset, the learning algorithm produces an inferred
function to make predictions about the output values. The system is able to
provide targets for any new input after sufficient training. The learning algorithm
can also compare its output with the correct, intended output and find errors in
order to modify the model accordingly.
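As a minimal sketch of this supervised loop (assuming the scikit-learn library and its bundled iris dataset, neither of which is part of this project's own code), a model is fitted on labeled examples and its outputs are compared with the intended labels:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Labeled examples: feature vectors X with known target labels y.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)              # learn from the labeled training set

# Compare predictions with the correct, intended outputs to find errors.
errors = (model.predict(X_test) != y_test).sum()
print("misclassified:", errors, "of", len(y_test))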
The quality and quantity of the data you collect will directly determine how good your predictive model can be. Data preparation is where we load our data into a suitable place and prepare it for use in our machine learning training. This is also a good time to do any relevant visualization of your data, to help you see whether there are any useful relationships between different variables you can take advantage of, as well as to show you whether there are any data imbalances. We will also need to divide the data into two parts. The first part, used in training our model, will be the majority of the dataset. The second part will be used for evaluating our trained model's performance.
The model is initially fit on a training dataset, which is a set of examples used to fit the parameters of the model. The model is trained on the training dataset using a supervised learning method. In practice, the training dataset often consists of pairs of an input vector (or scalar) and the corresponding output vector (or scalar), which is typically denoted as the target (or label). The current model is run with the training dataset and produces a result, which is then compared with the target for each input vector in the training dataset. Based on the outcome of the comparison and the specific learning algorithm being used, the parameters of the model are adjusted. The model fitting can include both variable selection and parameter estimation.
The next step in our workflow is choosing a model. There are many models that researchers and data scientists have created over the years. Some are very well suited to image data, others to sequences (like text or music), some to numerical data, and others to text-based data. It is only at this point that data scientists start testing algorithms to build different models, to evaluate them, and to fine-tune the hyper-parameters so that the best model(s) are selected. In applied machine learning, individual algorithms need to be swapped in and out depending on which performs best for the problem and the dataset. Therefore, we focus on intuition and practical benefits over math and theory.
Once training is complete, it is time to see if the model is any good, using evaluation. This is where the dataset that we set aside earlier comes into play. Evaluation allows us to test our model against data that has never been used for training. This lets us see how the model may perform against data that it has not yet seen, and is meant to be representative of how the model might perform in the real world. A good rule of thumb is a training-evaluation split somewhere on the order of 80/20 or 70/30. Much of this is influenced by the size of the original source dataset: if you have a lot of data, perhaps you do not need as big a fraction for the evaluation dataset.
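A minimal sketch of such a split (plain numpy; the 80/20 ratio follows the rule of thumb above, and the data here is synthetic):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))            # synthetic feature matrix
y = rng.integers(0, 2, size=1000)         # synthetic labels

idx = rng.permutation(len(X))             # shuffle before splitting
cut = int(0.8 * len(X))                   # 80/20 training-evaluation split
train_idx, eval_idx = idx[:cut], idx[cut:]

X_train, y_train = X[train_idx], y[train_idx]
X_eval, y_eval = X[eval_idx], y[eval_idx] # held out, never used in training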
One has no idea which algorithm will work best on a problem before starting. Even professional data scientists cannot tell you. This problem is not restricted to the selection of machine learning algorithms: you also cannot know in advance which data transforms, and which features in the data if exposed, would best present the structure of the problem to the algorithms.
One may have some ideas. One may have some favored techniques. But how does one know that the techniques that got results last time will get good results this time? How does one know which techniques are transferable from one problem to another? The answer to these questions depends on the problem and the dataset. The choice of algorithm differs from one dataset to another and from one type of problem to another.
CHAPTER III
APPLICATIONS OF MACHINE LEARNING
Machine learning and data mining often employ the same methods and overlap significantly, but while machine learning focuses on prediction, based on known properties learned from the training data, data mining focuses on the discovery of (previously) unknown properties in the data (this is the analysis step of knowledge discovery in databases). Data mining uses many machine learning methods, but with different goals; on the other hand, machine learning also employs data mining methods as “unsupervised learning” or as a preprocessing step to improve learner accuracy.
Relation to optimization
Machine learning also has intimate ties to optimization: many learning problems are formulated as minimization of some loss function on a training set of examples. Loss functions express the discrepancy between the predictions of the model being trained and the actual problem instances (for example, in classification, one wants to assign a label to instances, and models are trained to correctly predict the pre-assigned labels of a set of examples). The difference between the two fields arises from the goal of generalization: while optimization algorithms can minimize the loss on a training set, machine learning is concerned with minimizing the loss on unseen samples.
Relation to statistics
Machine learning and statistics are closely related fields. According to Michael I. Jordan, the ideas of machine learning, from methodological principles to theoretical tools, have had a long pre-history in statistics. He also suggested the term data science as a placeholder to call the overall field.
Most industries working with large amounts of data have recognized the value of machine learning technology. By gleaning insights from this data – often in real time – organizations are able to work more efficiently or gain an advantage over competitors.
Financial services
Banks and other businesses in the financial industry use machine learning technology for two key purposes: to identify important insights in data, and to prevent fraud. The insights can identify investment opportunities, or help investors know when to trade. Data mining can also identify clients with high-risk profiles, or use cyber surveillance to pinpoint warning signs of fraud.
Government
Health care
Machine learning is a fast-growing trend in the health care industry, thanks to the advent of wearable devices and sensors that can use data to assess a patient's health in real time. The technology can also help medical experts analyze data to identify trends or red flags that may lead to higher-quality diagnoses and treatment.
Retail
Websites suggesting items you might like based on earlier purchases are using machine learning to analyze your shopping history. Retailers rely on machine learning to capture data, analyze it, and use it to personalize the shopping experience, implement marketing campaigns, optimize prices, plan merchandise supply, and gain customer insights.
Transportation
Analyzing data to identify patterns and trends is key to the transportation industry, which relies on making routes more efficient and predicting potential problems to increase profitability. The data analysis and modelling aspects of machine learning are important tools to delivery companies, public transportation, and other transportation organizations.
CHAPTER IV
METHODS OF CLASSIFICATION
Given a set of training data points along with associated training labels, determine the class label for an unlabeled test instance.
Numerous variations of this problem can be defined over different settings. Excellent overviews on data classification may be found in [39, 50, 63, 85].
Classification algorithms typically contain two phases:
• Training Phase: In this phase, a model is constructed from the training instances.
• Testing Phase: In this phase, the model is used to assign a label to an unlabeled test instance.
In some cases, such as lazy learning, the training phase is omitted entirely, and the classification is done directly from the relationship of the training instances to the test instance. Instance-based methods such as the nearest neighbor classifiers are examples of such a scenario. Even in such cases, a pre-processing phase such as the construction of a nearest neighbor index may be performed in order to ensure efficiency during the testing phase. The output of a classification algorithm may be presented for a test instance in one of two ways:
1. Discrete Label: In this case, a label is returned for the test instance.
2. Numerical Score: In this case, a numerical score is returned for each combination of class label and test instance. Note that the numerical score can be converted to a discrete label for a test instance by picking the class with the highest score for that test instance. The advantage of a numerical score is that it becomes possible to compare the relative propensity of different test instances to belong to a particular class of importance, and to rank them if needed. Such methods are used often in rare class detection problems, in which the original class distribution is highly imbalanced, and the discovery of some classes is more valuable than others.
The classification problem thus segments the unseen test instances into groups, as defined by the class label. While the segmentation of examples into groups is also done by clustering, there is a key difference between the two problems. In the case of clustering, the segmentation is done using similarities between the feature variables, with no prior understanding of the structure of the groups. In the case of classification, the segmentation is done on the basis of a training data set, which encodes knowledge about the structure of the groups in the form of a target variable. Thus, while the segmentations of the data are usually related to notions of similarity, as in clustering, significant deviations from the similarity-based segmentation may be achieved in practical settings. As a result, the classification problem is referred to as supervised learning, just as clustering is referred to as unsupervised learning. The supervision process often provides significant application-specific utility, because the class labels may represent important properties of interest. Some common application domains in which the classification problem arises are as follows:
• Customer Target Marketing: In such cases, feature variables describing the customer can be used to predict their buying interests on the basis of previous training examples. The target variable may encode the buying interest of the customer.
• Medical Disease Diagnosis: In recent years, the use of data mining methods in medical technology has gained increasing traction. The features may be extracted from the medical records, and the class labels correspond to whether or not a patient may pick up a disease in the future. In these cases, it is desirable to make disease predictions with the use of such data.
The first phase of virtually all classification algorithms is that of feature selection. In most data mining scenarios, a wide variety of features are collected by individuals who are often not domain experts. Clearly, irrelevant features may often result in poor modeling, since they are not well related to the class label. In fact, such features will typically worsen the classification accuracy because of overfitting, when the training data set is small and such features are allowed to be part of the training model. For example, consider a medical example in which the features from the blood work of different patients are used to predict a particular disease. Clearly, a feature such as the cholesterol level is predictive of heart disease, whereas a feature such as the PSA level is not predictive of heart disease. However, if a small training data set is used, the PSA level may show freak correlations with heart disease because of random variations. While the impact of a single variable may be small, the cumulative effect of many irrelevant features can be significant. This will result in a training model that generalizes poorly to unseen test instances.
Therefore, it is important to use the correct features during the training process. There are two broad kinds of feature selection methods:
1. Filter Models: In these cases, a crisp criterion on a single feature, or a subset of features, is used to evaluate their suitability for classification, independently of the classification algorithm.
2. Wrapper Models: In these cases, the feature selection process is embedded into a classification algorithm, in order to make the feature selection process sensitive to the classification algorithm. This approach recognizes the fact that different algorithms may work better with different features.
In order to perform feature selection with filter models, a number of different measures are used to quantify the relevance of a feature to the classification process. Typically, these measures compute the imbalance of the feature values over different ranges of the feature, which may be either discrete or numerical. Common examples of such measures include the Gini index, entropy, and the Fisher score.
Probabilistic methods
Probabilistic methods are the most fundamental among all data classification methods.
Probabilistic classification algorithms use statistical inference to find the best class for a given instance. In addition to simply assigning the best class like other classification algorithms, probabilistic classification algorithms output a corresponding posterior probability of the test instance being a member of each of the possible classes. The posterior probability is defined as the probability after observing the specific characteristics of the test instance. On the other hand, the prior probability is simply the fraction of training records belonging to each particular class, with no knowledge of the test instance. After obtaining the posterior probabilities, we use decision theory to determine class membership for each new instance. Basically, there are two ways in which we can estimate the posterior probabilities. In the first case, the posterior probability of a particular class is estimated by determining the class-conditional probability and the prior class probability separately and then applying Bayes' theorem. The most well-known among these is the Bayes classifier, which is known as a generative model.
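Written out (the standard statement of Bayes' theorem, added here for reference), the posterior probability of class C given a test instance x is:

P(C | x) = P(x | C) P(C) / P(x)

where P(x | C) is the class-conditional probability, P(C) is the prior probability of the class, and P(x) is the probability of observing the instance itself.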
CHAPTER V
CLASSIFIERS AND MODELS
5.1 Decision Tree Based Methods
The good part of a Tree is that it is very flexible in terms of the data types of the input and output variables, which can be categorical, binary, and numeric values. The level of the decision nodes also indicates the degree of influence of the different input variables. The limitation is that every decision boundary at each split point is a concrete binary decision. Also, the decision criterion considers only one input attribute at a time, not a combination of multiple input variables. Another weakness of a Tree is that once learned it cannot be updated incrementally: when new training data arrives, you have to throw away the old tree and retrain on all the data from scratch.
However, Trees combined with ensemble methods (Random Forest, Boosting Trees) address a lot of the limitations mentioned above. For example, Gradient Boosting Decision Trees consistently beat the performance of other ML models on many problems and are one of the most popular methods nowadays.
5.2 Linear Regression Based Methods
The basic assumption is that the output variable (a numeric value) can be expressed as a linear combination (weighted sum) of a set of input variables (which are also numeric values): y = w1x1 + w2x2 + w3x3 + ...
The whole objective of the training phase is to learn the weights w1, w2, ... by minimizing the error function loss(y, w1x1 + w2x2 + ...). Gradient descent is the classical technique for solving this problem, with the general idea of adjusting w1, w2, ... along the direction of the maximum gradient of the loss function.
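A minimal gradient-descent sketch for this setup (plain numpy, synthetic data; the learning rate and iteration count are illustrative choices):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))             # 100 examples, 2 numeric inputs
y = 3.0 * X[:, 0] - 2.0 * X[:, 1]         # true weights: w1 = 3, w2 = -2

w = np.zeros(2)
lr = 0.1
for _ in range(200):
    pred = X @ w
    grad = 2 * X.T @ (pred - y) / len(y)  # gradient of the mean squared error
    w -= lr * grad                        # step against the gradient

print(w)                                  # approaches [3.0, -2.0]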
The input variables are required to be numeric. A binary variable can be represented as 0, 1. For a categorical variable, each possible value can be represented as a separate binary variable (and hence as 0, 1). For the output, if it is a binary variable (0, 1), then a logit function is used to transform the range of -infinity to +infinity into 0 to 1. This is called logistic regression, and a different loss function (based on maximum likelihood) is used.
To avoid overfitting, regularization techniques (L1 and L2) are used to penalize large values of w1, w2, ... L1 works by adding the absolute value of w1 to the loss function, while L2 works by adding the square of w1 to the loss function. L1 has the property that it penalizes redundant or irrelevant features more (driving them to very small weights) and is a good tool to pick out highly influential features.
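A sketch of the two penalties on top of a squared-error loss (plain numpy; the data, weights, and lambda value are illustrative assumptions):

import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, 0.0, -1.0])
w = np.array([0.5, 0.5, -0.5])

def l2_loss(w, X, y, lam=0.1):            # ridge: squared error + L2 penalty
    return np.mean((X @ w - y) ** 2) + lam * np.sum(w ** 2)

def l1_loss(w, X, y, lam=0.1):            # lasso: squared error + L1 penalty
    return np.mean((X @ w - y) ** 2) + lam * np.sum(np.abs(w))

print(l2_loss(w, X, y), l1_loss(w, X, y))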
The strength of the linear model is that it has very high performance in both scoring and learning. The stochastic gradient descent-based learning algorithm is highly scalable and can handle incremental learning.
The weakness of the linear model is the linearity assumption on the input features, which is often false. Therefore, a significant feature engineering effort is usually required to transform each input feature, which typically involves a domain expert. Another common approach is to try different transformation functions 1/x, x^2, log(x) in the hope that one of them will have a linear relationship with the output. Linearity can be checked by observing whether the residual (y - predicted) is normally distributed or not (using a Q-Q plot against the Gaussian distribution).
5.3 Neural Network
Notice that Neural Networks expect binary input, which means we need to transform categorical input into multiple binary variables. A numeric input variable can be transformed into a binary-encoded string such as 101010. Categorical and numeric outputs can be transformed in a similar way.
5.4 Bayesian Network
It is essentially a dependency graph where each node represents a binary variable and each (directional) edge represents the dependency relationship. If Node A and Node B each have an edge to Node C, this means the probability of C being true depends on the different combinations of the Boolean values of A and B. Node C can in turn point to Node D, and then Node D depends on Node A and Node B as well.
The learning is about finding, at each node, the joint probability distribution over all incoming edges. This is done by counting the observed values of A, B, and C and then updating the joint probability distribution table at Node C. Once we have the probability distribution table at every node, we can compute the probability of any hidden node (output variable) from the observed nodes (input variables) by using Bayes' rule.
5.5 Support Vector Machine
Support Vector Machine takes numeric input and binary output. It is based on finding a linear plane with maximum margin to separate the two classes of output. Categorical input can be turned into numeric input as before, and categorical output can be modeled as multiple binary outputs.
With a different loss function, SVM can also do regression (called SVR). The strength of SVM is that it can handle a large number of dimensions. With the kernel trick, it can handle non-linear relationships as well.
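A minimal sketch of both uses (assuming scikit-learn, which is not the library used elsewhere in this project; the synthetic data and kernel choice are illustrative):

from sklearn.svm import SVC, SVR
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=200, n_features=20, random_state=0)
clf = SVC(kernel="rbf", C=1.0)            # kernel function handles non-linearity
clf.fit(X, y)
print(clf.predict(X[:5]))

# The same idea with a different loss function gives regression (SVR):
reg = SVR(kernel="rbf")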
5.6 Nearest Neighbor
Here we are not learning a model at all. The idea is to find the K most similar data points in the training set and use them to interpolate the output value, which is either the majority value for categorical output, or the average (or weighted average) for numeric output. K is a tunable parameter which needs to be cross-validated to pick the best value.
Nearest Neighbor requires the definition of a distance function, which is used to find the nearest neighbors. For numeric input, the common practice is to normalize the inputs by subtracting the mean and dividing by the standard deviation. Euclidean distance is typically used when the inputs are independent; otherwise the Mahalanobis distance (which accounts for correlation between pairs of input features) should be used instead. For binary attributes, the Jaccard distance can be used. The strength of K nearest neighbor is its simplicity, as no model needs to be trained. Incremental learning is automatic when more data arrives (and old data can be deleted as well). The data, however, needs to be organized in a distance-aware tree such that finding the nearest neighbor is O(log N) rather than O(N). On the other hand, the weakness of KNN is that it does not handle a high number of dimensions well.
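A minimal K-nearest-neighbor sketch (pure numpy; K and the toy data are illustrative). Inputs are normalized as described above, and Euclidean distance is used:

import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    mu, sigma = X_train.mean(axis=0), X_train.std(axis=0)
    Xn = (X_train - mu) / sigma           # normalize training inputs
    xn = (x - mu) / sigma                 # normalize the query the same way
    d = np.linalg.norm(Xn - xn, axis=1)   # Euclidean distances
    nearest = np.argsort(d)[:k]           # indices of the K closest points
    return np.mean(y_train[nearest])      # average for numeric output

X_train = np.array([[1.0, 2.0], [2.0, 1.0], [8.0, 9.0], [9.0, 8.0]])
y_train = np.array([0.0, 0.0, 1.0, 1.0])
print(knn_predict(X_train, y_train, np.array([8.5, 8.5]), k=2))  # -> 1.0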
CHAPTER VI
PROBLEM STATEMENT
The problem is very clear: there are varied demands for the best possible solution to implement biometric traits, yet we tend to apply one fixed algorithm to all possible problems without comparing algorithms or knowing the accuracy of that fixed algorithm for the particular problem. A fixed algorithm cannot give the best possible result and accuracy for every problem. We should compare the candidates and have proper knowledge of the problem and its best possible solution. In our day-to-day life we come across various kinds of biometric technologies, from identifying a person by their fingerprint to their eye retina, and with growing technology the need for correct and efficient algorithms is keenly felt. That is how we got the idea of studying different algorithms and getting the best out of them for different uses.
Face recognition is widely adopted due to its contactless and non-invasive process. Recently, it has also become popular as a commercial identification and marketing tool. Other applications include advanced human-computer interaction, video surveillance, automatic indexing of images and video databases, among others.
Face recognition began as early as 1977, with the first automated system being introduced by Kanade using a feature vector of human faces. In 1983, Sirovich and Kirby introduced principal component analysis (PCA) for feature extraction. Using PCA, Turk and Pentland developed Eigenface in 1991, which is considered a major milestone in the technology [3]. Local binary pattern analysis for texture recognition was introduced in 1994 and was later improved for facial recognition by incorporating histograms (LBPH). In 1996 Fisherface was developed, using linear discriminant analysis (LDA) for dimensionality reduction; it can identify faces in different illumination conditions, which was an issue with the Eigenface method. Viola and Jones introduced a face detection technique using Haar cascades and AdaBoost. In 2007, a face recognition technique was developed by Naruniec and Skarbek using Gabor jets that are similar to mammalian eyes. In this project, Haar cascades are used for face detection, and Eigenface and Fisherface are used for face recognition.
Users have very little knowledge about the technologies and algorithms available to them in machine learning/deep learning. Moreover, even if users are aware of an algorithm, its proper usage also needs to be clarified. If the model is trained with an incorrect data set, or the wrong model is used for a different purpose, it will never give the proper and required result to the user. Users may pick algorithms that under-fit or over-fit the model, hence consuming more time and power with no guarantee of the proper and required result.
Our system will help users first understand the algorithms available for their requirements. Users do not have to panic about the successful implementation of their model within the given time. According to the user's needs, the correct data set will be collected, and if users have their own data set to work on, they will be provided with the best algorithm for that data set. The model will be trained with the best possible algorithm, and the cases of data over-fitting and under-fitting will be avoided by using the correct data set with the correct algorithm.
We are providing users with three different algorithms, namely Support Vector Machine (SVM), Principal Component Analysis (PCA), and Linear Discriminant Analysis (LDA). They each have their own accuracy and their own ways of training and prediction. We have used Eigenface and Fisherface, hence providing a fast and accurate model with low memory consumption.
The project is aimed at the needs of the current Industry 4.0. After getting the results, the user will have the choice to pick the best algorithm for their project instead of blindly working with an algorithm without knowing its accuracy and precision. The model is trained and compared with high precision using different geometry-based algorithms. The user can compare the accuracy on their own dataset or on an already available data set. The comparison is easy for the user, and it will save them time and money; the user can work on other required things instead of implementing and testing all the algorithms.
At the initial stage, the project compares and shows results based on the face recognition system only. As a future scope, the project can be extended to other biometrics like fingerprints or the eye retina. We do not have to start the project from scratch to implement and train models for other biometrics; instead, small changes and retraining of the model with a new data set according to the user's requirement will give the user a clear idea about the model.
The purpose of this report is to be helpful to both the reader and the writer. Being logically organised, the report is able to deliver the whole content along with the concept and purpose of the project. Its aim is to convey the features, agenda, and vision of the project along with its flow. We have tried to explain it briefly, with enough information to distinguish it from other existing projects.
CHAPTER VII
IMPLEMENTATION
7.1 Introduction
The following is a report on the mini project for Robotic visual perception and autonomy. It involved building a system for face detection and face recognition using several classifiers available in the Open Computer Vision library (OpenCV). Face recognition is a non-invasive identification system and is faster than other systems, since multiple faces can be analysed at the same time. The difference between face detection and identification is that face detection is identifying a face in an image and locating it, while face recognition is making the decision “whose face is it?” using an image database. In this project both are accomplished using different techniques, described below. The report begins with a brief history of face recognition. This is followed by an explanation of Haar cascades and the Eigenface, Fisherface and Local binary pattern histogram (LBPH) algorithms. Next, the methodology and the results of the project are described. A discussion regarding the challenges and their resolutions follows. Finally, a conclusion is provided on the pros and cons of each algorithm and possible implementations.
7.2 Haar Cascades
To analyse an image using Haar cascades, a scale is selected smaller than the target image. It is then placed on the image, and the average of the pixel values in each section is taken. If the difference between two values passes a given threshold, it is considered a match. Face detection on a human face is performed by matching a combination of different Haar-like features: for example, the contrast of the forehead, eyebrows and eyes, as well as of the nose with the eyes, as shown in the figure below. A single classifier is not accurate enough, so several classifiers are combined to provide an accurate face detection system, as shown in the block diagram below in figure 3.
Each subsequent classifier in the cascade is trained only on the misses of the earlier ones to get better performance. The cascade is scaled by 1.25 and re-iterated in order to find faces of different sizes. Running the cascade on an image using conventional loops takes a large amount of computing power and time. Viola and Jones [7] used a summed area table (an integral image) to compute the matches quickly. First developed in 1984 [11], it became popular after 2001 when Viola and Jones implemented Haar cascades for face detection. Using an integral image enables matching features with a single pass over the image.
7.3 Face Recognition
7.3.1 Eigenface
Eigenface is based on PCA, classifying images by extracting features from a set of images. It is important that the images are in the same lighting condition and that the eyes are aligned in each image. Also, images used in this method must contain the same number of pixels and be in grayscale. For this example, consider an image with n x n pixels as shown in figure 4. Each row is concatenated to create a vector, resulting in a 1 x n^2 matrix. All the images in the dataset are stored in a single matrix, with one column per image. The matrix is averaged (normalised) to get an average human face. By subtracting the average face from each image vector, the features unique to each face are computed. In the resulting matrix, each column is a representation of the difference between that face and the average human face. A simplified illustration can be seen in figure 4.
The next step is computing the covariance matrix from the result. To obtain the eigenvectors from the data, eigen analysis is performed using principal component analysis. In the result, where the covariance matrix is diagonal, the direction with the highest variance is considered the 1st eigenvector. The 2nd eigenvector is the direction of the next highest variance, and it is at 90 degrees to the 1st vector. The 3rd is the direction of the next highest variance, and so on. Each column, when considered as an image and visualised, resembles a face; these are called Eigenfaces. When a face is to be recognised, the image is imported and resized to match the dimensions of the training data as mentioned above. By projecting the extracted features onto each of the Eigenfaces, weights can be calculated. These weights correspond to the similarity of the features extracted from the different image sets in the dataset to the features extracted from the input image. The input image can be identified as a face by comparing it with the whole dataset. By comparing it with each subset, the image can be identified as belonging to a particular person. By applying a threshold, detection and identification can be controlled to eliminate false detection and recognition. PCA is sensitive to large numbers and assumes that the subspace is linear. If the same face is analysed under different lighting conditions, the values mix when the distribution is calculated and cannot be effectively classified. This means that different lighting conditions pose a problem in matching the features, as they can change dramatically.
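The pipeline above can be sketched in a few lines of numpy (random arrays stand in for the flattened grayscale faces; the image size and number of components are illustrative choices):

import numpy as np

n = 32
faces = np.random.rand(10, n * n)         # 10 flattened n x n grayscale faces

mean_face = faces.mean(axis=0)            # the "average human face"
diffs = faces - mean_face                 # difference of each face to the mean

# Eigen analysis of the covariance matrix; the columns of eigvecs with the
# largest eigenvalues are the Eigenfaces (directions of highest variance).
cov = diffs.T @ diffs
eigvals, eigvecs = np.linalg.eigh(cov)
eigenfaces = eigvecs[:, np.argsort(eigvals)[::-1][:5]]   # top 5 components

weights = diffs @ eigenfaces              # project each face onto the Eigenfaces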
7.3.2 Fisherface
The Fisherface technique builds upon Eigenface and is based on LDA, derived from Ronald Fisher's linear discriminant technique used for pattern recognition. However, it uses class labels as well as data point information [6]. When reducing dimensions, PCA looks at the greatest variance, while LDA, using the labels, looks for a projection such that, when you project onto that dimension, you maximise the difference between the means of the classes normalised by their variance [6]. LDA maximises the ratio of the between-class scatter to the within-class scatter matrices. Due to this, different lighting conditions in images have a limited effect on the classification process using the LDA technique. Eigenface maximises the variations, while Fisherface maximises the mean distance between different classes and minimises the variation within classes. This enables LDA to differentiate between feature classes better than PCA, as can be observed in figure 5 [12]. Furthermore, it takes less space and is the fastest algorithm in this project. Because of this, PCA is more suitable for representing a set of data, while LDA is more suitable for classification.
Figure 5: The first components of PCA and LDA. Classes in PCA look more mixed than those of LDA.
7.3.3 Local Binary Pattern Histogram (LBPH)
Local binary patterns were proposed as classifiers in computer vision in 1990 by Li Wang [4]. The combination of LBP with histograms of oriented gradients, introduced in 2009, increased its performance on certain datasets [5]. For feature encoding, the image is divided into cells (4 x 4 pixels). Moving in a clockwise or counter-clockwise direction, the surrounding pixel values are compared with the central one, as shown in figure 6. The value of intensity or luminosity of each neighbour is compared with the centre pixel; depending on whether the difference is higher or lower than 0, a 1 or a 0 is assigned to the location. The result provides an 8-bit value for the cell. The advantage of this technique is that even if the luminosity of the image is changed, as in figure 7, the result is the same as before. Histograms are used in larger cells to find the frequency of occurrence of values, making the process faster. By analysing the results in the cell, edges can be detected as the values change. By computing the values of all cells and concatenating the histograms, feature vectors can be obtained. Images can be classified by processing them with an ID attached. Input images are classified using the same process and compared with the dataset, and a distance is obtained. By setting a threshold, it can be determined whether the face is known or unknown. Eigenface and Fisherface compute the dominant features of the whole training set, while LBPH analyses them individually.
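A minimal sketch of the encoding for one 3 x 3 neighbourhood (pure numpy; the clockwise neighbour ordering and the sample values are illustrative choices):

import numpy as np

def lbp_code(cell):
    center = cell[1, 1]
    # neighbours in clockwise order starting at the top-left pixel
    neighbours = [cell[0, 0], cell[0, 1], cell[0, 2], cell[1, 2],
                  cell[2, 2], cell[2, 1], cell[2, 0], cell[1, 0]]
    bits = [1 if p >= center else 0 for p in neighbours]
    return sum(b << i for i, b in enumerate(bits))   # one 8-bit LBP value

cell = np.array([[90, 120, 60],
                 [70, 100, 130],
                 [110, 80, 140]])
print(lbp_code(cell))   # histograms of these codes form the feature vector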
7.4 Methodology
Below are the methodology and descriptions of the applications used for data gathering, face detection, training, and face recognition. The project was coded in Python using a mixture of the IDLE and PyCharm IDEs.
7.4.1 Face Detection
The first stage was creating a face detection system using Haar cascades. Although training is required for creating new Haar cascades, OpenCV ships with a robust set of pre-trained Haar cascades that was used for the project. Using face cascades alone caused random objects to be identified, so eye cascades were incorporated to obtain stable face detection. The flowchart of the detection system can be seen in figure 8. Face and eye classifier objects are created using the classifier class in OpenCV through cv2.CascadeClassifier() and loading the respective XML files. A camera object is created using cv2.VideoCapture() to capture images. By using CascadeClassifier.detectMultiScale(), objects of various sizes are matched and their locations are returned. Using the location data, the face is cropped for further verification. The eye cascade is used to verify that there are two eyes in the cropped face. If satisfied, a marker is placed around the face to illustrate that a face was detected in that location.
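A minimal sketch along these lines (the cascade file names are the stock ones shipped with opencv-python; paths, parameter values, and the two-eye check are illustrative):

import cv2

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
eye_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_eye.xml")

cap = cv2.VideoCapture(0)                 # default camera
ret, frame = cap.read()
cap.release()
if not ret:
    raise SystemExit("could not read a frame from the camera")

gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
faces = face_cascade.detectMultiScale(gray, scaleFactor=1.25, minNeighbors=5)
for (x, y, w, h) in faces:
    roi = gray[y:y + h, x:x + w]          # crop the face for verification
    eyes = eye_cascade.detectMultiScale(roi)
    if len(eyes) >= 2:                    # require two eyes for a stable hit
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)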
For this project three algorithms are implemented independently. These are Eigenface, Fisherface and Local binary pattern histograms respectively. All three can be implemented using the OpenCV libraries. There are three stages for the face recognition, as follows:
The application starts with a request for a name to be entered, which is stored with an ID in a text file. The face detection system forms the first half. However, before capturing begins, the application checks the brightness levels and will capture only if the face is well illuminated. Furthermore, after the face is detected, the position of the eyes is analysed. If the head is tilted, the application automatically corrects the orientation. These two additions were made considering the requirements of the Eigenface algorithm. The image is then cropped and saved using the ID as a filename so it can be identified later. A loop runs this program until 50 viable images are collected from the person. This application made data collection efficient.
cv2.face.createEigenFaceRecognizer()
1. Takes in the number of components for the PCA for creating Eigenfaces. The OpenCV documentation mentions that 80 can provide satisfactory reconstruction capabilities.
2. Takes in the threshold for recognising faces. If the distance to the likeliest Eigenface is above this threshold, the function will return -1, which can be used to state that the face is unrecognisable.
cv2.face.createFisherFaceRecognizer()
1. The first argument is the number of components for the LDA for the creation of Fisherfaces. OpenCV mentions it should be kept at 0 if uncertain.
2. Similar to the Eigenface threshold; returns -1 if the threshold is passed.
cv2.face.createLBPHFaceRecognizer()
1. The radius from the centre pixel to build the local binary pattern.
2. The number of sample points to build the pattern. Having a considerable number will slow down the computer.
3. The number of cells to be created along the X axis.
4. The number of cells to be created along the Y axis.
5. A threshold value similar to Eigenface and Fisherface; if the threshold is passed, the object will return -1.
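A sketch of creating the three recogniser objects with these parameters (the factory names follow the older OpenCV 3.x cv2.face API quoted above; newer builds rename them to EigenFaceRecognizer_create() and so on, and every parameter value shown is an illustrative choice):

import cv2

# Older cv2.face API (OpenCV 3.x); adjust names to your installed version.
eigen_rec = cv2.face.createEigenFaceRecognizer(num_components=80, threshold=4000.0)
fisher_rec = cv2.face.createFisherFaceRecognizer(num_components=0)  # 0: let OpenCV decide
lbph_rec = cv2.face.createLBPHFaceRecognizer(radius=1, neighbors=8,
                                             grid_x=8, grid_y=8, threshold=100.0)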
Recogniser objects are created, and images are imported, resized, converted into numpy arrays, and stored in a vector. The ID of each image is gathered by splitting the file name and is stored in another vector. By using FaceRecognizer.train(NumpyImages, IDs), all three of the objects are trained. It must be noted that resizing the images was required only for Eigenface and Fisherface, not for LBPH. Next, the configuration model is saved as an XML file using FaceRecognizer.save(FileName). In this project, all three are trained and saved through one application for convenience. The flow chart for the trainer is shown in figure 10.
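A training sketch following these steps (the "dataset" directory and the "id_name.jpg" filename convention are illustrative assumptions, as is the 200 x 200 size):

import os
import cv2
import numpy as np

images, ids = [], []
for fname in os.listdir("dataset"):
    img = cv2.imread(os.path.join("dataset", fname), cv2.IMREAD_GRAYSCALE)
    img = cv2.resize(img, (200, 200))     # needed for Eigen/Fisherface only
    images.append(np.asarray(img, dtype=np.uint8))
    ids.append(int(fname.split("_")[0]))  # ID parsed from the file name

recognizer = cv2.face.createEigenFaceRecognizer()
recognizer.train(images, np.asarray(ids)) # the same call trains all three models
recognizer.save("eigen_model.xml")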
A face recogniser object is created using the desired parameters. The face detector is used to detect faces in the image, which are cropped and passed on to be recognised. This is done using the same technique used for the image capture application. For each face detected, a prediction is made using FaceRecognizer.predict(), which returns the ID of the class and a confidence value. The process is the same for all algorithms, and if the confidence is higher than the set threshold, the ID is -1. Finally, names from the text file with IDs are used to display the name and confidence on the screen. If the ID is -1, the application will print “unknown face” without the confidence level. The flow chart for the application is shown in figure 11.
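A recognition sketch for one cropped face (the model file, image size, and threshold handling mirror the training sketch above and are illustrative; older OpenCV 3.x builds expose load(), newer ones read()):

import cv2

recognizer = cv2.face.createEigenFaceRecognizer()
recognizer.load("eigen_model.xml")        # newer builds use read() instead

face = cv2.imread("cropped_face.jpg", cv2.IMREAD_GRAYSCALE)
face = cv2.resize(face, (200, 200))

label, confidence = recognizer.predict(face)  # label is -1 past the threshold
if label == -1:
    print("unknown face")
else:
    print("ID %d, confidence %.1f" % (label, confidence))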
CHAPTER VIII
CONCLUSION
In the near future, the world is set to witness tremendous growth in smart apps, virtual assistants, and widespread use of Artificial Intelligence. The mobile market will expand through the use of machine learning, and we will soon enter the era of self-driven vehicles (they have already been launched for testing and trials). Machine Learning is already an incredibly powerful tool that has been solving complicated problems. Although new Machine Learning tools will pop up now and then, the skills required to tune them and jazz them up will forever be in demand. Regarding job opportunities, Machine Learning has a significant role to play; there is no aspect of life on which Machine Learning has not left its mark. As the amount of data proliferates, the need for engineers and scientists has increased and will keep growing. In order to understand and control the subtleties and pitfalls of Machine Learning, skilled personnel will be required, because what appears to be a well-tuned, simple machine is capable of leading you off target from your desired results. Companies and industries rely heavily on Machine Learning, and so there is an incredible opportunity in the field. The demand for Machine Learning Engineers will keep growing, and you can get in on the action. Groups including Google, Quora, and Facebook hire people who know machine learning. There is intensive research going on in machine learning in the top universities of the world. There is no upper limit to the earnings of machine learning experts in the top companies. Machine Learning is a harbinger of potential growth for people and the economy.
So far we have just removed the veneer from the surface. There is a whole lot more that Machine Learning has yet to achieve and introduce. There is hardly any application for which Machine Learning cannot be used for detection and prediction. Despite the contradictions in perspectives, it is assured that in the future, the gap between demand and supply in Data Science and Machine Learning skills can only be bridged by providing a workforce that can deal with Machine Learning's intricacies, given the advantages of Machine Learning. Companies will plunge into tapping algorithmic models that can improve and enhance their operations and customer-facing functions. The algorithms will take business to whole new levels. We have already seen how technology has replaced human beings in the financial market and many other areas for the better, taking the load of cumbersome and labour-intensive work off human shoulders. It therefore would not be wrong to say that Machine Learning has a bright future ahead that will help human beings enter a new, changed era. Over time, more dramatic evolution will bring in more positive changes.
REFERENCES
1. Jayesh Bapu Ahire, "The Artificial Neural Networks", August 24, 2018. https://www.datasciencecentral.com/profiles/blogs/the-artificial-neural-networks
2. Stephen DeAngelis, "Machine Learning", November 04, 2014. https://www.enterrasolutions.com/blog/machine-learning-short-primer/ (see also http://fgiasson.com/blog/index.php/2017/03/10/a-machine-learning-workflow/)
3. Mariane Davids, "Applications of Machine Learning", January 9, 2017. https://blog.robotiq.com/5-applications-of-machine-learning
4. Bernard Marr, "A Short History of Machine Learning", February 19, 2016. https://www.forbes.com/sites/bernardmarr/2016/02/19/a-short-history-of-machine-learning-every-manager-should-read/#74ed2c0815e7
5. Arvind Narayanan, "Language necessarily contains human biases, and so will machines trained on language corpora", Freedom to Tinker, August 24, 2016.