
A

Seminar Report
On

“BIOMETRIC TRAITS EVALUATION: MACHINE LEARNING”

Submitted in partial fulfilment for the award of the degree of


BACHELOR OF TECHNOLOGY
In
COMPUTER SCIENCE ENGINEERING

Submitted To:
Mr. Loveleen Kumar, Assistant Professor

Submitted By:
Harshit Kapoor (16EGJCS073)
Kapil Kokcha (16EGJCS088)
Lakshya Sharma (16EGJCS100)

DEPARTMENT OF COMPUTER SCIENCE ENGINEERING
GLOBAL INSTITUTE OF TECHNOLOGY
JAIPUR (RAJASTHAN) - 302022
2019-20
Candidate’s Declaration

We hereby declare that the work, which is being presented in the minor project, entitled “Biometric Traits Evaluation”, in partial fulfillment for the award of the Degree of “Bachelor of Technology” in the Department of Computer Science and Engineering, submitted to the Department of Computer Science & Engineering, Global Institute of Technology, Rajasthan Technical University, is a record of our own investigations carried out under the guidance of Shri Loveleen Kumar, Department of Computer Science and Engineering, Global Institute of Technology.

We have not submitted the matter presented in this project anywhere for the award
of any other Degree.
Harshit Kapoor (16EGJCS073)
Kapil Kokcha (16EGJCS088)
Lakshya Sharma (16EGJCS100)
Global Institute of Technology, Jaipur

Counter Signed by
Loveleen Kumar, Assistant Professor
Department of Computer Science and Engineering
Global Institute of Technology, Jaipur

ACKNOWLEDGEMENT
It is our pleasure to be indebted to the various people who directly or indirectly contributed to the development of this work and who influenced our thinking, behaviour, and actions during the course of its completion. This formal piece of acknowledgement is an attempt to express our feeling of gratitude towards the people who helped us in the successful completion of our project.

We would like to express our deepest gratitude to Mr. Loveleen Kumar, our project guide, for his guidance, precious time and necessary advice. He was always there with his competent support and valuable suggestions throughout the development phase of the project.

A special thanks goes out to the Head of Department (Computer Science & Engineering), Mr. Girraj Khandelwal, who has always given a patient hearing to all our doubts and provided useful suggestions for completing the project and report.

We would also like to thank the supporting staff of the Department of CSE (GIT) and our parents, who always supported us at every step, guided us, inspired us, and provided us all facilities so that we could achieve our goals.

Harshit Kapoor (16EGJCS073)
Kapil Kokcha (16EGJCS088)
Lakshya Sharma (16EGJCS100)
IV Year, VII Semester
Computer Science & Engineering
Global Institute of Technology, Jaipur

ABSTRACT

“Biometric Traits Evaluation” was built with the objective of studying and identifying the best algorithm to work with in the growing Industry 4.0, helping us work efficiently and accurately. The project is built with 4 different algorithms (1 inbuilt, 3 studied). The project incorporates various technologies like machine learning, Python, etc. Users are also provided with other services in order to help them choose an algorithm for their work. The user can compare the algorithms according to their needs on an already present data set or using their own personal data set. The system is built using different machine learning algorithms, which makes the system highly reliable. According to their requirements, users can compare and check the algorithms. The algorithms are compared on the basis of their accuracy and various other factors. Not only on faces: we can work on different biometrics like fingerprints or eye retina. In all, it can be concluded that this comparison and work on different algorithms can help users save much time, money and effort, and also provide a feeling of security and satisfaction.

The machine learning field evolved from the large area of artificial intelligence, which aims to mimic the intelligent abilities of humans by machines. In the field of machine learning one considers the crucial question of how to make machines able to “learn”. Learning in this context is understood as inductive inference, where one observes examples that represent incomplete information about some “statistical phenomenon”. In unsupervised learning one typically tries to uncover hidden regularities (e.g. clusters) or to detect anomalies in the data (for instance, some unusual machine behaviour or a network intrusion). In supervised learning, there is a label associated with each example. It is supposed to be the answer to a question about the example. If the label is discrete, then the task is called a classification problem; otherwise, for real-valued labels, we speak of a regression problem. Based on these examples (including the labels), one is particularly interested in predicting the answer for other examples before they are explicitly observed. Hence, learning is not only a question of remembering but also of generalization to unseen examples.

Machine learning research has been extremely active over the last few years. The result is a large number of very accurate and efficient algorithms that are quite easy to use for a practitioner. It seems rewarding and almost mandatory for (computer) scientists and engineers to learn how and where machine learning can help to automate tasks or provide predictions where humans have difficulty comprehending large amounts of data. The long list of examples where machine learning techniques have been successfully applied includes: text classification and categorization (for example, spam filtering), network intrusion detection, bioinformatics (e.g. cancer tissue classification, gene finding), monitoring of electric appliances, optimization of hard disk caching strategies, high-energy physics particle classification, handwriting recognition, natural scene analysis, and so forth. Some other algorithms include regression algorithms (e.g. ridge regression, regression trees), unsupervised learning algorithms (such as clustering, principal component analysis), reinforcement learning, online learning algorithms, and model-selection problems. Some of these methods extend the applicability of machine learning algorithms considerably and would each require an introduction of their own.

TABLE OF CONTENTS

ACKNOWLEDGEMENT
CANDIDATE'S DECLARATION
ABSTRACT
LIST OF CONTENTS
LIST OF FIGURES
LIST OF TABLES

CHAPTER 1: INTRODUCTION
1.1 Differences between Data Mining, Machine Learning and Deep Learning
1.2 Need of Machine Learning
1.3 Goals of Machine Learning
1.4 Why the Goals of ML are Important or Desirable
1.5 Uses of Machine Learning
CHAPTER 2: MACHINE LEARNING WORKFLOW
2.1 Data Processing
2.2 Training Sets Creation
2.3 Machine Learning Algorithms Testing, Evaluation and Selection
2.4 Deployment and A/B Testing
CHAPTER 3: APPLICATIONS OF MACHINE LEARNING
3.1 Uses of Machine Learning
3.2 Prominent Sectors Using ML
CHAPTER 4: METHODS OF CLASSIFICATION
4.1 Common Techniques in Data Classification
CHAPTER 5: CLASSIFIERS AND MODELS
5.1 Decision Tree Based Methods
5.2 Linear Regression Based Methods
5.3 Neural Network
5.4 Bayesian Network
5.5 Support Vector Machine
5.6 Nearest Neighbor
CHAPTER 6: DEVELOP A MACHINE LEARNING MODEL
6.1 Definition of the Problem
6.2 Face Recognition
6.2.1 Facial Recognition System
6.2.2 The History of Face Recognition
6.2.3 Current Scenario
6.2.4 Proposed System
6.2.5 Features of Proposed System
6.3 Project Scope
6.4 Future Scope
6.5 Organization of the Report
CHAPTER 7: FACE DETECTION & FACE RECOGNITION
7.1 Introduction
7.2 Face Detection using Haar Cascades
7.3 Face Recognition
7.3.1 Eigenface
7.3.2 Fisherface
7.3.3 Local Binary Pattern Histogram
7.4 Methodology
7.4.1 Face Detection
7.4.2 Face Recognition Process
CHAPTER 8: CONCLUSION
REFERENCES

LIST OF FIGURES

Figure 2.1: Machine learning workflow
Figure 4.1: Feature selection process
Figure 5.1: Decision tree
Figure 5.2: Linear Regression
Figure 5.3: Neural Network
Figure 5.4: Bayesian Network
Figure 5.5: Nearest Neighbor

CHAPTER I

INTRODUCTION

Because of new computing technologies, machine learning today is not like machine
learning of the past. It was born from pattern recognition and the theory that computers
can learn without being programmed to perform specific tasks; researchers interested
in artificial intelligence wanted to see if computers could learn from data. The iterative
aspect of machine learning is important because as models are exposed to new data,
they are able to independently adapt. They learn from previous computations to produce
reliable, repeatable decisions and results. It’s a science that’s not new – but one that has
gained fresh momentum.

While many machine learning algorithms have been around for a long time, the ability
to automatically apply complex mathematical calculations to big data – over and over,
faster and faster – is a recent development. Here are a few widely publicized examples
of machine learning applications you may be familiar with:

• The heavily hyped, self-driving Google car? The essence of machine learning.

• Online recommendation offers such as those from Amazon and Netflix? Machine
learning applications for everyday life.

• Knowing what customers are saying about you on Twitter? Machine learning
combined with linguistic rule creation.

• Fraud detection? One of the more obvious, important uses in our world today.

Machine learning is a method of data analysis that automates analytical model building.
It is a branch of artificial intelligence based on the idea that systems can learn from data,
identify patterns and make decisions with minimal human intervention.

Machine learning is a subfield of artificial intelligence (AI). The goal of machine learning generally is to understand the structure of data and fit that data into models that can be understood and used by humans. Although machine learning is a field within computer science, it differs from traditional computational approaches. In traditional computing, algorithms are sets of explicitly programmed instructions used by computers to calculate or solve problems. Machine learning algorithms instead allow computers to train on data inputs and use statistical analysis in order to output values that fall within a specific range. Because of this, machine learning helps computers build models from sample data in order to automate decision-making processes based on data inputs.

Machine learning is the idea of learning from examples and experience, without being explicitly programmed. Instead of writing code, you feed data to a generic algorithm, and it builds logic based on the data given.

1.1 Differences between Data Mining, Machine Learning and Deep Learning

Although all of these methods have the same goal – to extract insights, patterns and
relationships that can be used to make decisions – they have different approaches and
abilities.

Data Mining

Data mining can be considered a superset of many different methods to extract insights
from data. It might involve traditional statistical methods and machine learning. Data
mining applies methods from many different areas to identify previously unknown
patterns from data. This can include statistical algorithms, machine learning, text
analytics, time series analysis and other areas of analytics. Data mining also includes
the study and practice of data storage and data manipulation.

Machine Learning

The main difference with machine learning is that just like statistical models, the goal
is to understand the structure of the data – fit theoretical distributions to the data that
are well understood. So, with statistical models there is a theory behind the model that
is mathematically proven, but this requires that data meets certain strong assumptions
too. Machine learning has developed based on the ability to use computers to probe the data for structure, even if we do not have a theory of what that structure looks like. The
test for a machine learning model is a validation error on new data, not a theoretical test
that proves a null hypothesis. Because machine learning often uses an iterative approach
to learn from data, the learning can be easily automated. Passes are run through the data
until a robust pattern is found.

Deep learning

Deep learning combines advances in computing power and special types of neural
networks to learn complicated patterns in large amounts of data. Deep learning
techniques are currently state of the art for identifying objects in images and words in
sounds. Researchers are now looking to apply these successes in pattern recognition to
more complex tasks such as automatic language translation, medical diagnoses and
numerous other important social and business problems.

1.2 Need of Machine Learning

Machine learning is a branch that arose out of artificial intelligence (AI). Applying AI, we wanted to build advanced or intelligent machines. But except for a few simple tasks, such as finding the shortest path between points A and B, we were unable to program more difficult and constantly evolving challenges. There was a realization that the only way to achieve this was to let machines learn for themselves, much like a child learning from its own experience.

So machine learning was developed as a new capability for computers. And now machine learning is present in so many segments of technology that we do not even realize it while using it.

1.3 Goals of Machine Learning

The purpose of ML, in simple words, is to understand the nature of (human and other forms of) learning, and to build learning capability into computers. To be more precise, there are three aspects to the goals of ML.


• To make computer systems smarter and more intelligent. The more direct goal in this aspect is to develop systems (applications) for specific practical learning tasks in application domains.

• To develop computational models of the human learning process and carry out computer simulations.

• To explore new learning methods and develop general learning algorithms independent of applications.

1.4 Why the goals of ML are important or desirable

Machine learning has several practical uses that drive the kind of real business results – such as time and money savings – that have the potential to dramatically affect the future of an organization. Research and surveys in particular show tremendous impact occurring within the customer care industry, where machine learning is letting people get things done more quickly and efficiently.

Through virtual assistant solutions, machine learning automates tasks that would otherwise need to be performed by a live agent – such as changing a password or checking an account balance. This frees up valuable agent time that can be spent on the kind of customer care that humans perform best: high-touch, complicated decision-making that is not as easily handled by a machine.

1.5 Uses of Machine Learning

Conventionally, data analysis was always characterized by trial and error, an approach that becomes difficult when data sets are large and heterogeneous. Machine learning comes as the answer to all this chaos by proposing clever alternatives for analyzing massive volumes of data. By developing fast and efficient algorithms and data-driven models for real-time processing of data, machine learning is able to deliver accurate results and analysis.


Machine learning has been used for image, video, and text recognition, as well as serving as the power behind recommendation engines. Today, it is being used to strengthen cybersecurity, ensure public safety, and improve medical outcomes. It can also help improve customer service and make cars safer.

To better understand the uses of machine learning, consider some of the instances where machine learning is applied: the self-driving Google car, cyber fraud detection, and online recommendation engines – like friend suggestions on Facebook, Netflix showcasing the movies and shows you might like, and “more items to consider” and “get yourself a little something” on Amazon – are all examples of applied machine learning. All these examples clearly state the essential role machine learning has started to take in today's data-rich world. Machines can help in filtering useful pieces of information that assist in major developments, and we are already seeing how this technology is being applied in a wide range of industries.


CHAPTER II

MACHINE LEARNING WORKFLOW

Machine learning uses algorithms that are trained on data to help make better decisions; however, it is not always obvious which machine learning algorithm is going to be best for a certain problem. Luckily, information such as variable importance and model evaluation tools can help us choose which machine learning techniques to use.

Machine learning algorithms are often categorized as supervised or unsupervised.


Supervised algorithms require a data scientist or data analyst with machine learning
skills to provide both input and desired output, in addition to furnishing feedback about
the accuracy of predictions during algorithm training. Data scientists determine which
variables, or features, the model should analyze and use to develop predictions. Once
training is complete, the algorithm will apply what was learned to new data.

Some machine learning methods


• Supervised machine learning algorithms can apply what has been learned in the past
to new data using labeled examples to predict future events. Starting from the
analysis of a known training dataset, the learning algorithm produces an inferred
function to make predictions about the output values. The system is able to provide
targets for any new input after sufficient training. The learning algorithm can also
compare its output with the correct, intended output and find errors in order to
modify the model accordingly.

• In contrast, unsupervised machine learning algorithms are used when the


information used to train is neither classified nor labeled. Unsupervised learning
studies how systems can infer a function to describe a hidden structure from
unlabeled data. The system doesn’t figure out the right output, but it explores the
data and can draw inferences from datasets to describe hidden structures from
unlabeled data.


• Semi-supervised machine learning algorithms fall somewhere in between


supervised and unsupervised learning, since they use both labeled and unlabeled
data for training – typically a small amount of labeled data and a large amount of
unlabeled data. The systems that use this method are able to considerably improve
learning accuracy. Usually, semi-supervised learning is chosen when the acquired labeled data requires skilled and relevant resources to train on and learn from, whereas acquiring unlabeled data generally doesn't require additional resources.

• Reinforcement machine learning algorithms constitute a learning method that interacts with its environment by producing actions and discovering errors or rewards. Trial-and-error search and delayed reward are the most relevant characteristics of reinforcement learning. This method allows machines and software agents to automatically determine the ideal behavior within a specific context in order to maximize performance. Simple reward feedback is required for the agent to learn which action is best; this is known as the reinforcement signal.

2.1 Data Processing

This step is concerned with data acquisition, data analysis, data normalization, data transformation, and then data integration and data filtering/slicing/reduction, with which we can create a series of different training sets or training corpuses that will then lead (after proper splitting) to the creation of the training, validation and test sets. This step is very significant because the quality and quantity of data that you gather will directly determine how good your predictive model can be. In data preparation, we load our data into a suitable place and prepare it for use in our machine learning training. This is also a good time to do any relevant visualization of your data, to help you see whether there are any useful relationships between different variables you can take advantage of, as well as to show you whether there are any data imbalances. We will also need to divide the data into two parts. The first part, used in training our model, will be the majority of the dataset. The second part will be used for evaluating our trained model's performance.
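As an illustrative sketch (not part of the project's own code, and assuming pandas and scikit-learn are installed), the following Python snippet runs these preparation steps on a synthetic stand-in dataset: inspection, normalization, and splitting.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in dataset: 200 samples, 4 numeric features, binary label.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(200, 4)), columns=["f1", "f2", "f3", "f4"])
df["label"] = (df["f1"] + df["f2"] > 0).astype(int)

# Inspection: summary statistics and class balance reveal imbalances early.
print(df.describe())
print(df["label"].value_counts())

# Normalization: zero mean and unit variance per feature.
X = StandardScaler().fit_transform(df[["f1", "f2", "f3", "f4"]])
y = df["label"].to_numpy()

# Split: the larger part trains the model, the smaller part evaluates it.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)
```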


2.2 Training Sets creation

The model is initially fit on a training dataset, which is a set of examples used to fit the parameters of the model. The model is trained on the training dataset using a supervised learning method. In practice, the training dataset often contains pairs of an input vector (or scalar) and the corresponding output vector (or scalar), which is typically denoted as the target (or label). The current model is run on the training dataset and produces a result, which is then compared with the target for each input vector in the training dataset. Based on the result of the comparison and the specific learning algorithm being used, the parameters of the model are adjusted. The model fitting can include both variable selection and parameter estimation.
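A minimal sketch of this fit-and-compare loop, using scikit-learn's logistic regression on a tiny hypothetical training set (illustrative only, not the project's actual model; the fitting algorithm adjusts the parameters internally):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical training pairs: input vectors X and target labels y.
X_train = np.array([[0.1, 1.2], [1.5, 0.3], [0.2, 0.9], [1.8, 0.1]])
y_train = np.array([0, 1, 0, 1])

model = LogisticRegression()
model.fit(X_train, y_train)  # parameters are adjusted to reduce training error

# Compare the model's output with the target for each training vector.
predictions = model.predict(X_train)
print("targets:    ", y_train)
print("predictions:", predictions)
```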

2.3 Machine Learning Algorithms testing, evaluation and selection

The next step in our workflow is choosing a model. There are many models that researchers and data scientists have created over the years. Some are very well suited for image data, others for sequences (like text, or music), some for numerical data, and others for text-based data. It is only at this point that the data scientists start testing algorithms to build different models, to evaluate them and to fine-tune the hyper-parameters so that the best model(s) are selected. In applied machine learning, individual algorithms should be swapped in and out depending on which performs best for the problem and the dataset. Therefore, we will focus on intuition and practical benefits over math and theory.
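For instance, a sketch of such a comparison (assuming scikit-learn and using the public Iris dataset as a stand-in for the project's own data) evaluates a few candidate models with cross-validation:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Candidate models are swapped in and out; the best performer is kept.
candidates = {
    "decision tree": DecisionTreeClassifier(random_state=0),
    "SVM": SVC(),
    "k-NN": KNeighborsClassifier(n_neighbors=5),
}
for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=5)  # 5-fold cross-validation
    print(f"{name}: mean accuracy = {scores.mean():.3f}")
```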

2.4 Deployment and A/B testing

Once training is complete, it is time to see if the model is any good, using evaluation. This is where the dataset that we set aside earlier comes into play. Evaluation allows us to test our model against data that has never been used for training. This metric allows us to see how the model might perform against data that it has not yet seen. It is meant to be representative of how the model might perform in the real world. A good rule of thumb is to use a training-evaluation split somewhere on the order of 80/20 or 70/30. Much of this is influenced by the size of the original source dataset. If you have a lot of data, perhaps you don't need as big a fraction for the evaluation dataset.
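A small sketch of this held-out evaluation, assuming scikit-learn and using the public digits dataset as a stand-in:

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

X, y = load_digits(return_X_y=True)

# Hold out 20% of the data; it is never shown to the model during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

model = LogisticRegression(max_iter=2000).fit(X_train, y_train)

# Evaluation on the held-out set approximates real-world performance.
print("held-out accuracy:", model.score(X_test, y_test))
```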

Figure 2.1: Machine learning workflow

One has no idea which algorithm will work best on a problem before starting. Even professional data scientists cannot tell you. This issue is not restricted to the selection of machine learning algorithms. You also cannot know in advance which data transforms, and which features in the data, if exposed, would best present the structure of the problem to the algorithms.

One may have some ideas. One may have some favoured techniques. But how does one know that the techniques that got results last time will get good results this time? How does one know that the techniques are transferable from one problem to another? The answer to these questions depends on the problem and the dataset. The choice of algorithm differs from one dataset to another and with the type of problem.


CHAPTER III

APPLICATIONS OF MACHINE LEARNING

3.1 Uses of machine learning

Machine learning, reorganized as a separate field, began to flourish in the 1990s. The field changed its goal from achieving artificial intelligence to tackling solvable problems of a practical nature. It shifted focus away from the symbolic approaches it had inherited from AI, and towards methods and models borrowed from statistics and probability theory. It also benefited from the increasing availability of digitized data, and the ability to distribute it via the Internet.

Relation to data mining

Machine learning and data mining often employ the same methods and overlap significantly, but while machine learning focuses on prediction, based on known properties learned from the training data, data mining focuses on the discovery of (previously) unknown properties in the data (this is the analysis step of knowledge discovery in databases). Data mining uses many machine learning methods, but with different goals; on the other hand, machine learning also employs data mining methods as "unsupervised learning" or as a preprocessing step to improve learner accuracy. Much of the confusion between these two research communities (which do often have separate conferences and separate journals, ECML PKDD being a major exception) comes from the basic assumptions they work with: in machine learning, performance is usually evaluated with respect to the ability to reproduce known knowledge, while in knowledge discovery and data mining (KDD) the key task is the discovery of previously unknown knowledge. Evaluated with respect to known knowledge, an uninformed (unsupervised) method will easily be outperformed by other supervised methods, while in a typical KDD task, supervised methods cannot be used because of the unavailability of training data.

Relation to optimization

Machine learning also has intimate ties to optimization: many learning problems are formulated as minimization of some loss function on a training set of examples. Loss functions express the discrepancy between the predictions of the model being trained and the actual problem instances (for example, in classification, one wants to assign a label to instances, and models are trained to correctly predict the pre-assigned labels of a set of examples). The difference between the two fields arises from the goal of generalization: while optimization algorithms can minimize the loss on a training set, machine learning is concerned with minimizing the loss on unseen samples.

Relation to statistics

Machine learning and statistics are closely related fields. According to Michael I. Jordan, the ideas of machine learning, from methodological principles to theoretical tools, have had a long pre-history in statistics. He also suggested the term data science as a placeholder to call the overall field.

3.2 Prominent Sectors Using ML

Most industries working with large amounts of data have recognized the value of machine learning technology. By gleaning insights from this data – often in real time – organizations are able to work more efficiently or gain an advantage over competitors.

Financial services

Banks and other businesses in the financial industry use machine learning technology for two key purposes: to identify important insights in data, and to prevent fraud. The insights can identify investment opportunities, or help investors know when to trade. Data mining can also identify clients with high-risk profiles, or use cyber surveillance to pinpoint warning signs of fraud.

Government

Government agencies such as public safety and utilities have a particular need for machine learning, since they have multiple sources of data that can be mined for insights. Analyzing sensor data, for example, identifies ways to increase efficiency and save money. Machine learning can also help detect fraud and minimize identity theft.

Health care

Machine learning is a fast-growing trend in the health care industry, thanks to the advent of wearable devices and sensors that can use data to assess a patient's health in real time. The technology can also help medical experts analyze data to identify trends or red flags that may lead to improved diagnoses and treatment.

Retail

Websites suggesting items you might like based on earlier purchases are using machine learning to analyze your shopping history. Retailers rely on machine learning to capture data, analyze it and use it to personalize the shopping experience, implement marketing campaigns, optimize prices, plan merchandise supply, and gain customer insights.

Transportation

Analyzing data to identify patterns and trends is key to the transportation industry, which relies on making routes more efficient and predicting potential problems to increase profitability. The data analysis and modelling aspects of machine learning are important tools for delivery companies, public transportation and other transportation organizations.


Marketing and sales

Websites recommending items you might like based on previous purchases are using
machine learning to analyze your buying history – and promote other items you'd be
interested in. This ability to capture data, analyze it and use it to personalize a shopping
experience (or implement a marketing campaign) is the future of retail.


CHAPTER IV

METHODS OF CLASSIFICATION

The problem of data classification has numerous applications in a wide variety of mining applications. This is because the problem attempts to learn the relationship between a set of feature variables and a target variable of interest. Since many practical problems can be expressed as associations between feature and target variables, this provides a broad range of applicability of this model. The problem of classification may be stated as follows:

Given a set of training data points along with associated training labels, determine the class label for an unlabeled test instance.

Numerous variations of this problem can be defined over different settings. Excellent overviews of data classification can be found in [39, 50, 63, 85]. Classification algorithms typically contain two phases:

• Training Phase: In this phase, a model is constructed from the training instances.
• Testing Phase: In this phase, the model is used to assign a label to an unlabeled test instance.

In some cases, such as lazy learning, the training phase is omitted entirely, and the classification is performed directly from the relationship of the training instances to the test instance. Instance-based methods such as nearest neighbor classifiers are examples of such a scenario. Even in such cases, a pre-processing phase such as nearest neighbor index construction may be performed in order to ensure efficiency during the testing phase. The output of a classification algorithm may be presented for a test instance in one of two ways:

1. Discrete Label: In this case, a label is returned for the test instance.

2. Numerical Score: In this case, a numerical score is returned for each class label and test instance combination.


Note that the numerical score can be converted to a discrete label for a test instance by choosing the class with the highest score for that test instance. The advantage of a numerical score is that it becomes possible to compare the relative propensity of different test instances to belong to a particular class of importance, and rank them if needed. Such methods are used frequently in rare class detection problems, in which the original class distribution is highly imbalanced, and the discovery of some classes is more valuable than others. The classification problem thus segments the unseen test instances into groups, as defined by the class label. While the segmentation of examples into groups is also performed by clustering, there is a key difference between the two problems. In the case of clustering, the segmentation is performed using similarities between the feature variables, with no prior understanding of the structure of the groups. In the case of classification, the segmentation is performed on the basis of a training data set, which encodes knowledge about the structure of the groups in the form of a target variable. Thus, while the segmentations of the data are usually related to notions of similarity, as in clustering, significant deviations from the similarity-based segmentation may be achieved in practical settings. As a result, the classification problem is referred to as supervised learning, just as clustering is referred to as unsupervised learning. The supervision process often provides significant application-specific utility, because the class labels may represent important properties of interest. Some common application domains in which the classification problem arises are as follows:

• Customer Target Marketing: Since the classification problem relates feature variables to target classes, this method is extremely popular for the problem of customer target marketing. In such cases, feature variables describing the customer can be used to predict their buying interests on the basis of previous training examples. The target variable may encode the buying interest of the customer.

• Medical Disease Diagnosis: In recent years, the use of data mining methods in medical technology has gained increasing traction. The features can be extracted from the medical records, and the class labels correspond to whether or not a patient may pick up a disease in the future. In these cases, it is desirable to make disease predictions with the use of such data.

• Supervised Event Detection: In many temporal scenarios, class labels may be associated with time stamps corresponding to unusual events. For example, an intrusion activity may be represented as a class label. In such cases, time-series classification methods can be very useful.

• Multimedia Data Analysis: It is often desirable to perform classification of large volumes of multimedia data such as photographs, videos, audio or other more complex multimedia data. Multimedia data analysis can often be challenging, because of the complexity of the underlying feature space and the semantic gap between the feature values and corresponding inferences.

• Biological Data Analysis: Biological data is often represented as discrete sequences, in which it is desirable to predict the properties of particular sequences. In some cases, the biological data is also expressed in the form of networks. Therefore, classification methods can be applied in a variety of different ways in this scenario.

• Document Categorization and Filtering: Many applications, such as newswire services, require the classification of large numbers of documents in real time. This application is referred to as document categorization, and is an important area of research in its own right.

• Social Network Analysis: Many forms of social network analysis, such as collective classification, associate labels with the underlying nodes. These are then used in order to predict the labels of other nodes. Such applications are very useful for predicting useful properties of actors in a social network. The variety of problems that can be addressed by classification algorithms is significant, and covers many domains.

• Technique-centered: The problem of data classification can be solved using several classes of methods such as decision trees, rule-based methods, neural networks, SVM methods, nearest neighbor methods, and probabilistic methods.

• Data-Type Centered: Many different data types are created by different applications. Some examples of different data types include text, multimedia, uncertain data, time series, discrete sequences, and network data. Each of these different data types requires the design of different techniques, each of which can be quite different.

• Variations on Classification Analysis: Numerous variations on the standard classification problem exist, which deal with more challenging scenarios such as rare class learning, transfer learning, semi-supervised learning, or active learning. Alternatively, different variations of classification, such as ensemble analysis, can be used in order to improve the effectiveness of classification algorithms. These problems are of course closely related to problems of model evaluation.

4.1 Common Techniques in Data Classification

Feature Selection Methods

The first phase of virtually all classification algorithms is that of feature selection. In most data mining scenarios, a wide variety of features are collected by individuals who are often not domain experts. Clearly, the irrelevant features may often result in poor modeling, since they are not well related to the class label. In fact, such features will typically worsen the classification accuracy because of overfitting, when the training data set is small and such features are allowed to be part of the training model. For example, consider a medical application where the features from the blood work of different patients are used to predict a particular disease. Clearly, a feature such as the cholesterol level is predictive of heart disease, whereas a feature such as the PSA level is not. However, if a small training data set is used, the PSA level may show freak correlations with heart disease because of random variations. While the impact of a single variable may be small, the cumulative impact of many irrelevant features can be significant. This will result in a training model that generalizes poorly to unseen test instances.

Therefore, it is important to use the correct features during the training process. There are two broad kinds of feature selection methods:

Figure 4.1: Feature selection process

1. Filter Models: In these cases, a crisp criterion on a single feature, or a subset of features, is used to evaluate their suitability for classification. This method is independent of the specific algorithm being used.

2. Wrapper Models: In these cases, the feature selection process is embedded into a classification algorithm, in order to make the feature selection process sensitive to the classification algorithm. This approach recognizes the fact that different algorithms may work better with different features.

In order to perform feature selection with filter models, a number of different measures are used to quantify the relevance of a feature to the classification process. Typically, these measures compute the imbalance of the feature values over different ranges of the feature, which may be either discrete or numerical; common examples of such measures include the Gini index, entropy, and the Fisher score.
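As one illustration (not from the original text), the following sketch applies a filter model with mutual information as the relevance measure, using scikit-learn on a public dataset; a wrapper model could instead use a tool such as recursive feature elimination around a specific classifier.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, mutual_info_classif

X, y = load_breast_cancer(return_X_y=True)

# Filter model: score each feature against the class label on its own,
# independently of whichever classifier is used afterwards.
selector = SelectKBest(score_func=mutual_info_classif, k=10)
X_reduced = selector.fit_transform(X, y)
print("kept", X_reduced.shape[1], "of", X.shape[1], "features")
```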

Probabilistic methods

Probabilistic methods are among the most fundamental of all data classification methods. Probabilistic classification algorithms use statistical inference to find the best class for a given instance. In addition to simply assigning the best class, as other classification algorithms do, probabilistic classification algorithms output a corresponding posterior probability of the test instance being a member of each of the possible classes. The posterior probability is defined as the probability after observing the specific characteristics of the test instance. On the other hand, the prior probability is simply the fraction of training records belonging to each particular class, with no knowledge of the test instance. After obtaining the posterior probabilities, we use decision theory to determine class membership for each new instance. Basically, there are two ways in which we can estimate the posterior probabilities. In the first case, the posterior probability of a particular class is estimated by determining the class-conditional probability and the class prior separately and then applying Bayes' theorem to find the parameters. The most widely known among these is the Bayes classifier, which is referred to as a generative model.
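A minimal sketch of such a generative classifier, using scikit-learn's Gaussian naive Bayes on a tiny hypothetical data set, showing both the posterior probabilities and the resulting hard label:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Tiny hypothetical training set: two numeric features, two classes.
X = np.array([[1.0, 2.1], [1.2, 1.9], [3.8, 4.0], [4.1, 3.7]])
y = np.array([0, 0, 1, 1])

clf = GaussianNB().fit(X, y)  # estimates class priors and class-conditionals

# Posterior probability of each class for an unseen test instance,
# obtained via Bayes' theorem, in addition to the hard label.
test = np.array([[3.5, 3.9]])
print("posteriors:", clf.predict_proba(test))
print("label:     ", clf.predict(test))
```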


CHAPTER V

CLASSIFIERS AND MODELS

In machine learning and statistics, classification is a supervised learning approach in which the computer program learns from the data input given to it and then uses this learning to classify new observations. The data set may simply be bi-class (like identifying whether a person is male or female, or whether an email is spam or non-spam) or it may be multi-class too. Some examples of classification problems are: speech recognition, handwriting recognition, biometric identification, document classification and so on.

Here we have the types of classification algorithms in machine learning:

1. Linear classifiers: logistic regression, naive Bayes classifier
2. Support vector machines
3. Decision trees
4. Boosted trees
5. Random forest
6. Neural networks
7. Nearest neighbor

5.1 Decision Tree based methods

The basic learning approach is to recursively divide the training data into buckets of homogeneous members through the most discriminative dividing criteria. The measure of "homogeneity" is based on the output label: when it is a numeric value, the measurement can be the variance of the bucket; when it is a category, the measure will be the entropy or Gini index of the bucket. During learning, various dividing criteria will be tried (in a greedy manner); when the input is a category (Mon, Tue, Wed ...), it will first be turned binary (isMon, isTue, isWed ...) and the true/false is then used as a decision boundary to evaluate the homogeneity; when the input is a numeric or ordinal value, the lessThan/greaterThan at each training data input value will be used as the decision boundary. The training process stops when there is no significant gain in homogeneity from further splitting the tree. The members of the bucket represented at a leaf node will vote for the prediction: the majority wins when the output is a category, and the member average is taken when the output is numeric.

Figure 5.1: Decision tree

The good part of the tree is that it is very flexible in terms of the data types of input and output variables, which can be categorical, binary and numeric. The level of the decision nodes also indicates the degree of influence of different input variables. The limitation is that each decision boundary at each split point is a concrete binary decision. Also, the decision criterion considers only one input attribute at a time, not a combination of multiple input variables. Another weakness of the tree is that once learned it cannot be updated incrementally: when new training data arrives, you have to throw away the old tree and retrain from scratch on all the data.

However, trees, when combined with ensemble methods (random forest, boosting trees), address a lot of the limitations mentioned above. For example, gradient boosted decision trees frequently beat the performance of other ML models on many problems and are one of the most popular methods nowadays.
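A short sketch (illustrative only, assuming scikit-learn) comparing a single entropy-based decision tree with an ensemble of gradient boosted trees on the public Iris dataset:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# A single tree: recursive splits on the most discriminative criteria.
tree = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X_tr, y_tr)
print("single tree accuracy:", tree.score(X_te, y_te))

# An ensemble of boosted trees usually improves on a single tree.
gbt = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
print("boosted trees accuracy:", gbt.score(X_te, y_te))
```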

5.2 Linear regression based methods

The basic assumption is that the output variable (a numeric value) can be expressed as a linear combination (weighted sum) of a set of input variables (which are also numeric values):

y = w1x1 + w2x2 + w3x3 + ...

The whole objective of the training phase is to learn the weights w1, w2 ... by minimizing the error function loss(y, w1x1 + w2x2 + ...). Gradient descent is the classical technique for solving this problem, with the general idea of adjusting w1, w2 ... along the direction of the maximum gradient of the loss function.
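A bare-bones sketch of this gradient descent procedure in plain NumPy, on hypothetical data generated from known weights (so the result can be checked):

```python
import numpy as np

# Hypothetical data generated from y = 2*x1 + 3*x2 plus a little noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = X @ np.array([2.0, 3.0]) + 0.1 * rng.normal(size=100)

w = np.zeros(2)          # weights w1, w2, initialized to zero
lr = 0.1                 # learning rate (step size)
for _ in range(200):
    error = X @ w - y                 # prediction minus target
    grad = 2 * X.T @ error / len(y)   # gradient of the squared-error loss
    w -= lr * grad                    # step against the gradient
print("learned weights:", w)          # should approach [2.0, 3.0]
```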

The input variables are required to be numeric. A binary variable can be represented as 0, 1. For a categorical variable, every possible value can be represented as a separate binary variable (and hence as 0, 1). For the output, if it is a binary variable (0, 1) then a logit function is used to transform the range of -infinity to +infinity into 0 to 1. This is called logistic regression, and a different loss function (based on maximum likelihood) is used.

To avoid overfitting, regularization techniques (L1 and L2) are used to penalize large values of w1, w2 ... L1 works by adding the absolute value of w1 to the loss function, while L2 adds the square of w1 to the loss function. L1 has the property that it penalizes redundant or irrelevant features more (driving their weights to very small values) and is a good tool to pick out the especially influential features.

The strength of the linear model is that it has very high performance in both scoring and learning. The stochastic gradient descent-based learning algorithm is highly scalable and can handle incremental learning.


Figure 5.2: Linear Regression

The weakness of the linear model is the linearity assumption on the input features, which is often false. Therefore, a significant feature engineering effort is required to transform each input feature, which usually involves a domain expert. Another common approach is to throw in different transformation functions 1/x, x^2, log(x) in the hope that one of them will have a linear relationship with the output. Linearity can be checked by observing whether the residual (y - predicted) is normally distributed or not (by plotting it against the Gaussian distribution).

5.3 Neural Network

A neural network can be considered as multiple layers of perceptrons (each being a logistic regression unit with multiple binary inputs and one binary output). Having multiple layers is equivalent to: z = logit(v1y1 + v2y2 + ...), where y1 = logit(w11x1 + w12x2 + ...). This multi-layer model enables a neural network to learn non-linear relationships between input x and output z. The typical learning technique is "backward error propagation", where the error is propagated from the output layer back to the input layer to adjust the weights.

Figure 5.3: Neural Network

Notice that this formulation of a neural network expects binary input, which means we need to transform categorical input into multiple binary variables. A numeric input variable can be transformed into a binary-encoded string such as 101010. Categorical and numeric output can be transformed in a similar way.
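A small sketch of a one-hidden-layer network of logistic units, using scikit-learn's MLPClassifier on a toy non-linear problem (note that modern implementations accept numeric rather than strictly binary inputs):

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# A non-linearly separable toy problem that a linear model cannot solve.
X, y = make_moons(n_samples=400, noise=0.2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# One hidden layer of logistic units; the weights are adjusted by
# backward error propagation during fit().
net = MLPClassifier(hidden_layer_sizes=(8,), activation="logistic",
                    max_iter=2000, random_state=0).fit(X_tr, y_tr)
print("test accuracy:", net.score(X_te, y_te))
```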

5.4 Bayesian Network

It is essentially a dependency graph where each node represents a binary variable and each (directed) edge represents a dependency relationship. If node A and node B each have an edge to node C, this means the probability of C being true depends on the particular combinations of the Boolean values of A and B. Node C can point to node D, and then node D depends on node A and node B as well.

Learning is about finding, at each node, the joint probability distribution over all incoming edges. This is done by counting the observed values of A, B and C and then updating the joint probability distribution table at node C. Once we have the probability distribution table at every node, we can compute the probability of any hidden node (output variable) from the observed nodes (input variables) by using Bayes' rule.
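A tiny pure-Python sketch of this counting-based learning step, for a hypothetical network where node C has parents A and B:

```python
from collections import Counter

# Hypothetical observations of three Boolean variables, where C depends on A and B.
data = [(0, 0, 0), (0, 1, 0), (1, 0, 1), (1, 1, 1),
        (1, 1, 1), (0, 0, 0), (1, 0, 0), (0, 1, 1)]

# Learning = counting: estimate P(C=1 | A, B) for every combination of parents.
counts, positives = Counter(), Counter()
for a, b, c in data:
    counts[(a, b)] += 1
    positives[(a, b)] += c

for parents in sorted(counts):
    p = positives[parents] / counts[parents]
    print(f"P(C=1 | A={parents[0]}, B={parents[1]}) = {p:.2f}")
```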


Figure 5.4: Bayesian Network

The strength of a Bayesian network is that it is highly scalable and can learn incrementally, because all we do is count the observed variables and update the probability distribution tables. Similar to neural networks, a Bayesian network expects all data to be binary; a categorical variable will need to be decomposed into multiple binary variables as described above. A numeric variable is generally not a good fit for a Bayesian network.

5.5 Support Vector Machine

A support vector machine takes numeric input and produces binary output. It is based on finding a linear plane with maximum margin that separates the two classes of output. Categorical input can be turned into numeric input as before, and categorical output can be modeled as multiple binary outputs. With a different loss function, SVM can also perform regression (called SVR). The strength of SVM is that it can handle a large number of dimensions, and with the kernel trick it can handle non-linear relationships as well.
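A brief sketch (illustrative only, assuming scikit-learn) contrasting a maximum-margin linear SVM with a kernel SVM on a toy problem that is not linearly separable:

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Concentric circles: not linearly separable in the original input space.
X, y = make_circles(n_samples=300, factor=0.4, noise=0.05, random_state=0)

linear = SVC(kernel="linear").fit(X, y)   # maximum-margin linear plane
rbf = SVC(kernel="rbf").fit(X, y)         # kernel handles the non-linearity

print("linear kernel accuracy:", linear.score(X, y))
print("RBF kernel accuracy:   ", rbf.score(X, y))
```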


5.6 Nearest Neighbor

We are not learning a model at all. The idea is to find the K most similar data points from the training set and use them to interpolate the output value, which is either the majority value for categorical output, or the average (or weighted average) for numeric output. K is a tunable parameter which needs to be cross-validated to choose the best value.

Figure 5.5: Nearest Neighbor

Nearest neighbor requires the definition of a distance function used to find the nearest neighbors. For numeric input, the common practice is to normalize it by subtracting the mean and dividing by the standard deviation. Euclidean distance is commonly used when the inputs are independent; otherwise the Mahalanobis distance (which accounts for correlation between pairs of input features) should be used instead. For binary attributes, the Jaccard distance can be used. The strength of K nearest neighbor is its simplicity, as no model needs to be trained. Incremental learning is automatic when more data arrives (and old data can be deleted as well). Data, however, needs to be organized in a distance-aware tree such that finding the nearest neighbor is O(log N) rather than O(N). On the other hand, the weakness of KNN is that it does not handle a high number of dimensions well.
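A compact sketch combining normalization, Euclidean-distance KNN and cross-validated choice of K, using scikit-learn on the public Wine dataset (a stand-in, not the project's data):

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Normalize (subtract mean, divide by standard deviation) before the
# Euclidean distance is computed; cross-validate to choose the best K.
pipe = make_pipeline(StandardScaler(), KNeighborsClassifier())
search = GridSearchCV(
    pipe, {"kneighborsclassifier__n_neighbors": [1, 3, 5, 7, 9]}, cv=5)
search.fit(X_tr, y_tr)

print("best K:", search.best_params_["kneighborsclassifier__n_neighbors"])
print("test accuracy:", search.score(X_te, y_te))
```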


CHAPTER VI

DEVELOP A MACHINE LEARNING MODEL

6.1 Definition of the problem

The problem is very clear: we have various demands for the best possible solution to implement biometric traits. We often just implement a fixed algorithm for all possible problems without comparing and knowing the accuracy of that fixed algorithm for the particular problem. A fixed algorithm cannot give the best possible result and accuracy for every problem. We should compare algorithms and have proper knowledge of the problem and its best possible solution. In our day-to-day life we come across various types of technologies, from identifying a person by their fingerprint to their eye retina. And with growing technology, the need for correct and efficient algorithms arises. That is how we got the idea of studying different algorithms and getting the best out of them for different uses.

6.2 Face Recognition

6.2.1 Facial recognition system

A facial recognition system is a technology capable of identifying or verifying a


person from a digital image or a video frame from a video source. There are
multiple methods in which facial recognition systems work, but in general, they
work by comparing selected facial features from a given image with faces within a
database. It is also described as a Biometric Artificial Intelligence based application
that can uniquely identify a person by analysing patterns based on the person's
facial textures and shape. While initially a form of computer application, it has seen
wider uses in recent times on mobile platforms and in other forms of technology,
such as robotics. It is typically used as access control in security systems and can
be compared to other biometrics such as fingerprint or eye iris recognition systems.
Although the accuracy of facial recognition system as a biometric technology is
lower than iris recognition and fingerprint recognition, it is widely adopted due to
its contactless and non-invasive process. Recently, it has also become popular as a
commercial identification and marketing tool. Other applications include advanced
human-computer interaction, video surveillance, and automatic indexing of images and video databases, among others.

6.2.2 The History of Face Recognition

Face recognition began as early as 1977, with the first automated system being introduced by Kanade using a feature vector of human faces. In 1987, Sirovich and Kirby introduced principal component analysis (PCA) for feature extraction. Using PCA, the Eigenface method was developed by Turk and Pentland in 1991 and is considered a major milestone in the technology [3]. Local binary pattern analysis for texture recognition was introduced in 1994 and was later improved for facial recognition by incorporating histograms (LBPH). In 1996, Fisherface was developed using linear discriminant analysis (LDA) for dimensionality reduction; it can identify faces under different illumination conditions, which was an issue in the Eigenface method. In 2001, Viola and Jones introduced a face detection technique using Haar cascades and AdaBoost. In 2007, a face recognition technique was developed by Naruniec and Skarbek using Gabor jets, which are similar to mammalian eyes. In this project, Haar cascades are used for face detection, and Eigenface and Fisherface are used for face recognition.

6.2.3 Current Scenario

Users have very little knowledge of the technologies and algorithms available
to them in machine learning and deep learning. Moreover, even if users are aware
of an algorithm, its proper usage still needs to be clarified. If a model is trained
with an incorrect data set, or the wrong model is used for a given purpose, it will
never give the proper and required result. Users may also apply algorithms that
under-fit or over-fit the model, consuming more time and power with no guarantee
of the required result.


6.2.4 Proposed System

Our system will help users first understand the algorithms available for their
requirements. Users do not have to panic about the successful implementation
of their model within the given time. According to the user's needs, the correct
data set will be collected, and if users have their own data set to work on, they
will be provided with the best algorithm for that data set. The model will be
trained with the best possible algorithm, and cases of over-fitting and
under-fitting will be avoided by using the correct data set with the correct
algorithm.

6.2.5 Features of Proposed System

Geometry-based methods use specialised edge and contour detectors to find
the locations of a set of facial landmarks and to measure relative positions and
distances between them. Geometry-based methods are faster and need less
memory. They rely on the geometrical relationships between facial landmarks,
in other words the spatial configuration of facial features: the main geometrical
features of the face, such as the eyes, nose and mouth, are first located, and
faces are then classified on the basis of various geometrical distances and angles
between those features.

We provide users with three different algorithms, namely Support Vector
Machine (SVM), Principal Component Analysis (PCA) and Linear Discriminant
Analysis (LDA). Each has its own accuracy and its own way of training and
predicting. We have used Eigenface and Fisherface, hence providing a fast and
accurate model with low memory consumption.

6.3 Project Scope


This is a complete study of the different geometry-based algorithms to cope with the
current Industry 4.0. After seeing the results, the user will have the choice to pick the
best algorithm for their project instead of blindly working with an algorithm without
knowing its accuracy and precision. The model is trained and compared with high
precision using different geometry-based algorithms. The user can compare the
accuracy on their own data set or on an already available data set. Comparison is easy
for the user and saves them time and money, since the user can work on other
required things instead of implementing and testing every algorithm.

6.4 Future Scope

At the initial stage, the project compares and shows results based on face
recognition only. As future scope, the project can be implemented for different
biometrics such as fingerprints or the eye retina. We do not have to start the project
from scratch to implement and train models for other biometrics; instead, small
changes and retraining the model with a new data set according to the user's
requirements will give the user a clear idea about the model.

6.5 Organization of the Project

The purpose of this report is to be helpful to both the reader and the writer.
Being logically organised, the report is able to deliver the whole content along
with the concept and purpose of the project. Its aim is to convey the features,
agenda and vision of the project along with its flow. We have tried to explain it
briefly, with enough information to distinguish it from other existing projects.


CHAPTER VII

FACE DETECTION & FACE RECOGNITION

7.1 Introduction

The following is a report on the mini project for robotic visual perception
and autonomy. It involved building a system for face detection and face recognition
using several classifiers available in the Open Computer Vision library (OpenCV). Face
recognition is a non-invasive identification system and is faster than other systems, since
multiple faces can be analysed at the same time. The difference between face detection
and identification is that face detection identifies a face in an image and locates it,
while face recognition makes the decision "whose face is it?" using an image
database. In this project both are accomplished using different techniques, which are
described below. The report begins with a brief history of face recognition. This is
followed by an explanation of Haar cascades and the Eigenface, Fisherface and Local
binary pattern histogram (LBPH) algorithms. Next, the methodology and the results of
the project are described, together with a discussion of the challenges and their
resolutions. Finally, a conclusion is provided on the pros and cons of each algorithm and
possible implementations.

7.2 Face Detection using Haar-Cascades

A Haar wavelet is a mathematical function that produces square-shaped waves with a
beginning and an end, and is used to create box-shaped patterns to recognise signals
with sudden transitions. An example is shown in figure 1. By combining several
wavelets, a cascade can be created that can identify edges, lines and circles with
different colour intensities. These sets were used in the Viola-Jones face detection
technique in 2001, and since then more patterns have been introduced [10] for object
detection, as shown in figure 1.


To analyse an image using Haar cascades, a scale smaller than the target image is
selected. It is then placed on the image, and the average of the pixel values in each
section is taken. If the difference between two values passes a given threshold, it is
considered a match. Face detection on a human face is performed by matching a
combination of different Haar-like features, for example the contrast of the forehead,
eyebrows and eyes, as well as of the nose with the eyes, as shown below in figure 2.
A single classifier is not accurate enough, so several classifiers are combined to
provide an accurate face detection system, as shown in the block diagram below in
figure 3.

Figure 1: A Haar wavelet and resulting Haar-like features.

In this project, a similar method is used effectively by identifying faces and eyes in
combination, resulting in better face detection. Similarly, in the Viola-Jones
method [7], several classifiers were combined to create stronger classifiers.
AdaBoost is a machine learning algorithm that tests several weak classifiers on a
selected location and chooses the most suitable [7]. It can also reverse the direction
of a classifier and get better results if necessary [7]. Furthermore, weight-update
steps can be updated only on misses to get better performance.

Figure 2: Several Haar-like features matched to the features of the author's face.


The cascade is scaled by 1.25 and re-iterated in order to find faces of different sizes.
Running the cascade on an image using conventional loops takes a large amount of
computing power and time. Viola and Jones [7] used a summed-area table (an
integral image) to compute the matches quickly. First developed in 1984 [11], it
became popular after 2001 when Viola and Jones implemented Haar cascades for
face detection. Using an integral image enables matching features with a single pass
over the image.
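A small sketch of the summed-area table idea, assuming NumPy (an illustration, not the project's detection code): once the integral image is built, the sum of any rectangular region can be read off with at most four lookups.

# Sketch of an integral image (summed-area table) using NumPy.
import numpy as np

img = np.arange(16, dtype=np.int64).reshape(4, 4)   # toy 4 x 4 "image"

# Cumulative sums along both axes: ii[y, x] = sum of img[0:y+1, 0:x+1].
ii = img.cumsum(axis=0).cumsum(axis=1)

def region_sum(ii, y0, x0, y1, x1):
    """Sum of img[y0:y1+1, x0:x1+1] via inclusion-exclusion on four corners."""
    total = ii[y1, x1]
    if y0 > 0:
        total -= ii[y0 - 1, x1]
    if x0 > 0:
        total -= ii[y1, x0 - 1]
    if y0 > 0 and x0 > 0:
        total += ii[y0 - 1, x0 - 1]
    return total

# Matches the direct sum, but costs O(1) per rectangle after the one-off build.
assert region_sum(ii, 1, 1, 3, 2) == img[1:4, 1:3].sum()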

Figure 3: Haar-cascade flow chart

7.3 Face Recognition

The following sections describe the face recognition algorithms Eigenface,
Fisherface and Local binary pattern histogram, and how they are implemented in
OpenCV.

7.3.1 Eigenface

Eigenface is based on PCA, which classifies images and extracts features using a set
of images. It is important that the images are under the same lighting conditions and
that the eyes align in each image. Also, the images used in this method must contain
the same number of pixels and be in grayscale. For this example, consider an image
with n x n pixels as shown in figure 4. Each row is concatenated to create a vector,
resulting in a 1 x n² matrix. All the images in the dataset are stored in a single matrix,
with one column per image. The matrix is averaged (normalised) to get an average
human face. By subtracting the average face from each image vector, the features
unique to each face are computed. In the resulting matrix, each column is a
representation of the difference between each face and the average human face. A
simplified illustration can be seen in figure 4.

Figure 4: Pixels of the image are reordered to perform calculations for Eigenface

The next step is computing the covariance matrix from the result. To obtain the
eigenvectors from the data, eigen-analysis is performed using principal component
analysis. In the result, where the covariance matrix is diagonal, the direction with the
highest variance is considered the first eigenvector; the second eigenvector is the
direction of the next highest variance and is at 90 degrees to the first; the third is the
direction of the next highest variance after that, and so on. Each column, considered
as an image and visualised, resembles a face; these are called Eigenfaces. When a
face needs to be recognised, the image is imported and resized to match the
dimensions of the training data as mentioned above. By projecting the extracted
features onto each of the Eigenfaces, weights can be calculated. These weights
correspond to the similarity between the features extracted from the input image and
the features extracted from the different image sets in the dataset. The input image
can be identified as a face by comparing it with the whole dataset, and by comparing
it with each subset it can be identified as belonging to a particular person. By
applying a threshold, detection and identification can be controlled to eliminate
false detections and recognitions. PCA is sensitive to large numbers and assumes that
the subspace is linear. If the same face is analysed under different lighting conditions,
the values mix when the distribution is calculated and cannot be effectively
classified. This means that different lighting conditions pose a problem in matching
the features, as they can change dramatically.
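To make the computation concrete, the following is a minimal NumPy sketch of the Eigenface idea; the tiny random "dataset" merely stands in for real face images.

# Minimal Eigenface-style PCA sketch with NumPy (illustrative data only).
import numpy as np

rng = np.random.default_rng(0)
n, k = 8, 10                       # 8 x 8-pixel "faces", 10 training images
images = rng.random((k, n * n))    # each row is one flattened image vector

mean_face = images.mean(axis=0)    # the "average human face"
diffs = images - mean_face         # difference of each face from the mean

# Turk-Pentland trick: eigen-decompose the small k x k matrix instead of
# the huge (n^2 x n^2) covariance matrix.
eigvals, eigvecs = np.linalg.eigh(diffs @ diffs.T)

# Map back to image space and order by decreasing variance: the Eigenfaces.
eigenfaces = (diffs.T @ eigvecs)[:, ::-1]
eigenfaces /= np.linalg.norm(eigenfaces, axis=0)

# Weights of a new image: project its difference from the mean onto each Eigenface.
new_image = rng.random(n * n)
weights = eigenfaces.T @ (new_image - mean_face)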

7.3.2 Fisherface

The Fisherface technique builds upon Eigenface and is based on LDA, derived from
Ronald Fisher's linear discriminant technique used for pattern recognition. However,
it uses class labels as well as data point information [6]. When reducing
dimensions, PCA looks at the greatest variance, while LDA, using the labels, looks
for a dimension such that, when you project onto it, you maximise the difference
between the means of the classes normalised by their variance [6]. LDA
maximises the ratio of the between-class scatter to the within-class scatter matrices.
Because of this, different lighting conditions in images have a limited effect on the
classification process when using the LDA technique. Eigenface maximises
variation, while Fisherface maximises the mean distance between different classes
and minimises variation within classes. This enables LDA to differentiate between
feature classes better than PCA, as can be observed in figure 5 [12]. Furthermore, it
takes less space and is the fastest algorithm in this project. Because of this, PCA is
more suitable for representing a set of data, while LDA is more suitable for
classification.

Figure 5: The first components of PCA and LDA. Classes in PCA look more
mixed than in LDA
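The contrast can be sketched with scikit-learn on toy data (an illustration under assumed data, not the project's faces): PCA picks the direction of greatest variance regardless of labels, while LDA uses the labels to separate the classes.

# Toy comparison of 1-D PCA vs LDA projections (scikit-learn; illustrative data).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(1)
# Two classes that overlap along the high-variance axis but differ along the other.
class0 = rng.normal([0.0, 0.0], [5.0, 0.5], size=(50, 2))
class1 = rng.normal([0.0, 2.0], [5.0, 0.5], size=(50, 2))
X = np.vstack([class0, class1])
y = np.array([0] * 50 + [1] * 50)

X_pca = PCA(n_components=1).fit_transform(X)                      # ignores labels
X_lda = LinearDiscriminantAnalysis(n_components=1).fit_transform(X, y)

def separation(Z):
    """Distance between class means, normalised by the overall spread."""
    return abs(Z[y == 0].mean() - Z[y == 1].mean()) / Z.std()

# LDA's projection separates the class means far better than PCA's.
print("PCA:", separation(X_pca), "LDA:", separation(X_lda))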


7.3.3 Local Binary Pattern Histogram

Local binary patterns were proposed as classifiers in computer vision in 1990 by
Li Wang [4]. The combination of LBP with histograms of oriented gradients,
introduced in 2009, increased its performance on certain datasets [5]. For feature
encoding, the image is divided into cells (4 x 4 pixels). Moving in a clockwise or
counter-clockwise direction, the surrounding pixel values are compared with the
central one, as shown in figure 6. The intensity or luminosity of each neighbour is
compared with that of the centre pixel: depending on whether the difference is
higher or lower than 0, a 1 or a 0 is assigned to the location. The result is an 8-bit
value for the cell. The advantage of this technique is that even if the luminosity of
the image is changed, as in figure 7, the result is the same as before.

Figure 6: Local binary pattern histogram generating an 8-bit number

Histograms are used over larger cells to find the frequency of occurrence of the
values, making the process faster. By analysing the results within a cell, edges can
be detected where the values change. By computing the values of all cells and
concatenating the histograms, feature vectors can be obtained. Images can be
classified by processing them with an ID attached. Input images are classified using
the same process and compared with the dataset, and a distance is obtained. By
setting a threshold, it can be determined whether the face is known or unknown.
Eigenface and Fisherface compute the dominant features of the whole training set,
while LBPH analyses them individually.


Figure 7: The results are the same even if the brightness is changed
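A minimal sketch of the LBP code for a single 3 x 3 cell, assuming NumPy (an illustration rather than the OpenCV implementation), shows why a uniform brightness shift leaves the result unchanged:

# Sketch: 8-bit local binary pattern of one centre pixel (NumPy).
import numpy as np

def lbp_value(cell):
    """cell: 3 x 3 grayscale patch; returns the 8-bit LBP code of its centre."""
    centre = cell[1, 1]
    # Clockwise neighbours, starting from the top-left pixel.
    neighbours = [cell[0, 0], cell[0, 1], cell[0, 2], cell[1, 2],
                  cell[2, 2], cell[2, 1], cell[2, 0], cell[1, 0]]
    bits = [1 if value >= centre else 0 for value in neighbours]
    return sum(bit << i for i, bit in enumerate(bits))

patch = np.array([[90, 80, 70],
                  [95, 85, 60],
                  [99, 88, 75]])
# Adding a constant to every pixel changes no comparison, so the code is identical.
assert lbp_value(patch) == lbp_value(patch + 40)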

7.4 Methodology

Below are the methodology and descriptions of the applications used for data
gathering, face detection, training and face recognition. The project was coded in
Python using a mixture of the IDLE and PyCharm IDEs.

7.4.1 Face Detection

The first stage was creating a face detection system using Haar cascades. Although
training is required for creating new Haar cascades, OpenCV provides a robust set of
Haar cascades that was used for the project. Using face cascades alone caused random
objects to be identified, so eye cascades were incorporated to obtain stable face
detection. The flowchart of the detection system can be seen in figure 8.

Figure 8: The flow chart of the face detection application

Face and eye classifier objects are created using the classifier class in OpenCV, by
calling cv2.CascadeClassifier() and loading the respective XML files. A camera
object is created using cv2.VideoCapture() to capture images. Using
CascadeClassifier.detectMultiScale(), objects of various sizes are matched and their
locations are returned. Using the location data, the face is cropped for further
verification. The eye cascade is used to verify that there are two eyes in the cropped
face. If satisfied, a marker is placed around the face to indicate that a face was
detected at that location.
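A condensed sketch of this loop is shown below; the cascade XML files are the ones shipped with OpenCV, while the camera index and drawing details are illustrative.

# Sketch of the face + eye detection loop with OpenCV (details illustrative).
import cv2

face_cascade = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")
eye_cascade = cv2.CascadeClassifier("haarcascade_eye.xml")
cam = cv2.VideoCapture(0)

while True:
    ok, frame = cam.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.25, minNeighbors=5)
    for (x, y, w, h) in faces:
        roi = gray[y:y + h, x:x + w]          # crop the candidate face
        eyes = eye_cascade.detectMultiScale(roi)
        if len(eyes) >= 2:                    # verify with the eye cascade
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("faces", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cam.release()
cv2.destroyAllWindows()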

7.4.2 Face Recognition Process

For this project, three algorithms are implemented independently: Eigenface,
Fisherface and Local binary pattern histograms. All three can be implemented
using OpenCV libraries. There are three stages to the face recognition process, as
follows:

1. Collecting images and IDs
2. Extracting unique features, classifying them and storing them in XML files
3. Matching the features of an input image to the features in the saved XML files
   and predicting the identity

Collecting the image data

Collecting classification images is usually done manually, using photo-editing
software to crop and resize photos. Furthermore, PCA and LDA require the same
number of pixels in all images for correct operation. This time-consuming and
laborious task is automated through an application that collects 50 images with
different expressions. The application detects suitable expressions every 300 ms,
straightens any existing tilt and saves the images. The flow chart for the application
is shown in figure 9.


Figure 9: The Flowchart for the image collection

The application starts with a request for a name, which is stored with an ID in
a text file. The face detection system forms the first half. However, before
capturing begins, the application checks the brightness level and will capture only
if the face is well illuminated. Furthermore, after a face is detected, the positions of
the eyes are analysed; if the head is tilted, the application automatically corrects the
orientation. These two additions were made considering the requirements of the
Eigenface algorithm. The image is then cropped and saved using the ID as a filename
so it can be identified later. A loop runs this program until 50 viable images are
collected from the person. This application made data collection efficient.
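A simplified sketch of the collection loop follows; the file-name scheme, brightness threshold and resize dimensions are illustrative, and the tilt-correction step is omitted for brevity.

# Simplified sketch of the 50-image collection loop (details illustrative).
import time
import cv2

face_cascade = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")
cam = cv2.VideoCapture(0)
user_id, count = 1, 0

while count < 50:
    ok, frame = cam.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    if gray.mean() < 60:                  # crude brightness check: skip dark frames
        continue
    for (x, y, w, h) in face_cascade.detectMultiScale(gray, 1.25, 5):
        count += 1
        face = cv2.resize(gray[y:y + h, x:x + w], (200, 200))
        cv2.imwrite("dataset/user.%d.%d.png" % (user_id, count), face)
        time.sleep(0.3)                   # roughly 300 ms between captures

cam.release()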

Training the Classifiers

OpenCV enables the creation of XML files to store features extracted from datasets
using the FaceRecognizer class. The stored images are imported, converted to
grayscale and saved with IDs in two lists with matching indexes. FaceRecognizer
objects are created using the face recogniser class. Each recogniser can take the
parameters described below:


cv2.face.createEigenFaceRecognizer()
1. Takes the number of components for the PCA used to create Eigenfaces. The
   OpenCV documentation mentions that 80 can provide satisfactory reconstruction
   capabilities.
2. Takes the threshold for recognising faces. If the distance to the likeliest
   Eigenface is above this threshold, the function will return -1, which can be used
   to state that the face is unrecognisable.

cv2.face.createFisherFaceRecognizer()
1. The first argument is the number of components for the LDA used to create
   Fisherfaces. OpenCV recommends keeping it at 0 if uncertain.
2. A threshold similar to the Eigenface threshold; -1 is returned if the threshold
   is passed.

cv2.face.createLBPHFaceRecognizer()
1. The radius from the centre pixel used to build the local binary pattern.
2. The number of sample points used to build the pattern. A large number will
   slow down the computer.
3. The number of cells to be created along the X axis.
4. The number of cells to be created along the Y axis.
5. A threshold value similar to those of Eigenface and Fisherface; if the threshold
   is passed, the object will return -1.

Recogniser objects are created, and the images are imported, resized, converted into
NumPy arrays and stored in a vector. The ID of each image is obtained by splitting
the file name and is stored in another vector. By using
FaceRecognizer.train(NumpyImage, ID), all three objects are trained. It must be
noted that resizing the images was required only for Eigenface and Fisherface, not
for LBPH. Next, the configured model is saved as an XML file using
FaceRecognizer.save(FileName). In this project, all three are trained and saved
through one application for convenience. The flow chart for the trainer is shown in
figure 10.

Figure 10: Flowchart of the training application
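A sketch of the training step is given below. The factory names follow the OpenCV 3.x face contrib module that this report's API calls suggest; the dataset folder and the ID-parsing scheme are illustrative.

# Sketch: train and save all three recognisers (OpenCV 3.x 'face' module assumed).
import os
import cv2
import numpy as np

def load_dataset(folder="dataset"):
    images, ids = [], []
    for name in os.listdir(folder):
        img = cv2.imread(os.path.join(folder, name), cv2.IMREAD_GRAYSCALE)
        images.append(cv2.resize(img, (200, 200)))   # needed for Eigen/Fisherface
        ids.append(int(name.split(".")[1]))          # file names like user.<id>.<n>.png
    return images, np.asarray(ids)

images, ids = load_dataset()
recognisers = [(cv2.face.createEigenFaceRecognizer(80), "eigen.xml"),
               (cv2.face.createFisherFaceRecognizer(0), "fisher.xml"),
               (cv2.face.createLBPHFaceRecognizer(), "lbph.xml")]
for recogniser, filename in recognisers:
    recogniser.train(images, ids)    # FaceRecognizer.train(NumpyImage, ID)
    recogniser.save(filename)        # FaceRecognizer.save(FileName)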

The Face Recognition

A face recogniser object is created using the desired parameters. The face detector is
used to detect faces in the image, which are cropped and passed on to be recognised;
this uses the same technique as in the image capture application. For each face
detected, a prediction is made using FaceRecognizer.predict(), which returns the ID
of the class and a confidence value. The process is the same for all algorithms, and if
the confidence is higher than the set threshold, the ID is -1. Finally, the names from
the text file with IDs are used to display the name and confidence on the screen. If
the ID is -1, the application prints "unknown face" without the confidence level. The
flow chart for the application is shown in figure 11.

Figure 11: Flowchart of the face recognition application
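A sketch of the recognition step, under the same OpenCV 3.x assumptions (the names dictionary stands in for the text file of IDs, and the test image path is illustrative):

# Sketch of the prediction step (OpenCV 3.x 'face' module assumed).
import cv2

recogniser = cv2.face.createLBPHFaceRecognizer()
recogniser.load("lbph.xml")                      # model saved by the trainer
names = {1: "Alice", 2: "Bob"}                   # normally read from the text file

face_cascade = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")
gray = cv2.cvtColor(cv2.imread("test.png"), cv2.COLOR_BGR2GRAY)

for (x, y, w, h) in face_cascade.detectMultiScale(gray, 1.25, 5):
    face = cv2.resize(gray[y:y + h, x:x + w], (200, 200))
    label, confidence = recogniser.predict(face)  # returns class ID and confidence
    name = names.get(label, "unknown face") if label != -1 else "unknown face"
    print(name, confidence)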


CHAPTER VIII

CONCLUSION

In the near future, the world is set to witness tremendous growth in smart apps,
virtual assistants, and widespread use of artificial intelligence. The mobile
marketplace will expand through machine learning, and we will soon enter the era of
self-driving vehicles (they have already been launched for testing and trials).
Machine learning is already an extremely powerful tool that has been solving
complicated problems. Although new machine learning tools will appear now and
then, the skills required to tune and adapt them will always be in demand. Regarding
job opportunities, machine learning has a significant role to play; there is hardly an
aspect of life on which machine learning has not left its mark. As the quantity of data
proliferates, the need for engineers and scientists has increased and will keep
growing. Personnel will be required to understand and manage the subtleties and
pitfalls of machine learning, because what appears to be a well-tuned, simple system
is capable of leading you astray from your desired results. Companies and industries
rely heavily on machine learning, and so there is incredible opportunity in the field.
The demand for machine learning engineers will keep growing, and you can get in
on the action. Companies such as Google, Quora, and Facebook hire people who
understand machine learning, and there is intensive research taking place in machine
learning at the top universities of the world. There is no upper limit to the earnings of
machine learning experts at the top companies. Machine learning is a harbinger of
potential growth for people and the economy. So far we have merely scratched the
surface; there is much more that machine learning has yet to achieve and introduce.
There is hardly any application for which machine learning cannot be used for
detection and prediction. Despite contradictory perspectives, it is assured that in the
future the gap between demand and supply in data science and machine learning
skills can only be bridged by providing a workforce that can deal with machine
learning's intricacies, given the advantages of machine learning. Companies will
plunge into tapping algorithmic models that can improve their operations and
customer-facing functions, and these algorithms will take business to whole new
levels. We have already seen how technology has replaced human beings in the
financial markets and many other areas for the better, taking the load of bulky,
labour-intensive work off human shoulders. It therefore would not be wrong to say
that machine learning has a bright future ahead that will help human beings enter a
new, transformed era. Over time, more dramatic evolution will bring in more
positive changes.


REFERENCES

1. Jayesh Bapu Ahire, "The Artificial Neural Networks", August 24, 2018.
   https://www.datasciencecentral.com/profiles/blogs/the-artificial-neural-networks
2. Stephen DeAngelis, "Machine Learning: A Short Primer", November 4, 2014.
   https://www.enterrasolutions.com/blog/machine-learning-short-primer/; see also
   http://fgiasson.com/blog/index.php/2017/03/10/a-machine-learning-workflow/
3. Mariane Davids, "Applications of Machine Learning", January 9, 2017.
   https://blog.robotiq.com/5-applications-of-machine-learning
4. Bernard Marr, "A Short History of Machine Learning", February 19, 2016.
   https://www.forbes.com/sites/bernardmarr/2016/02/19/a-short-history-of-machine-learning-every-manager-should-read/#74ed2c0815e7
5. Arvind Narayanan, "Language necessarily contains human biases, and so will
   machines trained on language corpora", Freedom to Tinker, August 24, 2016.
