Minor Project Seminar Report
On
Biometric Traits Evaluation
Submitted To: Mr. Loveleen Kumar, Assistant Professor
Submitted By: Harshit Kapoor (16EGJCS073), Kapil Kokcha (16EGJCS088), Lakshya Sharma (16EGJCS100)
CANDIDATE DECLARATION
We hereby declare that the work presented in the minor project entitled “Biometric Traits Evaluation”, in partial fulfillment of the requirements for the award of the degree of Bachelor of Technology in Computer Science and Engineering, submitted to the Department of Computer Science & Engineering, Global Institute of Technology, Rajasthan Technical University, is a record of our own investigations carried out under the guidance of Shri Loveleen Kumar, Department of Computer Science and Engineering, Global Institute of Technology.
We have not submitted the matter presented in this project anywhere else for the award of any other degree.
Harshit Kapoor (16EGJCS073)
Kapil Kokcha (16EGJCS088)
Lakshya Sharma (16EGJCS100)
Global Institute of Technology, Jaipur
Counter Signed By
Mr. Loveleen Kumar
Assistant Professor
Department of Computer Science and Engineering
Global Institute of Technology, Jaipur
ACKNOWLEDGEMENT
It is our pleasure to be indebted to the various people who directly or indirectly contributed to the development of this work and who influenced our thinking, behavior, and actions during the course of its completion. This formal piece of acknowledgement is an attempt to express our gratitude towards the people who helped us in the successful completion of this project.
We would like to express our deepest gratitude to Mr. Loveleen Kumar, our project guide, for his guidance, precious time, and necessary advice. He was always there with his competent support and valuable suggestions throughout the development phase of the project.
We would also like to thank the supporting staff of the Department of CSE (GIT) and our parents, who supported us at every step, guided us, inspired us, and provided us all facilities so that we could achieve our goals.
Harshit Kapoor (16EGJCS073)
Kapil Kokcha (16EGJCS088)
Lakshya Sharma (16EGJCS100)
IV Year, VII Semester
Computer Science & Engineering
Global Institute of Technology, Jaipur
ABSTRACT
“Biometric Traits Evaluation” was built with the objective of studying and identifying the best algorithm for the growing Industry 4.0, helping users work efficiently and accurately. The project is built around four different algorithms (one inbuilt, three studied) and brings together various technologies such as machine learning and Python. Users are also provided with supporting services to help them choose an algorithm for their work: they can compare the algorithms according to their needs, either on an already available data set or on their own personal data set. The system is built using different machine learning algorithms, which makes it highly reliable. The algorithms are compared on the basis of accuracy and various other factors. The approach is not limited to faces; it can also work on other biometrics such as fingerprints or the eye retina. In all, it can be concluded that this comparison of different algorithms can save users considerable time, money, and effort, while also providing a sense of security and satisfaction.
The Machine Learning field evolved from the broad area of Artificial Intelligence, which aims to mimic the intelligent abilities of humans by means of machines. In the field of Machine Learning one considers the crucial question of how to make machines able to “learn”. Learning in this context is understood as inductive inference, where one observes examples that represent incomplete information about some “statistical phenomenon”. In unsupervised learning one typically tries to uncover hidden regularities (e.g. clusters) or to detect anomalies in the data (for instance some unusual machine function or a network intrusion). In supervised learning, there is a label associated with each example. The label is supposed to be the answer to a question about the example. If the label is discrete, then the task is called a classification problem; otherwise, for real-valued labels, we speak of a regression problem. Based on these examples (including the labels), one is particularly interested in predicting the answer for other cases before they are explicitly observed. Hence, learning is not only a question of remembering but also of generalization to unseen cases.
Machine Learning research has been extremely active in the last few years. The result is a large number of very accurate and efficient algorithms that are quite easy to use for a practitioner. It seems worthwhile and almost mandatory for (computer) scientists and engineers to learn how and where Machine Learning can help to automate tasks or provide predictions where humans have difficulty comprehending large amounts of data. The long list of examples where Machine Learning techniques have been successfully applied includes: text classification and categorization (for example spam filtering), network intrusion detection, bioinformatics (e.g. cancer tissue classification, gene finding), monitoring of electric appliances, optimization of hard disk caching strategies, high-energy physics particle classification, recognition of handwriting, natural scene analysis, and so forth. Some other algorithms include regression algorithms (e.g. ridge regression, regression trees), unsupervised learning algorithms (such as clustering, principal component analysis), reinforcement learning, online learning algorithms, and model-selection problems. Some of these techniques extend the applicability of Machine Learning algorithms considerably and would each require an introduction of their own.
TABLE OF CONTENTS

ACKNOWLEDGEMENT
CANDIDATE DECLARATION
ABSTRACT
LIST OF CONTENTS
LIST OF FIGURES
CHAPTER 1: INTRODUCTION
1.1 Differences between Data Mining, Machine Learning and Deep Learning
1.2 Need of Machine Learning
1.3 Goals of Machine Learning
1.4 Why the Goals of ML are Important or Desirable
1.5 Uses of Machine Learning
CHAPTER 2: MACHINE LEARNING WORKFLOW
2.1 Data Processing
2.2 Training Sets Creation
2.3 Machine Learning Algorithms Testing, Evaluation and Selection
2.4 Deployment and A/B Testing
CHAPTER 3: APPLICATIONS OF MACHINE LEARNING
3.1 Uses of Machine Learning
3.2 Prominent Sectors Using ML
CHAPTER 4: METHODS OF CLASSIFICATION
4.1 Common Techniques in Data Classification
CHAPTER 5: CLASSIFIERS AND MODELS
5.1 Decision Tree Based Methods
5.2 Linear Regression Based Methods
5.3 Neural Network
5.4 Bayesian Network
5.5 Support Vector Machine
5.6 Nearest Neighbor
CHAPTER 6: PROBLEM STATEMENT
CHAPTER 7: IMPLEMENTATION
CHAPTER 8: CONCLUSION
REFERENCES
LIST OF FIGURES
CHAPTER I
INTRODUCTION
Because of new computing technologies, machine learning today is not like machine
learning of the past. It was born from pattern recognition and the theory that
computers can learn without being programmed to perform specific tasks; researchers
interested in artificial intelligence wanted to see if computers could learn from data.
The iterative aspect of machine learning is important because as models are exposed
to new data, they are able to independently adapt. They learn from previous
computations to produce reliable, repeatable decisions and results. It’s a science that’s
not new – but one that has gained fresh momentum.
While many machine learning algorithms have been around for a long time, the ability
to automatically apply complex mathematical calculations to big data – over and over,
faster and faster – is a recent development. Here are a few widely publicized examples
of machine learning applications you may be familiar with:
• The heavily hyped, self-driving Google car? The essence of machine learning.
• Online recommendation offers such as those from Amazon and Netflix? Machine
learning applications for everyday life.
• Knowing what customers are saying about you on Twitter? Machine learning
combined with linguistic rule creation.
• Fraud detection? One of the more obvious, important uses in our world today.
Machine Learning is the idea of learning from examples and experience, without being explicitly programmed. Instead of writing code, you feed data to a generic algorithm, and it builds its own logic based on the data given.
Although all of these methods have the same goal – to extract insights, patterns and
relationships that can be used to make decisions – they have different approaches and
abilities.
Data Mining
Machine Learning
The main difference with machine learning is that just like statistical models, the goal
is to understand the structure of the data – fit theoretical distributions to the data that
are well understood. So, with statistical models there is a theory behind the model that
is mathematically proven, but this requires that data meets certain strong assumptions
too. Machine learning has developed based on the ability to use computers to probe
the data for structure, even if we do not have a theory of what that structure looks like.
The test for a machine learning model is a validation error on new data, not a
theoretical test that proves a null hypothesis. Because machine learning often uses an
iterative approach to learn from data, the learning can be easily automated. Passes are
run through the data until a robust pattern is found.
Deep Learning
Deep learning combines advances in computing power and special types of neural
networks to learn complicated patterns in large amounts of data. Deep learning
techniques are currently state of the art for identifying objects in images and words in
sounds. Researchers are now looking to apply these successes in pattern recognition to
more complex tasks such as automatic language translation, medical diagnoses and
numerous other important social and business problems.
So machine learning was developed as a new capability for computers. Today machine learning is present in so many segments of technology that we do not even realize it while using it.
The purpose of ML, in simple words, is to understand the nature of (human and other forms of) learning, and to build learning capability into computers. To be more precise, there are three facets of the goals of ML.
• To make computers smarter, more intelligent. The more direct goal in this aspect is to develop systems (programs) for specific practical learning tasks in application domains.
• To develop computational models of the human learning process and implement computer simulations.
Machine learning has several practical uses that drive the kind of real business results – such as time and money savings – that have the potential to dramatically affect the future of your organization. Research and surveys in particular show tremendous impact occurring within the customer care industry, where machine learning is allowing people to get things done more quickly and efficiently.
Machine learning has been used for image, video, and text recognition, as well as serving as the power behind recommendation engines. Today, it is being used to strengthen cyber security, ensure public safety, and improve medical outcomes. It can also help improve customer service and make cars safer.
To better understand the uses of machine learning, consider some of the instances where machine learning is applied: the self-driving Google car, cyber fraud detection, and online recommendation engines – like friend suggestions on Facebook, Netflix showcasing the movies and shows you might like, and “more items to consider” and “get yourself a little something” on Amazon – are all examples of applied machine learning. All these examples clearly show the essential role machine learning has started to take in today's data-rich world. Machines can help in filtering the useful pieces of information that drive major developments, and we are already seeing how this technology is being applied in a wide range of industries.
CHAPTER II
MACHINE LEARNING WORKFLOW
Machine learning uses algorithms that learn from data to help make better decisions; however, it is not always obvious which machine learning algorithm will be best for a particular problem. Luckily, statistics such as variable importance and model evaluation tools can help us decide which machine learning techniques to use.
• Supervised machine learning algorithms can apply what has been learned in the
past to new data using labeled examples to predict future events. Starting from the
analysis of a known training dataset, the learning algorithm produces an inferred
function to make predictions about the output values. The system is able to
provide targets for any new input after sufficient training. The learning algorithm
can also compare its output with the correct, intended output and find errors in
order to modify the model accordingly.
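As a minimal sketch of this supervised loop (assuming the scikit-learn library and its bundled iris dataset, neither of which is part of this project's own code), a model is fitted on labeled examples and its outputs are compared with the intended labels:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Labeled examples: feature vectors X with known target labels y.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)              # learn from the labeled training set

# Compare predictions with the correct, intended outputs to find errors.
errors = (model.predict(X_test) != y_test).sum()
print("misclassified:", errors, "of", len(y_test))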
The quality and quantity of the data you collect will directly determine how good your predictive model can be. Data preparation is where we load our data into a suitable place and prepare it for use in our machine learning training. This is also a good time to do any relevant visualization of your data, to help you see whether there are any useful relationships between different variables you can take advantage of, as well as to show you whether there are any data imbalances. We will also need to divide the data into two parts. The first part, used in training our model, will be the majority of the dataset. The second part will be used for evaluating our trained model's performance.
The model is initially fit on a training dataset, which is a set of examples used to fit the parameters of the model. The model is trained on the training dataset using a supervised learning method. In practice, the training dataset often consists of pairs of an input vector (or scalar) and the corresponding output vector (or scalar), which is typically denoted as the target (or label). The current model is run with the training dataset and produces a result, which is then compared with the target for each input vector in the training dataset. Based on the outcome of the comparison and the specific learning algorithm being used, the parameters of the model are adjusted. The model fitting can include both variable selection and parameter estimation.
The next step in our workflow is choosing a model. There are many models that researchers and data scientists have created over the years. Some are very well suited to image data, others to sequences (like text or music), some to numerical data, and others to text-based data. It is only at this point that data scientists start testing algorithms to build different models, to evaluate them, and to fine-tune the hyper-parameters so that the best model(s) are selected. In applied machine learning, individual algorithms need to be swapped in and out depending on which performs best for the problem and the dataset. Therefore, we focus on intuition and practical benefits over math and theory.
Once training is complete, it is time to see if the model is any good, using evaluation. This is where the dataset that we set aside earlier comes into play. Evaluation allows us to test our model against data that has never been used for training. This lets us see how the model may perform against data that it has not yet seen, and is meant to be representative of how the model might perform in the real world. A good rule of thumb is a training-evaluation split somewhere on the order of 80/20 or 70/30. Much of this is influenced by the size of the original source dataset: if you have a lot of data, perhaps you do not need as big a fraction for the evaluation dataset.
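A minimal sketch of such a split (plain numpy; the 80/20 ratio follows the rule of thumb above, and the data here is synthetic):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))            # synthetic feature matrix
y = rng.integers(0, 2, size=1000)         # synthetic labels

idx = rng.permutation(len(X))             # shuffle before splitting
cut = int(0.8 * len(X))                   # 80/20 training-evaluation split
train_idx, eval_idx = idx[:cut], idx[cut:]

X_train, y_train = X[train_idx], y[train_idx]
X_eval, y_eval = X[eval_idx], y[eval_idx] # held out, never used in training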
One has no idea which algorithm will work best on a problem before starting. Even professional data scientists cannot tell you. This problem is not restricted to the selection of machine learning algorithms: you also cannot know in advance which data transforms, and which features in the data if exposed, would best present the structure of the problem to the algorithms.
One may have some ideas. One may have some favored techniques. But how does one know that the techniques that got results last time will get good results this time? How does one know which techniques are transferable from one problem to another? The answer to these questions depends on the problem and the dataset. The choice of algorithm differs from one dataset to another and from one type of problem to another.
CHAPTER III
APPLICATIONS OF MACHINE LEARNING
Machine learning and data mining often employ the same methods and overlap significantly, but while machine learning focuses on prediction, based on known properties learned from the training data, data mining focuses on the discovery of (previously) unknown properties in the data (this is the analysis step of knowledge discovery in databases). Data mining uses many machine learning methods, but with different goals; on the other hand, machine learning also employs data mining methods as “unsupervised learning” or as a preprocessing step to improve learner accuracy.
Relation to optimization
Machine learning also has intimate ties to optimization: many learning problems are formulated as minimization of some loss function on a training set of examples. Loss functions express the discrepancy between the predictions of the model being trained and the actual problem instances (for example, in classification, one wants to assign a label to instances, and models are trained to correctly predict the pre-assigned labels of a set of examples). The difference between the two fields arises from the goal of generalization: while optimization algorithms can minimize the loss on a training set, machine learning is concerned with minimizing the loss on unseen samples.
Relation to statistics
Machine learning and statistics are closely related fields. According to Michael I. Jordan, the ideas of machine learning, from methodological principles to theoretical tools, have had a long pre-history in statistics. He also suggested the term data science as a placeholder to call the overall field.
Most industries working with large amounts of data have recognized the value of machine learning technology. By gleaning insights from this data – often in real time – organizations are able to work more efficiently or gain an advantage over competitors.
Financial services
Banks and other businesses in the financial industry use machine learning technology for two key purposes: to identify important insights in data, and to prevent fraud. The insights can identify investment opportunities, or help investors know when to trade. Data mining can also identify clients with high-risk profiles, or use cyber surveillance to pinpoint warning signs of fraud.
Government
Health care
Machine learning is a fast-growing trend in the health care industry, thanks to the advent of wearable devices and sensors that can use data to assess a patient's health in real time. The technology can also help medical experts analyze data to identify trends or red flags that may lead to higher-quality diagnoses and treatment.
Retail
Websites suggesting items you might like based on earlier purchases are using machine learning to analyze your shopping history. Retailers rely on machine learning to capture data, analyze it, and use it to personalize the shopping experience, implement marketing campaigns, optimize prices, plan merchandise supply, and gain customer insights.
Transportation
Analyzing data to identify patterns and trends is key to the transportation industry, which relies on making routes more efficient and predicting potential problems to increase profitability. The data analysis and modelling aspects of machine learning are important tools to delivery companies, public transportation, and other transportation organizations.
CHAPTER IV
METHODS OF CLASSIFICATION
Given a set of training data points along with associated training labels, determine the class label for an unlabeled test instance.
Numerous variations of this problem can be defined over different settings. Excellent overviews on data classification may be found in [39, 50, 63, 85].
Classification algorithms typically contain two phases:
• Training Phase: In this phase, a model is constructed from the training instances.
• Testing Phase: In this phase, the model is used to assign a label to an unlabeled test instance.
In some cases, such as lazy learning, the training phase is omitted entirely, and the classification is done directly from the relationship of the training instances to the test instance. Instance-based methods such as the nearest neighbor classifiers are examples of such a scenario. Even in such cases, a pre-processing phase such as the construction of a nearest neighbor index may be performed in order to ensure efficiency during the testing phase. The output of a classification algorithm may be presented for a test instance in one of two ways:
1. Discrete Label: In this case, a label is returned for the test instance.
2. Numerical Score: In this case, a numerical score is returned for each combination of class label and test instance. Note that the numerical score can be converted to a discrete label for a test instance by picking the class with the highest score for that test instance. The advantage of a numerical score is that it becomes possible to compare the relative propensity of different test instances to belong to a particular class of importance, and to rank them if needed. Such methods are used often in rare class detection problems, in which the original class distribution is highly imbalanced, and the discovery of some classes is more valuable than others.
The classification problem thus segments the unseen test instances into groups, as defined by the class label. While the segmentation of examples into groups is also done by clustering, there is a key difference between the two problems. In the case of clustering, the segmentation is done using similarities between the feature variables, with no prior understanding of the structure of the groups. In the case of classification, the segmentation is done on the basis of a training data set, which encodes knowledge about the structure of the groups in the form of a target variable. Thus, while the segmentations of the data are usually related to notions of similarity, as in clustering, significant deviations from the similarity-based segmentation may be achieved in practical settings. As a result, the classification problem is referred to as supervised learning, just as clustering is referred to as unsupervised learning. The supervision process often provides significant application-specific utility, because the class labels may represent important properties of interest. Some common application domains in which the classification problem arises are as follows:
• Customer Target Marketing: In such cases, feature variables describing the customer can be used to predict their buying interests on the basis of previous training examples. The target variable may encode the buying interest of the customer.
• Medical Disease Diagnosis: In recent years, the use of data mining methods in medical technology has gained increasing traction. The features may be extracted from the medical records, and the class labels correspond to whether or not a patient may pick up a disease in the future. In these cases, it is desirable to make disease predictions with the use of such data.
The first phase of virtually all classification algorithms is that of feature selection. In most data mining scenarios, a wide variety of features are collected by individuals who are often not domain experts. Clearly, irrelevant features may often result in poor modeling, since they are not well related to the class label. In fact, such features will typically worsen the classification accuracy because of overfitting, when the training data set is small and such features are allowed to be part of the training model. For example, consider a medical example in which the features from the blood work of different patients are used to predict a particular disease. Clearly, a feature such as the cholesterol level is predictive of heart disease, whereas a feature such as the PSA level is not predictive of heart disease. However, if a small training data set is used, the PSA level may show freak correlations with heart disease because of random variations. While the impact of a single variable may be small, the cumulative effect of many irrelevant features can be significant. This will result in a training model that generalizes poorly to unseen test instances.
Therefore, it is important to use the correct features during the training process. There are two broad kinds of feature selection methods:
1. Filter Models: In these cases, a crisp criterion on a single feature, or a subset of features, is used to evaluate their suitability for classification, independently of the classification algorithm.
2. Wrapper Models: In these cases, the feature selection process is embedded into a classification algorithm, in order to make the feature selection process sensitive to the classification algorithm. This approach recognizes the fact that different algorithms may work better with different features.
In order to perform feature selection with filter models, a number of different measures are used to quantify the relevance of a feature to the classification process. Typically, these measures compute the imbalance of the feature values over different ranges of the feature, which may be either discrete or numerical. Common examples of such measures include the Gini index, entropy, and the Fisher score.
Probabilistic methods
Probabilistic methods are the most fundamental among all data classification methods.
Probabilistic classification algorithms use statistical inference to find the best class for a given instance. In addition to simply assigning the best class like other classification algorithms, probabilistic classification algorithms output a corresponding posterior probability of the test instance being a member of each of the possible classes. The posterior probability is defined as the probability after observing the specific characteristics of the test instance. On the other hand, the prior probability is simply the fraction of training records belonging to each particular class, with no knowledge of the test instance. After obtaining the posterior probabilities, we use decision theory to determine class membership for each new instance. Basically, there are two ways in which we can estimate the posterior probabilities. In the first case, the posterior probability of a particular class is estimated by determining the class-conditional probability and the prior class probability separately and then applying Bayes' theorem. The most well-known among these is the Bayes classifier, which is known as a generative model.
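Written out (the standard statement of Bayes' theorem, added here for reference), the posterior probability of class C given a test instance x is:

P(C | x) = P(x | C) P(C) / P(x)

where P(x | C) is the class-conditional probability, P(C) is the prior probability of the class, and P(x) is the probability of observing the instance itself.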
CHAPTER V
CLASSIFIERS AND MODELS
5.1 Decision Tree Based Methods
The good part of a Tree is that it is very flexible in terms of the data types of the input and output variables, which can be categorical, binary, and numeric values. The level of the decision nodes also indicates the degree of influence of the different input variables. The limitation is that every decision boundary at each split point is a concrete binary decision. Also, the decision criterion considers only one input attribute at a time, not a combination of multiple input variables. Another weakness of a Tree is that once learned it cannot be updated incrementally: when new training data arrives, you have to throw away the old tree and retrain on all the data from scratch.
However, Trees combined with ensemble methods (Random Forest, Boosting Trees) address a lot of the limitations mentioned above. For example, Gradient Boosting Decision Trees consistently beat the performance of other ML models on many problems and are one of the most popular methods nowadays.
5.2 Linear Regression Based Methods
The basic assumption is that the output variable (a numeric value) can be expressed as a linear combination (weighted sum) of a set of input variables (which are also numeric values): y = w1x1 + w2x2 + w3x3 + ...
The whole objective of the training phase is to learn the weights w1, w2, ... by minimizing the error function loss(y, w1x1 + w2x2 + ...). Gradient descent is the classical technique for solving this problem, with the general idea of adjusting w1, w2, ... along the direction of the maximum gradient of the loss function.
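A minimal gradient-descent sketch for this setup (plain numpy, synthetic data; the learning rate and iteration count are illustrative choices):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))             # 100 examples, 2 numeric inputs
y = 3.0 * X[:, 0] - 2.0 * X[:, 1]         # true weights: w1 = 3, w2 = -2

w = np.zeros(2)
lr = 0.1
for _ in range(200):
    pred = X @ w
    grad = 2 * X.T @ (pred - y) / len(y)  # gradient of the mean squared error
    w -= lr * grad                        # step against the gradient

print(w)                                  # approaches [3.0, -2.0]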
The input variables are required to be numeric. A binary variable can be represented as 0, 1. For a categorical variable, each possible value can be represented as a separate binary variable (and hence as 0, 1). For the output, if it is a binary variable (0, 1), then a logit function is used to transform the range of -infinity to +infinity into 0 to 1. This is called logistic regression, and a different loss function (based on maximum likelihood) is used.
To avoid overfitting, regularization techniques (L1 and L2) are used to penalize large values of w1, w2, ... L1 works by adding the absolute value of w1 to the loss function, while L2 works by adding the square of w1 to the loss function. L1 has the property that it penalizes redundant or irrelevant features more (driving them to very small weights) and is a good tool to pick out highly influential features.
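A sketch of the two penalties on top of a squared-error loss (plain numpy; the data, weights, and lambda value are illustrative assumptions):

import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, 0.0, -1.0])
w = np.array([0.5, 0.5, -0.5])

def l2_loss(w, X, y, lam=0.1):            # ridge: squared error + L2 penalty
    return np.mean((X @ w - y) ** 2) + lam * np.sum(w ** 2)

def l1_loss(w, X, y, lam=0.1):            # lasso: squared error + L1 penalty
    return np.mean((X @ w - y) ** 2) + lam * np.sum(np.abs(w))

print(l2_loss(w, X, y), l1_loss(w, X, y))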
The strength of the linear model is that it has very high performance in both scoring and learning. The stochastic gradient descent-based learning algorithm is highly scalable and can handle incremental learning.
The weakness of the linear model is the linearity assumption on the input features, which is often false. Therefore, a significant feature engineering effort is usually required to transform each input feature, which typically involves a domain expert. Another common approach is to try different transformation functions 1/x, x^2, log(x) in the hope that one of them will have a linear relationship with the output. Linearity can be checked by observing whether the residual (y - predicted) is normally distributed or not (using a Q-Q plot against the Gaussian distribution).
5.3 Neural Network
Notice that Neural Networks expect binary input, which means we need to transform categorical input into multiple binary variables. A numeric input variable can be transformed into a binary-encoded string such as 101010. Categorical and numeric outputs can be transformed in a similar way.
5.4 Bayesian Network
It is essentially a dependency graph where each node represents a binary variable and each (directional) edge represents the dependency relationship. If Node A and Node B each have an edge to Node C, this means the probability of C being true depends on the different combinations of the Boolean values of A and B. Node C can in turn point to Node D, and then Node D depends on Node A and Node B as well.
The learning is about finding, at each node, the joint probability distribution over all incoming edges. This is done by counting the observed values of A, B, and C and then updating the joint probability distribution table at Node C. Once we have the probability distribution table at every node, we can compute the probability of any hidden node (output variable) from the observed nodes (input variables) by using Bayes' rule.
5.5 Support Vector Machine
Support Vector Machine takes numeric input and binary output. It is based on finding a linear plane with maximum margin to separate the two classes of output. Categorical input can be turned into numeric input as before, and categorical output can be modeled as multiple binary outputs.
With a different loss function, SVM can also do regression (called SVR). The strength of SVM is that it can handle a large number of dimensions. With the kernel trick, it can handle non-linear relationships as well.
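A minimal sketch of both uses (assuming scikit-learn, which is not the library used elsewhere in this project; the synthetic data and kernel choice are illustrative):

from sklearn.svm import SVC, SVR
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=200, n_features=20, random_state=0)
clf = SVC(kernel="rbf", C=1.0)            # kernel function handles non-linearity
clf.fit(X, y)
print(clf.predict(X[:5]))

# The same idea with a different loss function gives regression (SVR):
reg = SVR(kernel="rbf")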
5.6 Nearest Neighbor
Here we are not learning a model at all. The idea is to find the K most similar data points in the training set and use them to interpolate the output value, which is either the majority value for categorical output, or the average (or weighted average) for numeric output. K is a tunable parameter which needs to be cross-validated to pick the best value.
Nearest Neighbor requires the definition of a distance function, which is used to find the nearest neighbors. For numeric input, the common practice is to normalize the inputs by subtracting the mean and dividing by the standard deviation. Euclidean distance is typically used when the inputs are independent; otherwise the Mahalanobis distance (which accounts for correlation between pairs of input features) should be used instead. For binary attributes, the Jaccard distance can be used. The strength of K nearest neighbor is its simplicity, as no model needs to be trained. Incremental learning is automatic when more data arrives (and old data can be deleted as well). The data, however, needs to be organized in a distance-aware tree such that finding the nearest neighbor is O(log N) rather than O(N). On the other hand, the weakness of KNN is that it does not handle a high number of dimensions well.
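A minimal K-nearest-neighbor sketch (pure numpy; K and the toy data are illustrative). Inputs are normalized as described above, and Euclidean distance is used:

import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    mu, sigma = X_train.mean(axis=0), X_train.std(axis=0)
    Xn = (X_train - mu) / sigma           # normalize training inputs
    xn = (x - mu) / sigma                 # normalize the query the same way
    d = np.linalg.norm(Xn - xn, axis=1)   # Euclidean distances
    nearest = np.argsort(d)[:k]           # indices of the K closest points
    return np.mean(y_train[nearest])      # average for numeric output

X_train = np.array([[1.0, 2.0], [2.0, 1.0], [8.0, 9.0], [9.0, 8.0]])
y_train = np.array([0.0, 0.0, 1.0, 1.0])
print(knn_predict(X_train, y_train, np.array([8.5, 8.5]), k=2))  # -> 1.0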
CHAPTER VI
PROBLEM STATEMENT
The problem is very clear: there are varied demands for the best possible solution to implement biometric traits, yet we tend to apply one fixed algorithm to all possible problems without comparing algorithms or knowing the accuracy of that fixed algorithm for the particular problem. A fixed algorithm cannot give the best possible result and accuracy for every problem. We should compare the candidates and have proper knowledge of the problem and its best possible solution. In our day-to-day life we come across various kinds of biometric technologies, from identifying a person by their fingerprint to their eye retina, and with growing technology the need for correct and efficient algorithms is keenly felt. That is how we got the idea of studying different algorithms and getting the best out of them for different uses.
Face recognition is widely adopted due to its contactless and non-invasive process. Recently, it has also become popular as a commercial identification and marketing tool. Other applications include advanced human-computer interaction, video surveillance, automatic indexing of images and video databases, among others.
Face recognition began as early as 1977, with the first automated system being introduced by Kanade using a feature vector of human faces. In 1983, Sirovich and Kirby introduced principal component analysis (PCA) for feature extraction. Using PCA, Turk and Pentland developed Eigenface in 1991, which is considered a major milestone in the technology [3]. Local binary pattern analysis for texture recognition was introduced in 1994 and was later improved for facial recognition by incorporating histograms (LBPH). In 1996 Fisherface was developed, using linear discriminant analysis (LDA) for dimensionality reduction; it can identify faces in different illumination conditions, which was an issue with the Eigenface method. Viola and Jones introduced a face detection technique using Haar cascades and AdaBoost. In 2007, a face recognition technique was developed by Naruniec and Skarbek using Gabor jets that are similar to mammalian eyes. In this project, Haar cascades are used for face detection, and Eigenface and Fisherface are used for face recognition.
Users have very little knowledge about the technologies and algorithms available to them in machine learning/deep learning. Moreover, even if users are aware of an algorithm, its proper usage also needs to be clarified. If the model is trained with an incorrect data set, or the wrong model is used for a different purpose, it will never give the proper and required result to the user. Users may pick algorithms that under-fit or over-fit the model, hence consuming more time and power with no guarantee of the proper and required result.
Our system will help users first understand the algorithms available for their requirements. Users do not have to panic about the successful implementation of their model within the given time. According to the user's needs, the correct data set will be collected, and if users have their own data set to work on, they will be provided with the best algorithm for that data set. The model will be trained with the best possible algorithm, and the cases of data over-fitting and under-fitting will be avoided by using the correct data set with the correct algorithm.
We are providing users with three different algorithms, namely Support Vector Machine (SVM), Principal Component Analysis (PCA), and Linear Discriminant Analysis (LDA). They each have their own accuracy and their own ways of training and prediction. We have used Eigenface and Fisherface, hence providing a fast and accurate model with low memory consumption.
The project is aimed at the needs of the current Industry 4.0. After getting the results, the user will have the choice to pick the best algorithm for their project instead of blindly working with an algorithm without knowing its accuracy and precision. The model is trained and compared with high precision using different geometry-based algorithms. The user can compare the accuracy on their own dataset or on an already available data set. The comparison is easy for the user, and it will save them time and money; the user can work on other required things instead of implementing and testing all the algorithms.
At the initial stage, the project compares and shows results based on the face recognition system only. As a future scope, the project can be extended to other biometrics like fingerprints or the eye retina. We do not have to start the project from scratch to implement and train models for other biometrics; instead, small changes and retraining of the model with a new data set according to the user's requirement will give the user a clear idea about the model.
The purpose of this report is to be helpful to both the reader and the writer. Being logically organised, the report is able to deliver the whole content along with the concept and purpose of the project. Its aim is to convey the features, agenda, and vision of the project along with its flow. We have tried to explain it briefly, with enough information to distinguish it from other existing projects.
CHAPTER VII
IMPLEMENTATION
7.1 Introduction
The following is a report on the mini project for Robotic visual perception and autonomy. It involved building a system for face detection and face recognition using several classifiers available in the Open Computer Vision library (OpenCV). Face recognition is a non-invasive identification system and is faster than other systems, since multiple faces can be analysed at the same time. The difference between face detection and identification is that face detection is identifying a face in an image and locating it, while face recognition is making the decision “whose face is it?” using an image database. In this project both are accomplished using different techniques, described below. The report begins with a brief history of face recognition. This is followed by an explanation of Haar cascades and the Eigenface, Fisherface and Local binary pattern histogram (LBPH) algorithms. Next, the methodology and the results of the project are described. A discussion regarding the challenges and their resolutions follows. Finally, a conclusion is provided on the pros and cons of each algorithm and possible implementations.
7.2 Haar Cascades
To analyse an image using Haar cascades, a scale is selected smaller than the target image. It is then placed on the image, and the average of the pixel values in each section is taken. If the difference between two values passes a given threshold, it is considered a match. Face detection on a human face is performed by matching a combination of different Haar-like features: for example, the contrast of the forehead, eyebrows and eyes, as well as of the nose with the eyes, as shown in the figure below. A single classifier is not accurate enough, so several classifiers are combined to provide an accurate face detection system, as shown in the block diagram below in figure 3.
Each subsequent classifier in the cascade is trained only on the misses of the earlier ones to get better performance. The cascade is scaled by 1.25 and re-iterated in order to find faces of different sizes. Running the cascade on an image using conventional loops takes a large amount of computing power and time. Viola and Jones [7] used a summed area table (an integral image) to compute the matches quickly. First developed in 1984 [11], it became popular after 2001 when Viola and Jones implemented Haar cascades for face detection. Using an integral image enables matching features with a single pass over the image.
7.3 Face Recognition
7.3.1 Eigenface
Eigenface is based on PCA, classifying images by extracting features from a set of images. It is important that the images are in the same lighting condition and that the eyes are aligned in each image. Also, images used in this method must contain the same number of pixels and be in grayscale. For this example, consider an image with n x n pixels as shown in figure 4. Each row is concatenated to create a vector, resulting in a 1 x n^2 matrix. All the images in the dataset are stored in a single matrix, with one column per image. The matrix is averaged (normalised) to get an average human face. By subtracting the average face from each image vector, the features unique to each face are computed. In the resulting matrix, each column is a representation of the difference between that face and the average human face. A simplified illustration can be seen in figure 4.
The next step is computing the covariance matrix from the result. To obtain the eigenvectors from the data, eigen analysis is performed using principal component analysis. In the result, where the covariance matrix is diagonal, the direction with the highest variance is considered the 1st eigenvector. The 2nd eigenvector is the direction of the next highest variance, and it is at 90 degrees to the 1st vector. The 3rd is the direction of the next highest variance, and so on. Each column, when considered as an image and visualised, resembles a face; these are called Eigenfaces. When a face is to be recognised, the image is imported and resized to match the dimensions of the training data as mentioned above. By projecting the extracted features onto each of the Eigenfaces, weights can be calculated. These weights correspond to the similarity of the features extracted from the different image sets in the dataset to the features extracted from the input image. The input image can be identified as a face by comparing it with the whole dataset. By comparing it with each subset, the image can be identified as belonging to a particular person. By applying a threshold, detection and identification can be controlled to eliminate false detection and recognition. PCA is sensitive to large numbers and assumes that the subspace is linear. If the same face is analysed under different lighting conditions, the values mix when the distribution is calculated and cannot be effectively classified. This means that different lighting conditions pose a problem in matching the features, as they can change dramatically.
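The pipeline above can be sketched in a few lines of numpy (random arrays stand in for the flattened grayscale faces; the image size and number of components are illustrative choices):

import numpy as np

n = 32
faces = np.random.rand(10, n * n)         # 10 flattened n x n grayscale faces

mean_face = faces.mean(axis=0)            # the "average human face"
diffs = faces - mean_face                 # difference of each face to the mean

# Eigen analysis of the covariance matrix; the columns of eigvecs with the
# largest eigenvalues are the Eigenfaces (directions of highest variance).
cov = diffs.T @ diffs
eigvals, eigvecs = np.linalg.eigh(cov)
eigenfaces = eigvecs[:, np.argsort(eigvals)[::-1][:5]]   # top 5 components

weights = diffs @ eigenfaces              # project each face onto the Eigenfaces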
7.3.2 Fisherface
The Fisherface technique builds upon Eigenface and is based on LDA, derived from Ronald Fisher's linear discriminant technique used for pattern recognition. However, it uses class labels as well as data point information [6]. When reducing dimensions, PCA looks at the greatest variance, while LDA, using the labels, looks for a projection such that, when you project onto that dimension, you maximise the difference between the means of the classes normalised by their variance [6]. LDA maximises the ratio of the between-class scatter to the within-class scatter matrices. Due to this, different lighting conditions in images have a limited effect on the classification process using the LDA technique. Eigenface maximises the variations, while Fisherface maximises the mean distance between different classes and minimises the variation within classes. This enables LDA to differentiate between feature classes better than PCA, as can be observed in figure 5 [12]. Furthermore, it takes less space and is the fastest algorithm in this project. Because of this, PCA is more suitable for representing a set of data, while LDA is more suitable for classification.
Figure 5: The first components of PCA and LDA. Classes in PCA look more mixed than those of LDA.
7.3.3 Local Binary Pattern Histogram (LBPH)
Local binary patterns were proposed as classifiers in computer vision in 1990 by Li Wang [4]. The combination of LBP with histograms of oriented gradients, introduced in 2009, increased its performance on certain datasets [5]. For feature encoding, the image is divided into cells (4 x 4 pixels). Moving in a clockwise or counter-clockwise direction, the surrounding pixel values are compared with the central one, as shown in figure 6. The value of intensity or luminosity of each neighbour is compared with the centre pixel; depending on whether the difference is higher or lower than 0, a 1 or a 0 is assigned to the location. The result provides an 8-bit value for the cell. The advantage of this technique is that even if the luminosity of the image is changed, as in figure 7, the result is the same as before. Histograms are used in larger cells to find the frequency of occurrence of values, making the process faster. By analysing the results in the cell, edges can be detected as the values change. By computing the values of all cells and concatenating the histograms, feature vectors can be obtained. Images can be classified by processing them with an ID attached. Input images are classified using the same process and compared with the dataset, and a distance is obtained. By setting a threshold, it can be determined whether the face is known or unknown. Eigenface and Fisherface compute the dominant features of the whole training set, while LBPH analyses them individually.
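A minimal sketch of the encoding for one 3 x 3 neighbourhood (pure numpy; the clockwise neighbour ordering and the sample values are illustrative choices):

import numpy as np

def lbp_code(cell):
    center = cell[1, 1]
    # neighbours in clockwise order starting at the top-left pixel
    neighbours = [cell[0, 0], cell[0, 1], cell[0, 2], cell[1, 2],
                  cell[2, 2], cell[2, 1], cell[2, 0], cell[1, 0]]
    bits = [1 if p >= center else 0 for p in neighbours]
    return sum(b << i for i, b in enumerate(bits))   # one 8-bit LBP value

cell = np.array([[90, 120, 60],
                 [70, 100, 130],
                 [110, 80, 140]])
print(lbp_code(cell))   # histograms of these codes form the feature vector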
7.4 Methodology
Below are the methodology and descriptions of the applications used for data gathering, face detection, training, and face recognition. The project was coded in Python using a mixture of the IDLE and PyCharm IDEs.
7.4.1 Face Detection
The first stage was creating a face detection system using Haar cascades. Although training is required for creating new Haar cascades, OpenCV ships with a robust set of pre-trained Haar cascades that was used for the project. Using face cascades alone caused random objects to be identified, so eye cascades were incorporated to obtain stable face detection. The flowchart of the detection system can be seen in figure 8. Face and eye classifier objects are created using the classifier class in OpenCV through cv2.CascadeClassifier() and loading the respective XML files. A camera object is created using cv2.VideoCapture() to capture images. By using CascadeClassifier.detectMultiScale(), objects of various sizes are matched and their locations are returned. Using the location data, the face is cropped for further verification. The eye cascade is used to verify that there are two eyes in the cropped face. If satisfied, a marker is placed around the face to illustrate that a face was detected in that location.
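A minimal sketch along these lines (the cascade file names are the stock ones shipped with opencv-python; paths, parameter values, and the two-eye check are illustrative):

import cv2

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
eye_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_eye.xml")

cap = cv2.VideoCapture(0)                 # default camera
ret, frame = cap.read()
cap.release()
if not ret:
    raise SystemExit("could not read a frame from the camera")

gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
faces = face_cascade.detectMultiScale(gray, scaleFactor=1.25, minNeighbors=5)
for (x, y, w, h) in faces:
    roi = gray[y:y + h, x:x + w]          # crop the face for verification
    eyes = eye_cascade.detectMultiScale(roi)
    if len(eyes) >= 2:                    # require two eyes for a stable hit
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)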
For this project three algorithms are implemented independently. These are Eigenface, Fisherface and Local binary pattern histograms respectively. All three can be implemented using the OpenCV libraries. There are three stages for the face recognition, as follows:
The application starts with a request for a name to be entered, which is stored with an ID in a text file. The face detection system forms the first half. However, before capturing begins, the application checks the brightness levels and will capture only if the face is well illuminated. Furthermore, after the face is detected, the position of the eyes is analysed. If the head is tilted, the application automatically corrects the orientation. These two additions were made considering the requirements of the Eigenface algorithm. The image is then cropped and saved using the ID as a filename so it can be identified later. A loop runs this program until 50 viable images are collected from the person. This application made data collection efficient.
cv2.face.createEigenFaceRecognizer()
1. Takes in the number of components for the PCA for creating Eigenfaces. The OpenCV documentation mentions that 80 can provide satisfactory reconstruction capabilities.
2. Takes in the threshold for recognising faces. If the distance to the likeliest Eigenface is above this threshold, the function will return -1, which can be used to state that the face is unrecognisable.
cv2.face.createFisherFaceRecognizer()
1. The first argument is the number of components for the LDA for the creation of Fisherfaces. OpenCV mentions it should be kept at 0 if uncertain.
2. Similar to the Eigenface threshold; returns -1 if the threshold is passed.
cv2.face.createLBPHFaceRecognizer()
1. The radius from the centre pixel to build the local binary pattern.
2. The number of sample points to build the pattern. Having a considerable number will slow down the computer.
3. The number of cells to be created along the X axis.
4. The number of cells to be created along the Y axis.
5. A threshold value similar to Eigenface and Fisherface; if the threshold is passed, the object will return -1.
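A sketch of creating the three recogniser objects with these parameters (the factory names follow the older OpenCV 3.x cv2.face API quoted above; newer builds rename them to EigenFaceRecognizer_create() and so on, and every parameter value shown is an illustrative choice):

import cv2

# Older cv2.face API (OpenCV 3.x); adjust names to your installed version.
eigen_rec = cv2.face.createEigenFaceRecognizer(num_components=80, threshold=4000.0)
fisher_rec = cv2.face.createFisherFaceRecognizer(num_components=0)  # 0: let OpenCV decide
lbph_rec = cv2.face.createLBPHFaceRecognizer(radius=1, neighbors=8,
                                             grid_x=8, grid_y=8, threshold=100.0)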
Recogniser objects are created, and images are imported, resized, converted into numpy arrays, and stored in a vector. The ID of each image is gathered by splitting the file name and is stored in another vector. By using FaceRecognizer.train(NumpyImages, IDs), all three of the objects are trained. It must be noted that resizing the images was required only for Eigenface and Fisherface, not for LBPH. Next, the configuration model is saved as an XML file using FaceRecognizer.save(FileName). In this project, all three are trained and saved through one application for convenience. The flow chart for the trainer is shown in figure 10.
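A training sketch following these steps (the "dataset" directory and the "id_name.jpg" filename convention are illustrative assumptions, as is the 200 x 200 size):

import os
import cv2
import numpy as np

images, ids = [], []
for fname in os.listdir("dataset"):
    img = cv2.imread(os.path.join("dataset", fname), cv2.IMREAD_GRAYSCALE)
    img = cv2.resize(img, (200, 200))     # needed for Eigen/Fisherface only
    images.append(np.asarray(img, dtype=np.uint8))
    ids.append(int(fname.split("_")[0]))  # ID parsed from the file name

recognizer = cv2.face.createEigenFaceRecognizer()
recognizer.train(images, np.asarray(ids)) # the same call trains all three models
recognizer.save("eigen_model.xml")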
A face recogniser object is created using the desired parameters. The face detector is used to detect faces in the image, which are cropped and passed on to be recognised. This is done using the same technique used for the image capture application. For each face detected, a prediction is made using FaceRecognizer.predict(), which returns the ID of the class and a confidence value. The process is the same for all algorithms, and if the confidence is higher than the set threshold, the ID is -1. Finally, names from the text file with IDs are used to display the name and confidence on the screen. If the ID is -1, the application will print “unknown face” without the confidence level. The flow chart for the application is shown in figure 11.
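A recognition sketch for one cropped face (the model file, image size, and threshold handling mirror the training sketch above and are illustrative; older OpenCV 3.x builds expose load(), newer ones read()):

import cv2

recognizer = cv2.face.createEigenFaceRecognizer()
recognizer.load("eigen_model.xml")        # newer builds use read() instead

face = cv2.imread("cropped_face.jpg", cv2.IMREAD_GRAYSCALE)
face = cv2.resize(face, (200, 200))

label, confidence = recognizer.predict(face)  # label is -1 past the threshold
if label == -1:
    print("unknown face")
else:
    print("ID %d, confidence %.1f" % (label, confidence))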
CHAPTER VIII
CONCLUSION
In the near future, the world is set to witness tremendous growth in smart apps, virtual assistants, and widespread use of Artificial Intelligence. The mobile market will expand through the use of machine learning, and we will soon enter the era of self-driven vehicles (they have already been launched for testing and trials). Machine Learning is already an incredibly powerful tool that has been solving complicated problems. Although new Machine Learning tools will pop up now and then, the skills required to tune them and jazz them up will forever be in demand. Regarding job opportunities, Machine Learning has a significant role to play; there is no aspect of life on which Machine Learning has not left its mark. As the amount of data proliferates, the need for engineers and scientists has increased and will keep growing. In order to understand and control the subtleties and pitfalls of Machine Learning, skilled personnel will be required, because what appears to be a well-tuned, simple machine is capable of leading you off target from your desired results. Companies and industries rely heavily on Machine Learning, and so there is an incredible opportunity in the field. The demand for Machine Learning Engineers will keep growing, and you can get in on the action. Groups including Google, Quora, and Facebook hire people who know machine learning. There is intensive research going on in machine learning in the top universities of the world. There is no upper limit to the earnings of machine learning experts in the top companies. Machine Learning is a harbinger of potential growth for people and the economy.
So far we have just removed the veneer from the surface. There is a whole lot more that Machine Learning has yet to achieve and introduce. There is hardly any application for which Machine Learning cannot be used for detection and prediction. Despite the contradictions in perspectives, it is assured that in the future, the gap between demand and supply in Data Science and Machine Learning skills can only be bridged by providing a workforce that can deal with Machine Learning's intricacies, given the advantages of Machine Learning. Companies will plunge into tapping algorithmic models that can improve and enhance their operations and customer-facing functions. The algorithms will take business to whole new levels. We have already seen how technology has replaced human beings in the financial market and many other areas for the better, taking the load of cumbersome and labour-intensive work off human shoulders. It therefore would not be wrong to say that Machine Learning has a bright future ahead that will help human beings enter a new, changed era. Over time, more dramatic evolution will bring in more positive changes.
REFERENCES
1. Jayesh Bapu Ahire, "The Artificial Neural Networks", August 24, 2018. https://www.datasciencecentral.com/profiles/blogs/the-artificial-neural-networks
2. Stephen DeAngelis, "Machine Learning", November 04, 2014. https://www.enterrasolutions.com/blog/machine-learning-short-primer/ (see also http://fgiasson.com/blog/index.php/2017/03/10/a-machine-learning-workflow/)
3. Mariane Davids, "Applications of Machine Learning", January 9, 2017. https://blog.robotiq.com/5-applications-of-machine-learning
4. Bernard Marr, "A Short History of Machine Learning", February 19, 2016. https://www.forbes.com/sites/bernardmarr/2016/02/19/a-short-history-of-machine-learning-every-manager-should-read/#74ed2c0815e7
5. Arvind Narayanan, "Language necessarily contains human biases, and so will machines trained on language corpora", Freedom to Tinker, August 24, 2016.