
Proceedings of the International Conference on Smart Electronics and Communication (ICOSEC 2020)

IEEE Xplore Part Number: CFP20V90-ART; ISBN: 978-1-7281-5461-9

Automatic Students Attendance Marking System Using Image Processing And Machine Learning

Vidya Patil, School of Computer Engineering and Technology, MIT-WPU, Pune, vidya.patil@mitwpu.edu.in
Anushka Narayan, Dept. Computer Engineering, MIT, Pune, anushkan1996@gmail.com
Vaishnavi Ausekar, Dept. Computer Engineering, MIT, Pune, vaishnaviv.ausekar1998@gmail.com
Anahita Dinesh, Dept. Computer Engineering, MIT, Pune, anadinesh@gmail.com

Abstract- To track student attendance, colleges and schools use traditional methods involving manual marking on sheets. These methods are time-consuming and tedious for teachers. The proposed system automatically records students' attendance during lecture hours using facial recognition technology with image processing instead of the traditional manual methods. The proposed system works in a controlled environment, in which students' face images are captured and, upon recognition, their attendance is automatically marked in an Excel sheet. A student's face is detected using the Viola-Jones technique, whereas Linear Discriminant Analysis (LDA) along with KNN and SVM is used for face recognition. Experimentation shows that it provides better accuracy than the existing PCA and other techniques.

Keywords- Image Processing, Face detection, Face Recognition

I. Introduction

Maintaining the attendance of the students in schools and colleges is one of the significant tasks for the teachers. Attendance is an important factor for testing the consistency of the students' learning and for maintaining discipline. Traditional methods involve manual attendance marking on sheets. This is a very time-consuming, tedious effort on the teacher's part, and this time can be used for other important work. To cut down on this time and effort, a system is proposed which will detect and recognize each student and automatically mark his/her attendance in an Excel sheet. The system will also help to avoid attendance proxies.

In today's technologically assisted world, image processing has a wide range of applications such as video surveillance, behavior analysis, crowd analysis, teleconferencing as well as authentication using person recognition with different biometrics [16]. Biometric recognition systems are based on the identification and recognition of unique features related to the person to be recognized. These systems can use the features obtained by taking as input a fingerprint, the iris of an eye or the face of the person. Among all types of biometrics, the face of the person gives unique features along with ease of use, considering the camera as a sensor.

As an image processing application along with machine learning techniques, the proposed system is an automated system which uses face features for student recognition and automatically marks his/her attendance. The student's face image is captured using a camera. Machine learning techniques are widely used for classification of data. Classifiers learn the pattern of the students' face features during training and correctly identify the student under testing. Classifiers are broadly divided into two categories, namely supervised and unsupervised learning. K Nearest Neighbors (KNN) and Support Vector Machine (SVM) are examples of classifiers which come under the supervised learning category. The proposed system works in a controlled environment, where students' face features are captured through the camera and, upon recognition, their attendance is automatically marked in an Excel sheet. The system detects a student's face using the Viola-Jones algorithm, whereas Linear Discriminant Analysis (LDA) is used for feature selection along with KNN and SVM for face recognition. For a class of 50 students, it takes around 10 minutes for the manual marking of the attendance. This nonproductive time can be saved by using an automated attendance marking system. Such automated systems are being developed using machine learning algorithms for accurately marking the students' attendance.

The paper is organized as follows: Section II gives the related work literature survey, Section III explains the proposed system and Section IV describes the algorithms used. Section V explains Experimentation and Results, whereas Section VI gives the conclusion and future work.

II. Literature Survey

A study of related work shows that many researchers have recently worked in the area of 'Automatic attendance marking systems'. Sawhney et al. [1] proposed a model for automatically marking the attendance of the students in a class by using a face recognition method with Eigenvalues and Convolutional Neural Network (CNN). Face detection and recognition algorithms are proposed to be applied to the cameras both outside and inside the classroom. The Viola-Jones algorithm is proposed for face detection and the Principal Component Analysis (PCA) technique is proposed for feature selection while creating the face image training dataset. Actual implementation results are not given in the paper. Sovitkar and Kawathekar [2] used the Viola-Jones algorithm for detection of the faces. For feature selection, they used algorithms such as LDA, PCA and a hybrid approach using both of these, with SVM as a classifier. They created a dataset using different facial expressions, poses and lighting conditions. They achieved maximum success rates of 90.1% for PCA, 92.3% for LDA and 95.6% with the hybrid approach. The average recognition rate with the combination of all algorithms is 95%. The recognition rate decreases as the number of persons to be identified increases. Matilda and Shahin [3] implemented a system for student face detection and recognition using a video stream. Implementation is done using the Viola-Jones algorithm along with Haar cascade filters for face detection and recognition. Upon matching of the captured face image features with the student's recorded features, attendance is marked in an Excel sheet. Dmello et al. [4] proposed an attendance management system based on detection and recognition of the faces of students present in a class, which marks the recognized students' attendance automatically. They used IoT cameras to capture the students' faces. CNN along with an SVM classifier is used for face recognition. The system gives an accuracy of 94%. Rekha and Ramaprasad [6] proposed a system which automates marking of the attendance by a face recognition method using an Eigenface database along with the PCA technique. The database contains the images of 15 different persons with 10 images per person. A similarity score is calculated between the new face detected and the images present in the database. If the similarity score is greater than 0.3, the face is recognized and the Excel sheet gets updated. Vyas and Shah [8] compared feature extraction techniques for face recognition using LDA and PCA with different facial expressions under different illumination conditions. They used the Yale database for the analysis. They showed that LDA is more efficient when applied to the
recognition of facial features in the case of the Yale database of images. Using LDA, the achieved accuracy is 74.47%, whereas PCA has 72.72% accuracy. Hence their results indicate that LDA is better than PCA and gives more accuracy. From the literature studied, it is concluded that for face detection, Viola-Jones, Eigenface values and Haar cascade algorithms are used by most of the researchers. For face recognition, LDA and PCA [5, 7, 9, 10, 11, 13] along with CNN and SVM [12] algorithms are currently in use. The highest accuracy achieved is 95%.

978-1-7281-5461-9/20/$31.00 ©2020 IEEE 542

III. Proposed system

The proposed system architecture for automatic students' attendance marking is shown in Fig. 1. Students' face images are captured using the camera. The system consists of the following stages.
Image Acquisition: The proposed system uses a camera mounted at a suitable place for acquiring students' face images in a controlled environment.
Image Preprocessing: Histogram equalization is used to enhance the input image quality.
Face Detection: The face is detected from the image using the Viola-Jones algorithm with Haar cascades.
Feature Extraction: The features are extracted and the dimensionality of the feature vectors is reduced using the LDA algorithm.
Face Recognition: SVM and KNN classifiers are applied to the LDA features to classify images for face recognition.
Mark Attendance: If a match is found in the database, the system automatically marks the student's attendance in the Excel sheet, as per lecture name and time. Otherwise, it displays an error and attendance is not marked.

Fig. 1 System Architecture

IV. Algorithms Used
The proposed system uses the following algorithms.

A. Image preprocessing:
Histogram Equalization
Image preprocessing is the process in which operations are performed on the images to enhance the important image features, to reduce unwanted alterations or to extract useful information from the images. For enhancing the contrast of the captured image, histogram equalization is used. Histogram equalization is done using OpenCV library functions with Python. The input image is RGB and the output is a high-contrast RGB picture in which the values of R, G and B are equalized. Since it is performed on the RGB image, the color balance is also changed. The histograms before and after equalization are shown in Fig. 2 and Fig. 3 respectively.
Steps to perform histogram equalization:
1. Calculate the histogram of the input image
2. Normalize the resulting histogram
3. Transform the input image into an enhanced output image

Fig. 2 Input image histogram before histogram equalization

Fig. 3 Input image histogram after histogram equalization

After performing histogram equalization, the images are stored in the database in grayscale format as shown in Fig. 4, and further operations are done on these images.

Fig. 4 Dataset

The database is created by taking images of 5 students. For each student, 30 images with different expressions and lighting conditions are taken. The images have a dimension of 92*112 and are stored in PNG format. The dataset is split into 75% training images and 25% testing images.

B. Face Detection
Viola-Jones Method for face detection
The Viola-Jones algorithm was developed by Paul Viola and Michael Jones in 2001. It is a framework that is used for real-time detection of image features [14] [15]. Detection happens by searching an image using windows of multiple sizes and sliding these windows with different step sizes to evaluate Haar features. Haar features consist of different shapes indicating areas of light and dark pixels. Detected objects are defined by Haar features. The AdaBoost algorithm is used to train a classifier on each feature within an image. It classifies candidate regions into positive (face) or negative (non-face) classes depending on a particular threshold value. Features are grouped efficiently by means of cascade classifiers. Face detection is done using OpenCV library functions for the Viola-Jones algorithm. Fig. 5 shows the face detection step.
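The three histogram equalization steps listed in Section IV-A can be sketched in a few lines of plain Python. This is a minimal illustration for a single 8-bit channel, not the authors' code: the paper only states that OpenCV functions were used, so the function name `equalize_histogram` and the list-of-rows image layout are assumptions made here. For an RGB image, the same mapping would be applied to each of the R, G and B channels in turn.

```python
def equalize_histogram(image, levels=256):
    """Equalize one channel given as a list of rows of ints in [0, levels - 1].

    Hypothetical helper written for illustration only.
    """
    pixels = [p for row in image for p in row]

    # Step 1: calculate the histogram of the input image.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Step 2: normalize via the cumulative distribution of the histogram.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)
    total = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Constant image: there is no contrast to stretch.
        return [row[:] for row in image]

    # Step 3: transform every input level to its equalized output level.
    lut = [
        max(0, round((c - cdf_min) / (total - cdf_min) * (levels - 1)))
        for c in cdf
    ]
    return [[lut[p] for p in row] for row in image]


# A low-contrast 2x2 patch is stretched over the full 0..255 range.
patch = [[100, 100], [101, 102]]
print(equalize_histogram(patch))
```

OpenCV's `cv2.equalizeHist` applies this kind of mapping to a single-channel 8-bit image, so in practice the loop above would be replaced by one library call per channel.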

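The sliding-window evaluation of Haar features described in Section IV-B relies on sums over light and dark rectangles. The sketch below, an illustration rather than the authors' implementation, builds an integral image (the summed-area table introduced in [14]) so that any rectangle sum costs four lookups, and then evaluates one two-rectangle Haar feature at every window position. AdaBoost training and the classifier cascade are deliberately left out, and all function names are hypothetical.

```python
def integral_image(image):
    """Return a (h + 1) x (w + 1) summed-area table for a list-of-rows image."""
    height, width = len(image), len(image[0])
    table = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += image[y][x]
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table


def rect_sum(table, x, y, w, h):
    """Sum of the w x h rectangle whose top-left pixel is (x, y)."""
    return table[y + h][x + w] - table[y][x + w] - table[y + h][x] + table[y][x]


def two_rect_feature(table, x, y, w, h):
    """Left half minus right half of the window (w is assumed to be even)."""
    half = w // 2
    left = rect_sum(table, x, y, half, h)
    right = rect_sum(table, x + half, y, half, h)
    return left - right


def scan(image, win_w, win_h, step=1):
    """Slide a window over the image and collect (x, y, feature value)."""
    table = integral_image(image)
    height, width = len(image), len(image[0])
    responses = []
    for y in range(0, height - win_h + 1, step):
        for x in range(0, width - win_w + 1, step):
            responses.append((x, y, two_rect_feature(table, x, y, win_w, win_h)))
    return responses


# Toy image: bright left half, dark right half (a vertical edge at x = 2).
image = [[10, 10, 0, 0] for _ in range(4)]
best = max(scan(image, win_w=2, win_h=2), key=lambda r: abs(r[2]))
print(best)
```

The strongest response appears where the window straddles the light-to-dark edge, which is the kind of contrast pattern a trained cascade thresholds on. OpenCV ships pretrained cascades that can be loaded with `cv2.CascadeClassifier`; whether the authors used those is not stated in the paper.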

Fig. 5 Detection of face with Viola-Jones

C. Feature Extraction
Linear Discriminant Analysis (LDA)
LDA is an algorithm that is used to perform dimensionality reduction on data while maintaining most of the class discriminatory information. In LDA, image vectors are projected into a new subspace to find new projection axes which best discriminate classes. The new subspace has a lower dimension than the original image space, thus reducing dimensionality. To discriminate amongst the classes, it aims to maximize the between-class scatter (Sb) and minimize the within-class scatter (Sw). LDA is used to reduce the number of features used for classification.
Recognition is performed by the following steps:
• Project faces onto LDA space.
• To classify the face, run a nearest-neighbour classifier.
• The nearest neighbour is the final identified class.

Sw and Sb are defined as:

$$S_w = \sum_{j=1}^{C} \sum_{i=1}^{N_j} (x_i^j - \mu_j)(x_i^j - \mu_j)^T \qquad (1)$$

where $x_i^j$ is the i-th sample of class j, $\mu_j$ is the mean of class j, C is the number of classes and $N_j$ is the number of samples in class j.

$$S_b = \sum_{j=1}^{C} (\mu_j - \mu)(\mu_j - \mu)^T \qquad (2)$$

where $\mu$ represents the mean of all classes.
The set of eigenvectors spanning the LDA subspace is represented by W, where $W = [W_1, W_2, \ldots, W_d]$. W is the optimal projection which satisfies the following equation:

$$W = \arg\max \left\{ \left| \frac{W^T S_b W}{W^T S_w W} \right| \right\} \qquad (3)$$

Steps for performing the linear discriminant analysis (LDA):
1. Calculate the class-wise mean vectors for all classes in the database.
2. Calculate the within-class scatter matrix (Sw).
3. Calculate the between-class scatter matrix (Sb).
4. Calculate the eigenvectors and eigenvalues according to the scatter matrices.
5. Sort the obtained eigenvalues in descending order and sort the corresponding eigenvectors accordingly. Select the eigenvectors associated with the largest eigenvalues to obtain the transformation matrix W.
6. Transform and project the image samples onto the new lower dimension LDA subspace with the help of transformation matrix W.

D. Face Recognition
a. K-Nearest Neighbour (KNN)
KNN belongs to the supervised learning class of machine learning. KNN is usually employed in search applications where one is looking for "similar" items. The similarity is measured by making a vector representation of the items and then comparing the vectors using an appropriate distance metric (such as the Euclidean distance).
Images can be compared using various distance metrics, which measure the distances between image vectors to find similarities amongst the images. The Euclidean distance is a common metric used in KNN to measure the distance between data points.
The Euclidean distance between two feature vectors x and y can be calculated by the following equation (4):

$$d(x, y) = \sqrt{(x_1 - y_1)^2 + \cdots + (x_n - y_n)^2} \qquad (4)$$

For implementing KNN, the sklearn library and its KNeighborsClassifier() class are used. After trying different values of k, the highest accuracy is obtained for k=5. The output of the KNN classifier is shown in Fig. 6.

Fig. 6 Face Recognition

b. Support Vector Machine (SVM)
SVM belongs to the supervised learning class of machine learning. SVM is a classification algorithm that finds a hyperplane that distinctly separates the data points of two classes in a feature space. The data points that are closest to the hyperplane are called support vectors. The distance between the support vectors and the hyperplane is calculated to get a margin. The larger this margin, the better the separation achieved by the hyperplane, resulting in better classification of the data. For the SVM classifier implementation, the sklearn library is used. The kernel type used is the Radial Basis Function (RBF); the 'degree' parameter is 3, although in sklearn this parameter only affects polynomial kernels. The 'class_weight' parameter is set to 'balanced'. The 'gamma' parameter of the Support Vector classifier is set to 0.0001, which gives the highest accuracy of 95% on our database. The gamma parameter defines the kernel coefficient for the RBF kernel.
Steps for performing SVM:
Step1: Prepare and format dataset
Step2: The dataset is normalized
Step3: Activation (kernel) function is selected
Step4: Parameters c and g are optimized using a search algorithm with cross-validation
Step5: SVM network is trained using training data
Step6: SVM network is validated with testing data
Step7: System performance is evaluated

V. Experimentation and Results
The first part of this section shows the system GUI and the system implementation steps. The second part gives an experimental analysis of the algorithms.

System Implementation-
1. There are two login sections. One is for students and the second is for teachers.


Fig. 7 Login

2. On successful login, the following window appears for teachers to perform operations such as adding a student entry, viewing the student database and viewing the finalized attendance sheet.

Fig. 8 Teacher Login

3. Students' attendance marking is done automatically by activating the system from the starting time of the lecture for a duration of 15 minutes.

4. Attendance is marked automatically in the Excel sheet as shown in Fig. 9 below.

Fig. 9 Complete attendance sheet generated

Experimental analysis -
There are a total of 150 images in the dataset. For training, 112 images are used and the remaining 38 images are used for testing. The following figures show the heatmaps for the combination of LDA plus SVM classifier and the combination of LDA plus KNN classifier respectively.

Fig. 10 LDA+SVM Heatmap

Fig. 11 LDA+KNN Heatmap

Performance analysis for the KNN and SVM classifiers implemented in our system is shown in Table 1.

Table 1: Comparison of classifier performance metrics between KNN and SVM classifier

Metric      KNN   SVM
Precision   96%   93%
Recall      97%   95%
Accuracy    97%   95%

From the given heatmaps, it is concluded that LDA along with KNN gives better results than the combination of LDA and SVM. The accuracy of LDA+KNN is 97%, whereas that of LDA+SVM is 95%. So KNN gives higher accuracy on our database than SVM.
Fig. 12 compares the classifier performance metrics using a plotted graph.

Fig. 12 Classifier Performance Metrics (bar chart of Precision, Recall and Accuracy, values in %, for KNN and SVM)
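To make the LDA stage of the compared pipelines more concrete, the following sketch works through the procedure of Section IV-C (equations (1) to (3)) on a toy problem. It is a simplified illustration under stated assumptions, not the authors' code: there are only two classes and two features, in which case the optimal direction of equation (3) is known to be proportional to Sw^-1 (mu_1 - mu_2), so no general eigen-solver is needed. All names and sample values are made up.

```python
import math


def mean(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]


def within_class_scatter(classes):
    """Equation (1) for 2-D samples: sum of per-class outer products."""
    sw = [[0.0, 0.0], [0.0, 0.0]]
    for samples in classes:
        mu = mean(samples)
        for x in samples:
            d = [x[0] - mu[0], x[1] - mu[1]]
            for r in range(2):
                for c in range(2):
                    sw[r][c] += d[r] * d[c]
    return sw


def fisher_direction(class_a, class_b):
    """Unit vector w proportional to Sw^-1 (mu_a - mu_b) (two-class LDA)."""
    sw = within_class_scatter([class_a, class_b])
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    mu_a, mu_b = mean(class_a), mean(class_b)
    diff = [mu_a[0] - mu_b[0], mu_a[1] - mu_b[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    norm = math.hypot(w[0], w[1])
    return [w[0] / norm, w[1] / norm]


def project(w, x):
    """Project a 2-D sample onto the 1-D LDA axis."""
    return w[0] * x[0] + w[1] * x[1]


# Made-up feature vectors for two students.
class_a = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0)]
class_b = [(6.0, 5.0), (7.0, 8.0), (8.0, 7.0)]
w = fisher_direction(class_a, class_b)
print([round(project(w, x), 2) for x in class_a])
print([round(project(w, x), 2) for x in class_b])
```

After projection the two classes occupy disjoint intervals on a single axis, which is the reduced representation that a nearest-neighbour or SVM classifier would then operate on. With more than two classes, steps 4 and 5 of the LDA procedure (eigen-decomposition and sorting) replace the closed-form direction used here.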

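The KNN classifier of Section IV-D combines the Euclidean distance of equation (4) with a majority vote over the k nearest training samples. The paper uses sklearn for this; the plain-Python sketch below only illustrates the decision rule with k=5, and the function names, labels and sample vectors are hypothetical.

```python
import math
from collections import Counter


def euclidean(x, y):
    """Equation (4): Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def knn_predict(train, query, k=5):
    """Label the query by majority vote among its k nearest training samples.

    train is a list of (feature_vector, label) pairs. Ties are resolved in
    favour of the label that appears first among the nearest neighbours.
    """
    neighbours = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Made-up 2-D feature vectors for two students.
train = [
    ((0.0, 0.0), "student_a"), ((0.0, 1.0), "student_a"), ((1.0, 0.0), "student_a"),
    ((5.0, 5.0), "student_b"), ((5.0, 6.0), "student_b"), ((6.0, 5.0), "student_b"),
]
print(knn_predict(train, (0.5, 0.5)))
print(knn_predict(train, (5.5, 5.5)))
```

With k=5 and three samples per class, each query collects three votes from its own cluster and two from the other, so the nearer cluster wins. In the system described here the vectors would be LDA-projected face features rather than raw coordinates.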

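The percentages in Table 1 can be related to confusion counts such as the TP, TN, FP and FN values reported in Table 2. The sketch below applies the standard definitions to those counts. How the authors aggregated per-class results is not stated, so this is only one possible reading: it assumes the counts in Table 2 are pooled over all classes in a one-vs-rest fashion. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    tp: int
    tn: int
    fp: int
    fn: int


def precision(c):
    """Fraction of positive predictions that are correct."""
    return c.tp / (c.tp + c.fp)


def recall(c):
    """Fraction of actual positives that are found."""
    return c.tp / (c.tp + c.fn)


def pooled_accuracy(c):
    """Fraction of all one-vs-rest decisions that are correct."""
    return (c.tp + c.tn) / (c.tp + c.tn + c.fp + c.fn)


# Values copied from Table 2.
KNN = Counts(tp=37, tn=151, fp=1, fn=1)
SVM = Counts(tp=36, tn=150, fp=2, fn=2)

for name, c in (("KNN", KNN), ("SVM", SVM)):
    print(f"{name}: precision={precision(c):.1%} "
          f"recall={recall(c):.1%} pooled accuracy={pooled_accuracy(c):.1%}")
```

Under this reading, TP + FN equals 38 for both classifiers and the pooled recall is 37/38, about 97.4%, for KNN and 36/38, about 94.7%, for SVM, which rounds to the 97% and 95% listed in Table 1. The pooled precision does not reproduce the 96% and 93% in Table 1, which suggests those figures were averaged per class or computed differently.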

Table 2: Comparison of TP, TN, FP, FN values for KNN and SVM Classifier

Figures of Merit   KNN   SVM
TP                 37    36
TN                 151   150
FP                 1     2
FN                 1     2

Table 2 above gives the comparison of TP, TN, FP and FN values for the KNN and SVM classifiers.

Fig. 13 Comparative study of TP, FN Values for KNN and SVM classifier (bar chart titled "True Positive and False Negative Values"; y-axis: No. of samples; x-axis: Classifier)

Fig. 13 shows the comparison of TP and FN values for the KNN and SVM classifiers, plotted as a graph.

VI. Conclusion
The designed system, when implemented, could greatly impact the way attendance is marked in schools and colleges. The proposed system explored image processing along with machine learning algorithms and techniques, ensuring high accuracy and efficiency. The system automatically marks students' attendance based on detection and recognition of students using face image features. Face detection is implemented using the Viola-Jones algorithm. For face recognition, LDA is implemented with KNN and SVM classifiers. It is concluded that LDA along with KNN gives high accuracy on our database. One of the reasons for the increase in accuracy compared to the existing papers mentioned in the literature survey is that a larger number of facial expressions is captured for each student. As future work, the following improvements are suggested.
1. Parallelism can be applied to reduce the system's execution time.
2. Algorithms can be tested on large datasets.
3. An in-built mechanism can be added to send a notification via phone/email to a student's parents once his/her attendance falls below a certain threshold.

References
[1] Sawhney S., Kacker K., Jain S., Singh S. N., Garg R., "Real-Time Smart Attendance System using Face Recognition Techniques", 9th International Conference on Cloud Computing, Data Science Engineering, 2019.
[2] Sarika Sovitkar and Seema Kawathekar, "Comparative Study of Feature-based Algorithms and Classifiers in Face Recognition for Automated Attendance System", International Conference on Innovative Mechanisms for Industry Applications (ICIMIA), 2020.
[3] Matilda, S., Shahin, K., "Student Attendance Monitoring System Using Image Processing", IEEE International Conference on System, Computation, Automation and Networking, 2019.
[4] Royston Dmello, Sai Yerremreddy, Samriddha Basu, Tejas Bhitle, Yash Kokate, Prachi Gharpure, "Automated Facial Recognition Attendance System Leveraging IoT Cameras", 9th International Conference on Cloud Computing, Data Science & Engineering (Confluence), 2019.
[5] Louis Mothwa, Jules-Raymond Tapamo, Temitope Mapayi, "Conceptual Model of the Smart Attendance Monitoring System Using Computer Vision", International Conference on Signal-Image Technology & Internet-Based Systems (SITIS), 2018.
[6] Rekha, E., Ramaprasad, P., "An efficient automated attendance management system based on Eigen Face recognition", 7th International Conference on Cloud Computing, Data Science Engineering - Confluence, 2017.
[7] Borra Surekha, Kanchan Jayant Nazare, S. Viswanadha Raju and Nilanjan Dey, "Attendance Recording System Using Partial Face Recognition Algorithm", Springer International Publishing Switzerland, 2017.
[8] Riddhi A. Vyas and Dr. S.M. Shah, "Comparison of PCA and LDA Techniques for Face Recognition Feature Based Extraction With Accuracy Enhancement", International Research Journal of Engineering and Technology (IRJET), 2017.
[9] Anshun Raghuwanshi and Dr. Preeti D Swami, "An Automated Classroom Attendance System Using Video Based Face Recognition", IEEE International Conference On Recent Trends in Electronics Information & Communication Technology (RTEICT), 2017.
[10] Sajid M., Hussain R., Usma M., "A conceptual model for automated attendance marking system using facial recognition", Ninth International Conference on Digital Information Management, 2016.
[11] Nawaf Barnouti, Mustafa Naser, Sinan Al-Dabbagh, Wael Matti, "Face Detection and Recognition Using Viola-Jones with PCA-LDA and Square Euclidean Distance", (IJACSA) International Journal of Advanced Computer Science and Applications, 2016.
[12] Chintalapati, S., Raghunadh, M. V., "Automated attendance management system based on face recognition algorithms", IEEE International Conference on Computational Intelligence and Computing Research, 2013.
[13] Tomesh Verma, Raj Kumar Sahu, "PCA-LDA Based Face Recognition System and Results Comparison By Various Classification Techniques", International Conference on Green High Performance Computing, 2013.
[14] Viola, P., Jones, M., "Rapid object detection using a boosted cascade of simple features", Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2001.
[15] Viola P., Jones M. J., "Robust Real-Time Face Detection", International Journal of Computer Vision, 57, 137–154, 2004.
[16] Smit Hapani, Nandana Prabhu, Nikhil Parakhiya, Mayur Paghdal, "Automated Attendance System Using Image Processing", 2018 Fourth International Conference on Computing Communication Control and Automation (ICCUBEA), 2018.
