
2021 3rd International Conference on Advances in Computing, Communication Control and Networking (ICACCCN)

AI Based Proctoring

Shakti Priya Saurav
Computer Science and Engineering
Dronacharya Group Of Institutions
Greater Noida, Uttar Pradesh
hello@shaktisaurav.com

Pranay Pandey
Computer Science and Engineering
Dronacharya Group Of Institutions
Greater Noida, Uttar Pradesh
topranaypandey@gmail.com

Shubham Kr Sharma
Computer Science and Engineering
Dronacharya Group Of Institutions
Greater Noida, Uttar Pradesh
amshubhamsharma@gmail.com

Bipin Pandey
Computer Science and Engineering
Dronacharya Group Of Institutions
Greater Noida, Uttar Pradesh
hodcse@gnindia.dronacharya.info

Rajat Kumar
Department of Computer Science and Engineering
Dronacharya Group Of Institutions
Greater Noida, Uttar Pradesh
rajat.kumar@gnindia.dronacharya.info

Abstract --- Over the past year, online examinations have become popular across all educational fields due to COVID-19 and their flexibility. However, institutions and the community face a major difficulty with proctoring, as it is extremely difficult to administer cheating-free online exams. Here we present techniques and tools through which the proctor need not be present throughout the exam. The approach is based on neural networks and machine learning. Our AI-based model is able to detect unfair means in an examination. Our experiments showed that the proposed system performs better than existing ones.

Keywords --- Neural Network, Convolutional Neural Networks (CNN), Deep Learning, Proctoring, COVID-19

I. INTRODUCTION

Exams are a crucial part of every educational programme, whether online or offline. Cheating is more likely to occur in online tests, so detecting and preventing it is critical.

AI-proctored online examinations are gaining importance due to their flexibility, security, and comfort. This is not limited to college and school examinations; it can also make online certification more credible. By using our AI-proctored online examination instead of conducting the examination in offline mode, we can save much human effort and time while also giving comfort to the examinee. According to UNESCO's report on educational disruption and response to the COVID-19 pandemic, many governments closed educational institutions and shifted significantly to online activity and remote working, affecting over 85% of the world's student population. Technological advancements make remote exams a practical alternative to in-person proctoring. Whether AI-based proctoring is functionally equal to physical proctoring is debatable, but it is the future of the examination system: academics have shifted to online mode and their numbers will keep increasing after COVID-19. This presents a significant difficulty, not just in terms of learning but also in terms of testing.

Conducting examinations without committing any errors is a huge challenge that must be overcome. A faculty member can physically supervise students with all of their senses during offline proctored tests at the centres. They can readily detect student movement or any sound in the room, ensuring a smooth examination. Because the teacher is not physically present at the location, online tests limit oversight. An AI-based proctoring system should therefore detect movement and sound for all students at the same time.

Our AI-based proctoring system combines many detection mechanisms: face detection, noise detection, gaze tracking, mouth movement detection, mobile and other device detection, head pose detection to find where the examinee is looking, and change-of-tab detection. Together, these facilitate the fairness of an examination and add credibility and integrity to it.

II. LITERATURE REVIEW

Asep Hadian S. G. and Yoanes Bandung [1] followed a unique approach: a very large data-set of images is used to train the system and identify the user in low-light and general scenarios. It was not designed for a first-time setup for an online examination.

The main focus of Aiman Kuin's work [2] was fraud detection in video recordings of exams using CNNs, with image classification models based on rectified activation units (RAU).

Sanjana Yadav and Archana Singh [3] used computer vision for information extraction and object detection. The image is checked with a matching algorithm using methods such as re-scaling, filtering, binarization, and Chamfer distance transformation.

Swathi Prathish et al. [4] proposed four major channels of detection: system usage analysis, video analysis, audio analysis, and an inference system. The video analysis also covers multiple-face detection.

N. L. Clarke and P. Dowland [5] present a model to facilitate remote and electronic proctoring of student examinations. They used transparent (non-intrusive) recognition to give non-disruptive, persistent authentication of the student's identity.

Vahid Kazemi and Josephine Sullivan [6] proposed face alignment for a single image, showing how appropriate priors exploiting the structure of image data help with efficient feature selection. They also examined how prediction accuracy depends on the quantity of available training data.

A. T. Awaghade et al. [7] proposed contributions to measure and gauge the variety of events, practices, and patterns ordinarily connected with cheating.

III. PROPOSED METHODS

We propose developing a comprehensive system with numerous detection mechanisms capable of detecting any unethical behavior.

Before the Start of the Online Exam

The examinee has to verify himself before the test. At this stage, the system collects all the facial data of the examinee, then extracts it and uses it for proctoring.
Exam Phase

The examinee takes the exam while our technology monitors it in real time to detect unfair means. To capture the examinee's audio-visual signals, our system only requires two sensors: a webcam and a microphone. Five components process the audio and video streams to extract mid-level information, and these traits are then combined to feed our main classifier. The generated features include part-dependent features, namely the mean and standard deviation (SD) within a temporal window, and features based on correlations between components.

Vision-based functionality used:
• Gaze tracking.
• Detecting whether the candidate opens his mouth.
• Detecting any instances of mobile phones or other devices.
• Head pose estimation.
• Face detection.

A. User Verification & Facial Landmarks (v1)


A proctoring system should be able to continuously verify that the examinee is the same person he claims to be. There are various methods for continuous user verification; we used facial recognition in our system.

Figure 1. Metric for OpenCV DNN.

We compared different face detection models. OpenCV's DNN module gave the best results (Figure 1). This module employs the ResNet-10 architecture and is based on the Single Shot Multibox Detector (Figure 2).

Figure 2. SSD.

At the start we used dlib's facial landmarks model, but the expected results were not obtained, so we switched to facial landmark detection based on a convolutional neural network built with TensorFlow. As our system for online examination has constrained resources (only a webcam), TensorFlow Lite was the most suitable choice in this situation.

• The TensorFlow model gives ~8 FPS and its landmark prediction step takes around 0.06 seconds.
• Dlib gives ~12 FPS and its landmark prediction step takes around 0.006 seconds.

Dlib looks like the better option, but the overall result and accuracy were far better with TensorFlow.
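As an illustration of this face detection step, the following minimal sketch loads the ResNet-10 SSD face detector (Caffe format) through OpenCV's DNN module and returns the face boxes in a webcam frame. The model file names and the 0.5 confidence threshold are common defaults that we assume here for illustration, not values reported above.

import cv2

# Assumed file names: OpenCV's ResNet-10 SSD face detector (Caffe format).
PROTOTXT = "deploy.prototxt"
MODEL = "res10_300x300_ssd_iter_140000.caffemodel"
net = cv2.dnn.readNetFromCaffe(PROTOTXT, MODEL)

def detect_faces(frame, conf_threshold=0.5):
    """Return bounding boxes (x1, y1, x2, y2) of faces above the threshold."""
    h, w = frame.shape[:2]
    # 300x300 input with BGR mean subtraction, as the ResNet-10 SSD model expects.
    blob = cv2.dnn.blobFromImage(cv2.resize(frame, (300, 300)), 1.0,
                                 (300, 300), (104.0, 177.0, 123.0))
    net.setInput(blob)
    detections = net.forward()
    boxes = []
    for i in range(detections.shape[2]):
        confidence = detections[0, 0, i, 2]
        if confidence > conf_threshold:
            x1, y1, x2, y2 = (detections[0, 0, i, 3:7] * [w, h, w, h]).astype(int)
            boxes.append((x1, y1, x2, y2))
    return boxes

# Frames in which the number of detected faces differs from one can be flagged.
cap = cv2.VideoCapture(0)
ok, frame = cap.read()
if ok:
    print("faces detected:", len(detect_faces(frame)))
cap.release()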

B. Eye Tracking (v2)

We used a facial key-points detector to detect the eyes in real time. For this we used an already-trained network from the dlib library, which is capable of detecting 68 key points. It takes a dlib rectangle object as input, which holds the coordinates of a face.

Figure 3. Dlib facial keypoints. [9]
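A minimal sketch of this eye-localization step is shown below, assuming dlib's standard pre-trained 68-point shape predictor file; the index ranges 36-41 and 42-47 are the usual eye landmarks in that 68-point scheme.

import cv2
import dlib

# Pre-trained 68-point predictor (file name is the usual dlib distribution name).
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

LEFT_EYE = list(range(36, 42))   # 0-indexed landmark ranges for the eyes
RIGHT_EYE = list(range(42, 48))

def eye_points(gray_frame):
    """Return the (x, y) eye landmarks for the first detected face, or None."""
    faces = detector(gray_frame, 0)
    if not faces:
        return None
    shape = predictor(gray_frame, faces[0])   # takes a dlib rectangle as input
    left = [(shape.part(i).x, shape.part(i).y) for i in LEFT_EYE]
    right = [(shape.part(i).x, shape.part(i).y) for i in RIGHT_EYE]
    return left, right

cap = cv2.VideoCapture(0)
ok, frame = cap.read()
if ok:
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    print(eye_points(gray))
cap.release()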
C. Mouth Opening Detection (v3)

We proposed this component to check whether the examinee opens his/her mouth to say something during the exam. It is similar to eye detection: we again used the same dlib facial key-points. For this task the examinee is required to sit straight, and the distances between the key-point pairs (five outer pairs and three inner pairs) are checked for 100 frames. If the examinee opens his mouth, the distances between the points increase; if the increase in distance exceeds a fixed value for at least three outer pairs and two inner pairs, a detection vector is generated.
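The distance check described above could be sketched as follows. The specific outer/inner lip index pairs and the pixel margin are our illustrative assumptions consistent with the 68-point layout; the text above only fixes the counts (five outer pairs, three inner pairs, 100 calibration frames).

import numpy as np

# Assumed landmark index pairs in the 68-point scheme: five outer-lip pairs and
# three inner-lip pairs whose separation grows when the mouth opens.
OUTER_PAIRS = [(49, 59), (50, 58), (51, 57), (52, 56), (53, 55)]
INNER_PAIRS = [(61, 67), (62, 66), (63, 65)]

def pair_distances(shape):
    """Euclidean distances for the monitored lip pairs of a dlib shape."""
    pts = np.array([(shape.part(i).x, shape.part(i).y) for i in range(68)])
    dist = lambda a, b: np.linalg.norm(pts[a] - pts[b])
    return ([dist(a, b) for a, b in OUTER_PAIRS],
            [dist(a, b) for a, b in INNER_PAIRS])

class MouthOpenDetector:
    """Calibrate over the first 100 frames (examinee sits straight, mouth
    closed), then flag frames where the distances grow beyond a fixed margin."""
    def __init__(self, calib_frames=100, margin=10.0):
        self.calib = []
        self.calib_frames = calib_frames
        self.margin = margin            # illustrative fixed value in pixels
        self.baseline = None

    def update(self, shape):
        outer, inner = pair_distances(shape)
        if self.baseline is None:
            self.calib.append(outer + inner)
            if len(self.calib) == self.calib_frames:
                self.baseline = np.mean(self.calib, axis=0)
            return False
        grown = np.array(outer + inner) - self.baseline > self.margin
        # Flag when at least three outer pairs and two inner pairs opened up.
        return grown[:5].sum() >= 3 and grown[5:].sum() >= 2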

D. Mobile & Device Detection (v4)

We used TensorFlow, OpenCV, and wget-python. We used instance segmentation, which can distinguish between different labels and also between multiple objects of the same label. We used an already-trained YOLOv3 model [8] to classify 80 object classes. It uses Darknet-53, which has 53 convolutional layers and is 1.5 times faster than previous versions.

Figure 4. YOLOv3 runs much faster than other detection methods with comparable performance.
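A rough sketch of this detection step using the pre-trained YOLOv3 weights through OpenCV's DNN module is given below; the file names, the 0.5 confidence threshold, and the watch-list of suspect labels (cell phone, laptop, book) are our assumptions for illustration.

import cv2
import numpy as np

# Pre-trained YOLOv3 on COCO (file names are the standard Darknet releases).
net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")
layer_names = net.getUnconnectedOutLayersNames()
with open("coco.names") as f:            # 80 COCO class names, one per line
    classes = [line.strip() for line in f]

SUSPECT = {"cell phone", "laptop", "book"}   # illustrative watch-list

def detect_devices(frame, conf_threshold=0.5):
    """Return the suspect COCO labels found in a frame."""
    blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416),
                                 swapRB=True, crop=False)
    net.setInput(blob)
    found = set()
    for output in net.forward(layer_names):
        for det in output:               # det = [cx, cy, w, h, objectness, class scores...]
            scores = det[5:]
            class_id = int(np.argmax(scores))
            if scores[class_id] > conf_threshold and classes[class_id] in SUSPECT:
                found.add(classes[class_id])
    return found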

E. Head Pose Detection (v5)

This component is proposed to find where the user is looking. It can be very beneficial for detecting whether the user is trying to cheat by looking at an additional display or device.

We implemented this using OpenCV and TensorFlow. We used a Caffe (Convolutional Architecture for Fast Feature Embedding) model from OpenCV's DNN module, as it provided the most accurate results (Figure 5).

The first step was to detect the face, for which we used the same CNN-based facial landmark detection. We then took the 3D coordinates of the facial landmarks and estimated the rotational and translational vectors at the nose tip.

Figure 5. Caffe Model Accuracy.

After obtaining the required vectors, we projected the corresponding 3D points onto a 2D surface, namely the image.
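The rotation/translation estimate and the projection back onto the image can be sketched with OpenCV's solvePnP and projectPoints, as below. The generic 3D face model points and the approximate camera matrix are standard tutorial values, not calibration data from our system.

import cv2
import numpy as np

# Generic 3D face model points (in millimetres); commonly used approximations.
MODEL_POINTS = np.array([
    (0.0, 0.0, 0.0),            # nose tip
    (0.0, -330.0, -65.0),       # chin
    (-225.0, 170.0, -135.0),    # left eye, left corner
    (225.0, 170.0, -135.0),     # right eye, right corner
    (-150.0, -150.0, -125.0),   # left mouth corner
    (150.0, -150.0, -125.0),    # right mouth corner
], dtype=np.float64)

def head_pose(image_points, frame_size):
    """Estimate rotation/translation and project a point ahead of the nose.

    image_points: (6, 2) float array of pixel coordinates of the same six
    landmarks, in the order of MODEL_POINTS. frame_size: (height, width).
    """
    h, w = frame_size
    focal = w                                   # rough focal-length approximation
    camera_matrix = np.array([[focal, 0, w / 2],
                              [0, focal, h / 2],
                              [0, 0, 1]], dtype=np.float64)
    dist_coeffs = np.zeros((4, 1))              # assume no lens distortion
    ok, rvec, tvec = cv2.solvePnP(MODEL_POINTS, image_points,
                                  camera_matrix, dist_coeffs)
    # Project a 3D point 1000 mm in front of the nose tip onto the image; the
    # line from the nose to this point indicates where the head is facing.
    nose_dir, _ = cv2.projectPoints(np.array([(0.0, 0.0, 1000.0)]),
                                    rvec, tvec, camera_matrix, dist_coeffs)
    return rvec, tvec, tuple(nose_dir[0][0].astype(int))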

F. Active Window Detection and Tab Switching (v6)

Cheating from the internet on the same device is one of the most frequent unfair means among e-learners. This component is proposed to detect whether the examinee switches the active window or browser tab. For this we used JavaScript, as we wanted the system to work without downloading any additional software. Using JS we achieve this (Figure 6) and store a count of the practice along with a timer. Doing this generates a notification for the examinee, and on repeated attempts the exam is auto-submitted or the examinee is disqualified.

G. Audio (v7)

Audio is one of the main sources to monitor during an online examination, which is why we added it to our AI-based proctoring system. We take the audio from the microphone and convert it to text using Google Speech Recognition. Then, using NLTK, we remove the stop-words from the text file. The question paper is processed in the same way, and the common words are found and reported.
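A minimal sketch of this audio pipeline is shown below, assuming the SpeechRecognition package as the client for Google Speech Recognition and NLTK's English stop-word list; the 10-second recording window and the simple whitespace tokenization are our own simplifications.

import speech_recognition as sr
import nltk
from nltk.corpus import stopwords

nltk.download("stopwords", quiet=True)
STOP_WORDS = set(stopwords.words("english"))

def capture_spoken_text(seconds=10):
    """Record from the microphone and transcribe with Google Speech Recognition."""
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        audio = recognizer.record(source, duration=seconds)
    try:
        return recognizer.recognize_google(audio)
    except sr.UnknownValueError:         # nothing intelligible was said
        return ""

def content_words(text):
    """Lower-case alphabetic words with the stop-words removed."""
    return {w for w in text.lower().split() if w.isalpha() and w not in STOP_WORDS}

def common_with_question_paper(spoken_text, question_paper_text):
    """Words the examinee spoke that also appear in the question paper."""
    return content_words(spoken_text) & content_words(question_paper_text)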

IV. UNFAIR MEANS DETECTION

After the output of all basic components (v1, v2, v3, v4, v5, v6, v7), where all the vectors have the same sampling rate (one element per frame), we try at this stage to further classify the output as unfair means or fair means. The vectors v1 and v6 are used directly to provide an unfair-means decision, while the other five vectors are used to extract features, which are fed to a classifier that makes continuous decisions on unfair means. The v7 output is used to check whether the examinee speaks the same words that appear in the question paper; depending on that, the system can flag unfair means.

Figure 6. Unfair Means Detection.

Feature Extraction

Because unfair means occur over a period of time, features must be defined over a temporal window. For the purpose of feature extraction, we specify this window as a set time in seconds and create a number of segments. If we manually label each gathered frame as fair or unfair at each second, we can convert this labelling into a label for each segment: the majority of per-second labels within a segment decides the binary outcome of that segment.
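A sketch of the windowing described above: per-frame component outputs are cut into fixed-length segments, the mean and standard deviation are computed per component within each segment, and each segment's label is the majority vote of its per-second labels. The frame rate and window length used below (8 FPS, 5 seconds) are illustrative assumptions.

import numpy as np

def window_features(component_outputs, fps=8, window_seconds=5):
    """Mean and standard deviation of each component within each temporal window.

    component_outputs: dict mapping component name -> per-frame 1-D sequence
    (one element per frame, all with the same sampling rate).
    Returns an array of shape (num_windows, 2 * num_components).
    """
    win = fps * window_seconds
    names = sorted(component_outputs)
    n_frames = min(len(component_outputs[n]) for n in names)
    n_windows = n_frames // win
    feats = []
    for k in range(n_windows):
        row = []
        for n in names:
            seg = np.asarray(component_outputs[n][k * win:(k + 1) * win], float)
            row.extend([seg.mean(), seg.std()])
        feats.append(row)
    return np.array(feats)

def window_labels(per_second_labels, window_seconds=5):
    """Binary label of each window by majority vote over its per-second labels."""
    labels = np.asarray(per_second_labels, int)
    n_windows = len(labels) // window_seconds
    return np.array([
        int(labels[k * window_seconds:(k + 1) * window_seconds].mean() >= 0.5)
        for k in range(n_windows)
    ])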

V. CONCLUSION

The purpose of this study is to describe an AI-based system for online exam proctoring that helps maintain academic integrity. The system is simple and convenient to use from the perspective of the test taker, as it only requires an inexpensive webcam and a microphone. Using the captured video and audio, we extract low-level information from seven essential components: user verification, eye tracking, speech recognition, and so on. These data are then analyzed over a temporal window to obtain decision-making features, which are then used to detect cheating.

REFERENCES

[1] "A Design of Continuous User Verification for Online Exam Proctoring on M-Learning", Asep Hadian S. G., Yoanes Bandung, 2019 International Conference on Electrical Engineering and Informatics (ICEEI), 9-10 July 2019.
[2] "Fraud detection in video recordings of exams using Convolutional Neural Networks", Aiman Kuin, University of Amsterdam, June 20, 2018.
[3] "An Image Matching and Object Recognition System using Webcam Robot", Sanjana Yadav, Archana Singh, 2016 Fourth International Conference on Parallel, Distributed and Grid Computing (PDGC), 22-24 Dec. 2016.
[4] "An Intelligent System For Online Exam Monitoring", Swathi Prathish, Athi Narayanan S, Kamal Bijlani, 2016 International Conference on Information Science (ICIS), 12-13 Aug. 2016.
[5] "e-Invigilator: A biometric-based supervision system for e-Assessments", N. L. Clarke, P. Dowland, S. M. Furnell, International Conference on Information Society (i-Society 2013), 24-26 June 2013.
[6] "One Millisecond Face Alignment with an Ensemble of Regression Trees", Vahid Kazemi, Josephine Sullivan, 2014 IEEE Conference on Computer Vision and Pattern Recognition, 23-28 June 2014.
[7] "Online Exam Proctoring System", A. T. Awaghade, D. A. Bombe, T. R. Deshmukh, K. D. Takawane, International Journal of Advance Engineering and Research Development (IJAERD) "E.T.C.W", January 2017, e-ISSN: 2348-4470, print-ISSN: 2348-6406.
[8] "YOLOv3 Real-Time Object Detection", Joseph Redmon, Ali Farhadi, published via Cornell University, Apr 8, 2018.
[9] "Facial landmarks with dlib, OpenCV, and Python", Adrian Rosebrock, Apr 3, 2017.
