
ECEN 403: Electrical Design Laboratory I

Personalized Quantification of Facial Normality

using Artificial Intelligence

Team Members:

Layan Al-Huneidi

Salma Aboelmagd

Khalid Al-Emadi

Sara Mohamed

Mentor:

Dr. Erchin Serpedin

“An Aggie does not lie, cheat, or steal, or tolerate those who do.”

Abstract

A facial reconstructive surgeon faces the monumental undertaking of deciding what is to be considered a normal face. A machine learning program was recently developed to answer this question. It uses a supercomputer to run a Python algorithm on an image of an abnormal face and compares it against a database to reconstruct an image of that face as a normal one. However, such a program is inconvenient, since it requires access to a supercomputer as well as a lengthy run time to generate the normalized version of a given facial image. The premise of our project is to further improve this artificial intelligence-driven technology and adapt it into a software application that can run on a smartphone or tablet in a fraction of the time. The program would be trained on enough data sets to quantify the degree of abnormality on a defined scale. This project would greatly benefit reconstructive surgeons around the world, as it can be used as an objective instrument for surgical planning. In addition, patients can track the progress of their reconstructive surgeries in an objective manner.

Table of Contents

Abstract
Introduction
Problem Statement and Objective
Computing Power
Interpreting Facial Images
Methodology
Proposed Solution
What are the Stages that the Project will Undertake?
Breakdown of the Proposed Project into Tasks
Estimated Budget and Justification
Timeline
Conclusion
References

Introduction

Facial appearance has a great effect on social communication between individuals. Facial expressions, greatly influenced by facial structure, serve as a sort of visual aid for social interaction. A frown helps convey negative emotions, while a smile can be used to show ease and satisfaction. As a result, facial abnormalities present from birth, also known as congenital facial disfigurements, tend to obscure certain facial expressions, which degrades the quality of discourse [1]. It is also worth noting the great psychological and social impact that a facial disfigurement can have on an individual. Because of these effects, individuals seek to reconstruct or mend such irregularities through surgical procedures. From a surgeon's point of view, the difficulty lies in the attempt to produce a non-biased, objectively 'correct' form of the facial structure.

Currently available solutions to correct congenital facial abnormalities are based on ethnocentric standards of beauty and on what 'normal' really means in a specific part of the world and in a specific time period. Eurocentric beauty standards, for example, are based on smaller features like a pointed nose and a softer lip shape. On the other hand, African beauty ideals revolve around more prominent features. Such subjective procedures tend to raise questions of morality on the surgeon's end: is this what normal should actually look like? Other solutions include collecting a large database of 2D and 3D facial photographs and judging the photographs to obtain a 'normality' score [2]. The problem with this solution is that the corrected facial structure should also complement the other facial features of the individual, which are influenced by gender, ethnicity, and other factors.

A recently developed program 'scores' facial deformity using a scan of the face. The program is based on a deep learning algorithm that is trained on a large set of images and is used to produce a corrected form of the disfigured face [3]. The catch is that the program consumes a great deal of power and time to produce the corrected image. Since the program runs on a very powerful supercomputer, the power consumed is extensive and inefficient for constant use by a surgeon. In addition, the time it takes for the corrected image to load makes it impractical to rely on.

For that reason, new methods were implemented that can alternatively assess the indices of normality using different tools. The goal of this project is to compare two proposed alternative design approaches and assess their implementation complexity. In addition, it is necessary to decide on the best modality accordingly and deploy it in an application that doctors and surgeons can remotely access and control in real time through their cell phones.

The final prototype should be able to scan human faces more easily and create a larger variety of facial features, maximizing the plausibility and realism of the generated faces.

The development of this new technology into a clinically accessible, portable platform

will usher in new opportunities for multicenter collaboration and objective comparison of

outcomes between conditions, techniques, operators, and institutions [4].

In the upcoming sections of this project proposal, we will discuss the problem statement by identifying the root cause for conducting the project, list the objectives, explain the methodology we will be utilizing, lay out our proposed project timeline using a Gantt chart, and finally conclude our project proposal.

Problem Statement and Objective

Using machine learning and artificial intelligence techniques, the software application must be able to generate an image of a face with no deformities from an input of facial features with deformities. This code will be used by doctors and surgeons to generate a normalized image to use as a reference during reconstructive surgery. Because of that, the code must run quickly and smoothly on readily available devices such as smartphones and iPads. The goal here is to optimize the code so that it can run on smartphones.

Computing Power

One of the problems that we face is that running the code heavily depends on the available computing power. With the most powerful laptop on the market at the moment, the code may take several hours to run. Even on a supercomputer, the code does not run fast enough to be used at the speed required by surgeons and doctors. The goal here is to come up with an application that can run on smartphones. The best smartphones on the market do not have the computing power required to run the code, and we also have to take into consideration that not every doctor or surgeon owns the best smartphone on the market. Therefore, the code has to be optimized to run on phones at least three generations older than the flagship models. This alone comes with its own challenges.

Interpreting Facial Images

Computers do not have the same level of perception as humans do. When the code was run, there were times when it focused more on the "normal" parts of the face rather than just the "imperfections". The problem here is that the code cannot differentiate between the facial features of different races. It is also heavily influenced by the difference in age between the sampled "normal" face and the face with the deformity [3].

Methodology

Proposed Solution

The code was implemented in Python 3.6 using sklearn, scipy, keras, and tensorflow [3]. The goal here is to find a programming language that is not as computing-power dependent as Python. Different implementations of the code will be run through languages like C, C++, and Java. It is expected that C++ will be the most suitable language for our project, as it is not as computationally intensive as Python and is not as bare as C. Since the code takes at least 20 minutes to run on a supercomputer, we had to order one of the best laptops on the market so that we can test and develop the code at a faster rate. The laptop we will be using is a Razer Blade 15 2021, sporting an Intel Core i9-11900H 8-core CPU, an NVIDIA RTX 3080 16GB graphics card, and 32GB of DDR4 RAM; it is the top-of-the-line model with flagship components throughout. Also, for a quicker and smoother workflow, we decided to order an external SSD to back up large files, since uploading files to cloud storage and downloading them would take an unreasonable amount of time. We decided on the 1TB SanDisk portable SSD. The laptop and the external SSD will ensure an efficient workflow for the project. After optimizing the code to run fast enough on the laptop, we will begin optimizing it to run on smartphones. Smartphone OS providers have developer kits for software engineers to create apps on their

platforms. We will be using those developer kits to translate the code from the language we found to be fastest into the language the smartphone OS runs.

Since the machine is not as intuitive and perceptive as humans, it tends to make mistakes based on race and/or age. A solution to this is to run "normal" images of people of different races through the code. The code will be running on a system that rates images based on how "normal" the faces are, on a scale of 1 to 7, with 1 being a normal face and 7 being an abnormal face. That way, the machine can learn to differentiate between an abnormality and the natural differences in facial features across ages and races. We can also run a survey in which different images are displayed to people, who rate those images on the same scale the code uses. Once the results are collected, they will be fed into the code. The data collected from the survey will help the code gain a certain degree of the intuition humans have, because the images are rated from multiple human perspectives, which in turn builds a foundation for how images should be rated by the code [3].
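As an illustration of how such survey results could be folded into training labels, the sketch below averages each image's human ratings on the 1-to-7 scale into a single target value. The file names and ratings are hypothetical placeholders, not data from the actual study.

```python
# Hedged sketch: average per-image survey ratings (1 = normal, 7 = abnormal)
# into one consolidated training label per image, as the survey-based
# approach suggests. File names and ratings are illustrative placeholders.
from statistics import mean

survey_ratings = {
    "face_001.png": [1, 2, 1, 1],   # raters mostly judged this face normal
    "face_002.png": [6, 7, 5, 6],   # raters mostly judged this face abnormal
}

# one label per image, rounded to two decimals
labels = {img: round(mean(r), 2) for img, r in survey_ratings.items()}
print(labels)
```

Averaging over multiple raters is one simple way to smooth out individual bias before the scores are used as training targets.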

What are the Stages that the Project will Undertake?

The facial recognition and scoring software is based on training a deep neural network by feeding it a large database of images. These images will be automatically scored on a range of 1 to 7, where 1 represents a normal face and 7 represents the extremely abnormal case. The deep learning algorithm will then be tested and validated on new images that the network has not previously been exposed to, which will in turn help determine and compute the error rate. In addition, further validation and error computation will be undertaken by surveying a selected group of people and asking them to rate the normality score of a set of facial images.
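A minimal stand-in for this train-then-validate loop is sketched below. It replaces the deep network with a 1-nearest-neighbour regressor on synthetic feature vectors (an assumption for illustration only) and reports the error on held-out samples as mean absolute error.

```python
# Hedged sketch of the validation stage: fit on scored training samples,
# predict 1-7 normality scores for unseen samples, and compute the error
# rate as mean absolute error (MAE). The real system trains a deep neural
# network on images [3]; the toy features and 1-nearest-neighbour
# predictor here are stand-ins for illustration only.
import numpy as np

rng = np.random.default_rng(0)
# toy 8-dimensional "image features": each score s in 1..7 forms a cluster
train_X = np.vstack([rng.normal(s, 0.1, size=(20, 8)) for s in range(1, 8)])
train_y = np.repeat(np.arange(1, 8), 20).astype(float)
test_X = np.vstack([rng.normal(s, 0.1, size=(5, 8)) for s in range(1, 8)])
test_y = np.repeat(np.arange(1, 8), 5).astype(float)

def predict(x):
    # score of the closest training example (1-NN regression)
    return train_y[np.argmin(np.linalg.norm(train_X - x, axis=1))]

preds = np.array([predict(x) for x in test_X])
mae = float(np.abs(preds - test_y).mean())
print(f"held-out MAE: {mae:.3f}")
```

The key point the sketch illustrates is that the error rate must be computed on samples the model has never seen, exactly as the proposal requires for the deep network.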

Breakdown of the Proposed Project into Tasks

To attain the outcome of this project, there is a sequence of main tasks that needs to be performed.

The first major task is to conduct an extensive literature review on the measurement of facial differences using machine learning and on the proposed design approaches. We will need to learn about GANs and how they operate. GANs can help doctors, patients, and patients' family members get an objectively better idea of the outcome of cosmetic surgery. Using GANs can also give parents numerical estimates of the margin of error and of the similarity regarding the changes in the patient's appearance [4].

After acquiring the necessary literature to thoroughly comprehend the specifics of our assignment, we will start by comparing the implementations of the different models proposed. After evaluating each of the models, we will select the most scalable, real-time-efficient model to be implemented in an application. To do so, certain tasks need to be executed first. Before we carry out the first task, the task of classification, we will need to study what a 'normal face' actually means as it pertains to any given individual face [3]. Later on, we will be able to determine or classify whether a face is normal or abnormal.

Sorting through large numbers of faces requires the recognition and active retention of vast amounts of perceptual information, something better suited to a machine system than to the human mind. For that reason, we will need a comprehensive understanding of machine learning technologies and how they operate. In addition, we will need to get familiar with certain concepts and machine learning techniques to measure the difference between pre- and post-operative images. Moreover, to help address the stated aims on whether a classification or a regression approach should be undertaken, we will have to perform many computer simulations.

Moving on to the third task, we will need to create a database of images collected from various modalities to make it easier to distinguish a normal face from an abnormal one. Since the distinctive facial features of any given patient are not likely to be reflected accurately by a calculated norm derived from the broader population, we will address this issue by finely segmenting massive databases into increasingly narrow demographic classifications, although such segmentation is confounded by the historical admixture of human populations and a globalized world with increasingly mixed lineage [3].

The next task is a bit challenging, as we will have to train our algorithm to recognize abnormalities, then assemble a large image database and label it finely based on distinguishing facial characteristics so that the algorithm can detect the degree of deformity.

The final step is the testing and validation of our algorithm. This will be done by collecting a large database different from the one used for training, and we will assess how successful the code is at analyzing the images. During the testing phase, we will compute the error rate and keep improving the algorithm if errors arise. During testing, the reconstruction error should be small for normal facial images whose feature representations are similar to the training dataset, while the score should be progressively higher for images displaying more dissimilar facial features [3].
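The reconstruction-error idea can be sketched in miniature as follows. Here PCA via SVD stands in for the paper's deep network [3], and all data is synthetic: samples near the subspace learned from "normal" faces reconstruct with small error, while dissimilar samples score progressively higher.

```python
# Hedged sketch of reconstruction-error scoring: fit a low-rank model on
# "normal" samples only; inputs near the learned subspace reconstruct
# with small error, while dissimilar inputs score higher. PCA via SVD is
# a stand-in for the deep network [3]; the data is synthetic.
import numpy as np

rng = np.random.default_rng(1)
basis = rng.normal(size=(2, 16))                  # "normal" faces lie on a 2-D subspace
normal_faces = rng.normal(size=(100, 2)) @ basis  # noise-free training set
_, _, Vt = np.linalg.svd(normal_faces, full_matrices=False)
components = Vt[:2]                               # learned 2-D subspace

def recon_error(x):
    # norm of the residual after projecting x onto the learned subspace
    return float(np.linalg.norm(x - (x @ components.T) @ components))

typical = recon_error(rng.normal(size=2) @ basis)  # in-distribution sample
atypical = recon_error(rng.normal(size=16) * 3.0)  # dissimilar sample
print(f"typical: {typical:.4f}  atypical: {atypical:.4f}")
```

The in-distribution sample reconstructs almost perfectly, while the dissimilar one leaves a large residual, mirroring how the proposed score should separate normal faces from deformed ones.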

Given an image of a child, we will first want to classify whether the face is normal or abnormal according to certain characteristics. Moreover, if it is abnormal, the application needs to give it a score/rating, and this rating should be close to the rating provided by a qualified human, such as the scoring offered by a surgeon. It will also have to provide an objective score that reflects the inputted image as precisely as possible.

Estimated Budget and Justification

Equipment                    Price
Razer Blade 2021 Laptop      $3,399.99
SanDisk 1TB Portable SSD     $147.65

To be able to run the code, we will need the most powerful laptop available on the market that is within our budget. The Razer Blade laptop has the best graphics card available in laptops on the market. It also has the flagship CPU provided by Intel, the i9-11900H. The more cores a processor has, the faster the code can run. It also boasts 32GB of RAM, which can be upgraded to 64GB if needed. The fact that the code requires a supercomputer to run shows the need for an extremely powerful laptop.

The portable SSD is important for backing up any files saved on the laptop, in case the laptop malfunctions. Uploading and downloading large files to a cloud would take plenty of time, and a successful upload is not guaranteed. Having an external SSD will ensure swift transfers and backups of large files from the laptop.

The total amount spent on the equipment is $3,547.64.

Timeline

Conclusion

In essence, we plan on utilizing the already available supercomputer program used to automatically score facial images, and modifying this program to create a more practical mobile application, in order to provide plastic surgeons with an easy and fast way to acquire and generate corrected facial images from photographs of faces with congenital anomalies.

References

[1] Q. Xu, Y. Yang, Q. Tan, and L. Zhang, "Facial Expressions in Context: Electrophysiological Correlates of the Emotional Congruency of Facial Expressions and Background Scenes," Frontiers in Psychology, vol. 8, 2017.

[2] D. G. M. Mosmuller, J. P. W. D. Griot, C. L. Bijnen, and F. B. Niessen, "Scoring systems of cleft-related facial deformities: A review of literature," The Cleft Palate-Craniofacial Journal, vol. 50, no. 3, pp. 286–296, 2013.

[3] O. Boyaci, E. Serpedin, and M. A. Stotland, "Personalized quantification of facial normality: A machine learning approach," Scientific Reports, vol. 10, no. 1, Dec. 2020.

[4] O. Atout, Facial Generation and Anomaly Detection Software, 2019. [Online]. Available: https://omaratout.wixsite.com/facialgenerationapp. [Accessed: 12-Sep-2021].
