


Face Detection System Based on MLP Neural Network
Nidal F. Shilbayeh and Gaith A. Al-Qudah

Abstract—Face detection is the problem of determining whether a given image contains human faces. In this paper, we propose a face detector using an efficient architecture based on a Multi-Layer Perceptron (MLP) neural network and a Maximal Rejection Classifier (MRC). The proposed approach significantly improves the efficiency and the accuracy of detection in comparison with traditional neural-network techniques. In order to reduce the total computation cost, we place a pre-stage before the neural network that is able to reject the majority of non-face patterns in the image backgrounds, thereby significantly improving the overall detection efficiency while maintaining the detection accuracy. An important advantage of the new architecture is that it has a homogeneous structure, so that it is suitable for very efficient implementation using programmable devices. Comparisons with other state-of-the-art face detection systems are presented. Our proposed approach achieves one of the best detection accuracies with significantly reduced training and detection cost.

Index Terms— Face Detection, Neural Network, MLP Neural Network, Training, Learning, MRC.



• Nidal F. Shilbayeh is with the Middle East University, Amman, Jordan.
• Gaith A. Al-Qudah is with the Princess Sumaya University for Technology, Amman, Jordan.

1 INTRODUCTION

Nowadays, there is an increasing trend of using biometric information, which refers to the human biological features used for user authentication, such as fingerprint, iris, and face. Face detection is becoming one of the most active research areas in computer vision and a very important research topic in biometrics, due to its wide range of possible applications, such as security access control, video surveillance, and image database management. It is also the first step in any face recognition system.

Face detection is the problem of determining whether there are human faces in an image. Given an image, face detection involves localizing all faces, if any, in this image. It is the first step in any automated system that solves problems such as face recognition, face tracking, and facial expression recognition.

Variations in images increase the complexity of the decision boundary between the face and non-face classes, which makes the classification problem harder. These variations can be summarized as follows [3, 9]:
• Pose: the image of a face varies with the relative face pose (frontal, 45 degree, profile, upside down), and some facial features such as an eye or the nose may become partially or wholly occluded.
• Illumination: the same face, with the same facial expression and seen from the same viewpoint, appears different under changes in lighting.
• Facial expressions: the appearance of a face is directly affected by the person's facial expression.
• Presence or absence of structural components: facial features such as beards, mustaches, and glasses may or may not be present, and there is a great deal of variability among these components, including shape, color, and size.
• Occlusion: faces may be partially occluded by other objects. In an image with a group of people, some faces may partially occlude other faces.
• Image orientation: face images vary directly with different rotations about the axis.
• Image conditions: when the image is formed, factors such as lighting and camera characteristics affect the appearance of a face.

Reviewing the literature, it is quite difficult to state that there exists a complete system which solves the face detection problem with all variations included. Neural networks have shown very good results for detecting a certain pattern in a given image [1, 2, 11, 13, 14, 15, 17]. The problem with neural networks is that the computational complexity is very high, because the networks have to process many small local windows in the images [2, 5]. Some authors tried to speed up the detection process of neural networks [12, 14, 15, 18]. They proposed a multilayer perceptron (MLP) algorithm for fast object/face detection. They claimed that applying cross-correlation in the frequency domain between the input image and the neural weights is much faster than using conventional neural networks. They stated this without any conditions and introduced formulas for the number of computation steps needed by conventional neural networks and by their proposed fast neural networks.
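To give a feeling for why processing many small local windows is costly, the following minimal sketch counts the sub-windows a sliding-window detector must classify at a single scale. The function name, the stride parameter, and the 320x240 image size are illustrative assumptions, not values taken from the system described here.

```python
def count_windows(width, height, window=20, stride=1):
    """Number of window x window sub-windows scanned at a single scale.

    `stride` and the default window size are illustrative assumptions.
    """
    if width < window or height < window:
        return 0
    per_row = (width - window) // stride + 1
    per_col = (height - window) // stride + 1
    return per_row * per_col


# Hypothetical 320x240 image scanned with a 20x20 window, one pixel at a time.
print(count_windows(320, 240))
```

Scanning at several scales multiplies this count further, which is why an inexpensive rejection stage placed before the neural network can matter.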

The main objective of this work is to design and implement a face detection system that locates upright frontal faces in digital color images with a high detection rate. The system is also able to detect faces of various sizes in images with complex backgrounds. We propose a new method for improving the detection rate through the use of a cascade classifier. The key idea behind the cascade is a tree structure of classifiers. In a detection task on a large image, the large majority of the sub-windows observed by scanning the image with the MRC classifier will be rejected, and only a few small regions of the image might be targets. Therefore, the generality of the first stage must be sufficiently high to reduce the number of false-positive sub-windows passed on to the next stage, the MLP neural network. The aim of cascading is to provide inference with the lowest possible number of false positives and the highest detection rate.

2 DESCRIPTION OF THE SYSTEM

The system is implemented using MATLAB 7.0 and Visual Studio .NET, with a window size of 20*20 [10]. The proposed face detection system is based on the MRC technique [8] cascaded with the MLP neural network technique [16] to achieve a better detection rate. The reason for cascading the MRC with the MLP neural network is that the MRC is a fast, easy-to-implement, and very accurate classifier. The neural network used in this system is a multi-layer perceptron (MLP) neural network.

Fig.1 shows our cascaded system. The proposed system consists of 3 stages:
1. The preprocessing stage: receives as input a 20 * 20 pixel region of the image and generates an output ranging from 1 to -1, signifying the presence or absence of a face, respectively.
2. MRC stage: receives the window produced by stage 1 after preprocessing and rejects the non-faces.
3. MLP neural network stage: receives the candidate face window from stage 2 and passes it to the MLP neural network, which decides whether the window contains a face or not.

Fig.1 Cascading MRC and NN

2.1 Cascading MRC and NN
Before the input image is fed to the system, as shown in fig.2, our face detection system performs the following stages:
• Image enhancement
• Preprocessing
• Facial feature extraction

Fig.2 Cascading MRC and NN Diagram

In the image enhancement stage, an input image is passed to the system for classification. Images vary in format, size, and resolution. Histogram equalization is performed to enhance the contrast of the image; it compensates for changes in illumination between different images. Histogram equalization spreads the pixels of an image evenly among the grey-level intensities, thereby increasing the contrast of the image.

In the preprocessing stage, the images of both classes, "faces" and "non-faces", are first resized, for example to 20x20; the size is chosen to give the best detection in the following stages. Then, using a standard histogram equalization algorithm, each image of both classes is histogram equalized in order to correct brightness and contrast and to equalize the different intensity levels of the image. After histogram equalization, the images of both classes are converted from their color levels (i.e. RGB) to gray level.

In the facial feature extraction stage, small 20x20 images are extracted from the reduced-size image and fed to the maximal rejection classifier (MRC); the windows that the MRC retains as targets are then passed to the neural network. In face detection, the majority of the proposed methods use image gray-level values to detect faces, in spite of the fact that most images today are color. To address this, our approach deals with face detection in color images.

After the preprocessing stage, the image is passed to the MRC, which is used to explore face candidate locations and to filter out as many non-face patterns as possible before passing the hard patterns to the final-stage classifier. This means that false rejection must be avoided at any cost. Because of the properties of the MRC, false rejection can be avoided as long as the MRC is given enough training data in the learning stage. In other words, the training set must be ideally and identically distributed in the sample space.

The last-stage classifier is the MLP neural network classifier, which has been used extensively in classification and regression. The candidate face window produced by the MRC classifier is passed to the MLP classifier, which decides whether the window contains a face. The employed multilayer feed-forward neural network consists of neurons with a sigmoid activation function. It is used in two modes. In classification mode, a pattern is presented at the input layer and propagated forward through the network to compute the activation value of each output neuron. The second mode is the training or learning mode. Learning in an ANN involves adjusting the weights in order to achieve the desired processing for a set of learning face samples. More specifically, the network is fed a number of training pairs, and the network's parameters are then adjusted through a supervised training algorithm.

2.2 MRC Face Detection
The Maximal Rejection Classifier (MRC) is a linear classifier that overcomes the two drawbacks. While maintaining the simplicity of a linear classifier, it can also deal with non-linearly separable cases. The only requirement is that the Clutter class and the Target class are disjoint. MRC is an iterative rejection-based classification algorithm. The main idea is to apply a linear projection followed by a threshold (in each iteration). However, as opposed to these two methods, the projection vector and the corresponding thresholds are chosen such that, at each iteration, MRC attempts to maximize the number of rejected Clutter samples. This means that after the first classification iteration, many of the Clutter samples are already classified as such and discarded from further consideration. The process is continued with the remaining Clutter samples, again searching for a linear projection vector and thresholds that maximize the rejection of Clutter points from the remaining set. This process is repeated iteratively until convergence to zero or to a small number of Clutter points. The remaining samples at the final stage are considered as targets.

Now that we have segmented every image into several segments and approximated every segment with a small number of representative pixels, we can exhaustively search for the best combination of segments that will reject the largest number of non-face images. We repeat this process until the improvement in rejection is negligible. This means that we must find a projection vector (theta) and two decision levels such that the number of rejected non-faces is maximized while faces are still found, as shown in fig.3.

Fig.3 Maximal Rejection Classifier "1"

After that, we take only the remaining non-faces (shown in dark green) and find a second theta: that is, we must find another theta and two decision levels such that the number of rejected non-faces is maximized while faces are still found, as shown in fig.4.

Fig.4 Maximal Rejection Classifier "2"

The first stage in the MRC is to gather two example sets, Faces and Non-Faces. Large enough sets are needed in order to guarantee good generalization for the faces and the non-faces that may be encountered in images.

The target class should be assumed to be convex in order for the MRC to perform well. The target class contains frontal faces. One can easily imagine two faces that are not perfectly aligned and therefore, when averaged, create a new block with possibly four eyes, or a nose and its echo, etc. What helps us is the fact that we are using a low-resolution representation of the faces (20 * 20 pixels). This implies that even for such misaligned faces, the convex average appears as a face, and thus our assumption regarding the convexity of the face class is valid. The Non-Face set is required to be much larger, in order to represent the variability of Non-Face patterns in images.

2.2.1 Rejection on MRC
The rejection in MRC is as follows:
• Build a combination of classifiers: we must find theta and two decision levels such that the number of rejected non-faces is maximized while faces are still found.
• Apply the weak classifiers sequentially while rejecting non-faces: we must find another theta and two decision levels such that the number of rejected non-faces is maximized while faces are still found, as shown in fig.5.

Fig.5 Maximal Rejection Classifier All Stages.

2.3 The MLP neural network classifier
Fig.6 shows the MLP neural network implemented in our face detection system. It is composed of 3 layers: one input, one hidden, and one output. The input layer consists of 400 neurons (20*20), which receive pixel binary data from a 20x20 symbol pixel matrix. The size of this matrix was decided taking into consideration the average height and width of character image that can be mapped without introducing any significant pixel noise.
The hidden layer consists of 300 neurons, a number decided by trial and error on the basis of optimal results. The output layer is 1 neuron. To initialize the weights, a random function was used to assign an initial random number lying between two preset integers, named the bias. The weight bias is selected by trial-and-error observation to correspond to average weights for quick convergence.

Fig.6 The MLP Network

After the image data enters the input nodes, it is weighted by the network weights. Whether the data is face or non-face is determined by comparing the output with a threshold. For example, if the output is larger than the threshold, the window is considered to be a face.

A problem that arises with window scanning techniques is overlapping detections. The system deals with this problem through two heuristics:
• Thresholding: the number of detections in a small region surrounding the current location is counted, and if it is above a certain threshold, a face is present at this location.
• Overlap elimination: when a region is classified as a face according to thresholding, then overlapping detections are likely to be false positives and thus are rejected.

Windows thought to contain a face are outlined with a bounding box, and on completion a copy of the image is displayed, indicating the locations of any faces detected. In the next section a more thorough description of the system is included detailing the operation of the detector.

3 EXPERIMENTAL RESULTS
The face detection system presented in this paper was developed, trained, and tested using MATLAB™ 7.0 and Visual Studio .NET 2003 on an Intel Pentium 4 at 3.20 GHz with 1.00 GB of RAM, running the Windows XP operating system.
We divide our evaluation into two parts: training and testing.
Training: Each stage in the cascade was trained using a positive set and a negative set. To train the face detection system to distinguish face from non-face images, we pass it a collection of face and non-face training data, with a target output of 1 for face and -1 for non-face. Initially, our face training set contains face images collected from the MIT face databases. Our face detection system has been applied to several test images (faces were frontal). All images were scaled to 20x20 pixels, and satisfactory results have been obtained. The test set consists of a total of 2000 face and non-face images given in the MIT database. Non-face patterns are generated at different locations and scales from images with various subjects, such as rocks, trees, buildings, and flowers, which contain no faces.

For the training images, the variations in lighting, shadow, and contrast need to be equalized. The linear fit function will approximate the overall brightness of each part of the window.
To train the cascade of classifiers of the rejection stage to detect frontal upright faces, the same face and non-face patterns are used for all stages. The neural net is trained for 191 epochs, as shown in fig.7.

Fig.7 Trained Epochs.
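The flow of a single window through the cascade of Sections 2.1 to 2.3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the MATLAB/.NET implementation used in the experiments: the window is assumed to be already resized to 20x20 and converted to gray level, each MRC stage is modelled as a projection vector with two decision levels, the MLP uses the 400-300-1 sigmoid layout, and all weights, thresholds, and function names are hypothetical placeholders (random or hand-picked values rather than trained ones).

```python
import math
import random


def equalize(gray, levels=256):
    """Histogram-equalize a window given as rows of integer gray levels."""
    flat = [p for row in gray for p in row]
    hist = [0] * levels
    for p in flat:
        hist[p] += 1
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(flat)
    if n == cdf_min:  # constant window: nothing to spread
        return [row[:] for row in gray]
    scale = (levels - 1) / (n - cdf_min)
    return [[int((cdf[p] - cdf_min) * scale + 0.5) for p in row] for row in gray]


def mrc_accepts(x, stages):
    """Apply rejection stages in order; each is (theta, low, high).

    A window is rejected as soon as its projection on theta falls
    outside the two decision levels [low, high].
    """
    for theta, low, high in stages:
        z = sum(t * v for t, v in zip(theta, x))
        if not (low <= z <= high):
            return False
    return True


def sigmoid(a):
    a = max(-60.0, min(60.0, a))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-a))


def mlp_output(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a one-hidden-layer MLP with sigmoid units."""
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


def is_face(gray_window, stages, net, threshold=0.5):
    """Preprocess, let the MRC reject cheaply, then ask the MLP."""
    eq = equalize(gray_window)
    x = [p / 255.0 for row in eq for p in row]
    if not mrc_accepts(x, stages):
        return False
    return mlp_output(x, *net) > threshold


# Hypothetical, untrained parameters purely for demonstration.
random.seed(0)
N_IN, N_HID = 400, 300
w_hidden = [[random.uniform(-0.05, 0.05) for _ in range(N_IN)]
            for _ in range(N_HID)]
w_out = [random.uniform(-0.05, 0.05) for _ in range(N_HID)]
net = (w_hidden, [0.0] * N_HID, w_out, 0.0)
stages = [([1.0 / N_IN] * N_IN, 0.2, 0.8)]  # one stage: mean-intensity band
window = [[random.randrange(256) for _ in range(20)] for _ in range(20)]
print(is_face(window, stages, net))
```

The ordering is the point of the design: the MRC test costs one dot product per stage, so most windows never reach the much more expensive 400-300-1 forward pass.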
System Testing: First, we test the first classifier, learned using the MRC algorithm, on three images; second, we test those images on all stages of our system. Finally, we compare our face detection system with others.

This experiment evaluates the classification performance of a single strong classifier learned using the MRC algorithm. The database consists of two images, as shown in fig.8. The first image contains 8 frontal faces and the second image contains 11 frontal faces. The system has been tested using the MRC classifier alone. Then, the system has been tested using the MRC-MLP cascaded face detection system. The images below have complex backgrounds and occluded faces, and they have different shapes and dif-

Fig.8 Test Images

Fig.9 and table 1 show that the MRC test can detect more than 90% of the faces in all the images and has an average time of 20 seconds. Fig.10 and table 2 show that the second test, using the cascaded MRC-MLP system, gave a 100% detection rate and an acceptable average time. As a result of this test, the proposed system increases the detection rate.

Fig.9 Output of the MRC alone

Fig.10 Output of the MRC-MLP cascaded system

Table 1 MRC Testing Results

Image #   Detection Int. %   Average Time   False Positive   False Negative
1         92                 17 sec.        0                2
2         100                34 sec.        0                0

Table 2 MRC-MLP Testing Results

Image #   Detection Int. %   Average Time   False Positive   False Negative
1         100                19 sec.        0                0
2         100                42 sec.        1                0

Table 3 shows the comparison of the proposed system with well-known classifiers used in the literature. This comparison shows that our system gives good results. The results show that the MRC-MLP classifier has a detection rate higher than the SVM and MLP classifiers and an error rate (false positives and false negatives) less than Adaboost

Table 3 Comparison with other systems

Classifier     Detection Rate (%)   Error Rate (%)
Multiple SVM   88.14                5.86
MLP            91.25                8.75
AdaBoost       99.78                0.22
MRC&MLP        91.6                 7.54

The results in table 3 were obtained by testing our system on the MIT database. The database test set consists of a total of 2000 face and non-face images. Non-face patterns are generated at different locations and scales from images with various subjects, such as rocks, trees, buildings, and flowers, which contain no faces. Fig.11 and fig.12 show examples from the face and non-face training sets, respectively.

Fig.11 Faces training set

Fig.12 Non-Face training set.

4 CONCLUSION

In this paper we presented a new classifier for face detection based on MRC and MLP. The proposed approach significantly improves the detection rate in comparison with standard MLP neural network systems. Our algorithm can detect between 79% and 92% of faces in images of varying size, background, and quality, with an acceptable number of false detections. A threshold between 0.5 and 0.6 gives the best range of results out of the threshold set tested. The new system was designed to detect upright frontal faces in color images with simple or complex backgrounds [15]. No a priori knowledge of the number of faces or the size of the faces is required to detect the faces in a given image. The system has acceptable results regarding the detection rate, false positives, and average time needed to detect a face.

REFERENCES

[1] Aamer M., Ying W., Jianmin J. and Stan I., Face Detection based Neural Networks using Robust Skin Color Segmentation, IEEE SSD 2008, 5th International Multi-Conference, Amman, 20-22 July 2008, pp. 1-5.
[2] Rowley, H.A., Baluja, S., and Kanade T., Neural network-based face detection, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.20, No.1, 1998, pp. 23-38.
[3] M-H Yang, D. Kriegman and N. Ahuja, Detecting Faces in Images: A Survey, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.24, No.1, 2002, pp. 34-58.
[5] Aouatif A., Sanaa G. and Mohammed R., Face Detection in Still Color Images Using Skin Color Information, ISBN: 978-1-4244-1751-3, IEEE, April 2008.
[6] P. Viola and M. J. Jones, Robust real-time face detection, International Journal of Computer Vision, Vol.57, No.2, 2004, pp. 137-154.
[8] Elad, M., Hel-Or, Y. and Keshet, R., Pattern detection using a maximal rejection classifier, Pattern Recognition Letters, Vol.23, 2001, pp. 1459-1471.
[9] Christophe Garcia and Manolis Delakis, A Neural Architecture for Fast and Robust Face Detection, IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol.26, 2004.
[10] Catalin-Daniel, Corina Botoca, C# Solutions for a Face Detection and Recognition System, FACTA UNIVERSITATIS, Vol.20, No.1, 2007.
[11] Feraud, R., Bernier, O.J., Viallet, J.-E. and Collobert, M., A fast and accurate face detector based on neural networks, IEEE, Volume 23, Issue 1, Jan 2001, pp. 42-53.
[12] F. Zuo and P. H. N. de With, Fast face detection using a cascade of neural network ensembles, in Proceedings of the 7th International Conference on Advanced Concepts for Intelligent Vision, Vol.3708 of LNCS, September 2005, pp. 26-34.
[13] Kevin C., Xuelong L. and Neil M., The Use of Neural Networks in Real-time Face Detection, Journal of Computer Sciences, 2005, pp. 47-62.
[14] K.A. Ishak, S.A. Samad, A. Hussain and B.Y. Majlis, A Fast And Robust Face Detection Using Neural Networks, Vol.25, No.11, 2003, pp. 1767-1784.
[15] R. L. Hsu, M. Abdel-Mottaleb and A. K. Jain, Face detection in color images, IEEE Trans. on Pattern Analysis and Machine Intelligence, 2002, pp. 696-706.
[16] Mohammad Inayatullah, Shair Akbar Khan and Bashir Ahmad, A Face Detection System using Neural Network Approach, IC-AI 2006, pp. 753-758.
[17] Marian, Milos, A System For Localization Of Human Faces In Images Using Neural Networks, Journal of Electrical Engineering, Vol.56, No.7-8, 2005.
[18] Chiunhsiun Lin, Face detection in complicated backgrounds and different illumination conditions by using YCbCr color space and neural network, Pattern Recognition Letters, Vol.28, No.16, pp. 2190-2200.