A study of quantitative comparisons of photographs and video images based on

landmark derived feature vectors
Krista F. Kleinberg
a,
*, J. Paul Siebert
b
a
Forensic Medicine and Science, Joseph Black Building, University of Glasgow, Glasgow G12 8QQ, UK
b
Department of Computing Science, Sir Alwyn Williams Building, University of Glasgow, Glasgow G12 8QQ, UK
1. Introduction
As a result of the wide deployment of surveillance cameras,
there is both opportunity and motivation, given the amount of
visual material being collected digitally, to identify suspects from
CCTV. Although rapidly improving in terms of spatial resolution,
the majority of video surveillance equipment does not produce
images of sufficient quality needed to provide identifications when
other more conclusive evidence, such as DNA or fingerprints, is not
available. It is in these kinds of cases that anthropometry may have
the potential to provide a useful identification technique.
Surveillance video can be important supportive evidence because
it may show a crime being committed, although, it is not always
easy to recognise, and therefore convict, a criminal caught on CCTV.
Video surveillance can be more reliable than eyewitness testimony
because the story told is always consistent and also corroborates
what the eyewitness reported [1]. However, a more comprehen-
sive analysis is necessary because even when facial video images
are of sufficient quality, it is possible that two people may look
similar to each other in this medium.
The roles of anthropometry and forensic science have inter-
twined beginning with Bertillon in the 1800s [2,3] and anthro-
pometry was one of the identification methods used in [4–6].
Although more sophisticated vision based methods of image
comparison are being developed [7,8], it remains to be seen what
can be achieved by utilizing ratios between key facial landmarks on
single 2D images. Even if reliable automatic methods for face
image comparison can be developed, the need for manual
intervention in terms of landmark placement are likely to be
required where low-quality images have to be analysed, such as
generated by many currently installed CCTV systems. In contrast to
comparing two images, anthropometric proportions from the face
and body of live suspects were compared against 2D images and
was one of the identification methods resulting in convictions in
two out of three cases in Halberstein’s 2001 paper [9]. One of the
fundamental problems with comparing 2D images is facial pose.
Forensic Science International 219 (2012) 248–258
A R T I C L E I N F O
Article history:
Received 6 July 2011
Received in revised form 22 November 2011
Accepted 4 January 2012
Available online 24 January 2012
Keywords:
Facial identification
Anthropometry
Image comparison
Face database
A B S T R A C T
An abundunce of surveillance cameras highlights the necessity of identifying individuals recorded.
Images captured are often unintelligible and are unable to provide irrefutable identifications by sight,
and therefore a more systematic method for identification is required to address this problem. An
existing database of video and photograhic images was examined, which had previously been used in a
psychological research project; material consisted of 80 video (Sample 1) and 119 photograhic (Sample
2) images, though taken with different cameras. A set of 38 anthropometric landmarks were placed by
hand capturing 59 ratios of inter-landmark distances to conduct within sample and between sample
comparisons using normalised correlation calculations; mean absolute value between ratios, Euclidean
distance and Cosine u distance between ratios. The statistics of the two samples were examined to
determine which calculation best ascertained if there were any detectable correlation differences
between faces that fall under the same conditions. A comparison of each face in Sample 1 was then
compared against the database of faces in Sample 2. We present pilot results showing that the Cosine u
distance equation using Z-normalised values achieved the largest separation between True Positive and
True Negative faces. Having applied the Cosine u distance equation we were then able to determine that
if a match value returned is greater than 0.7, it is likely that the best match will be a True Positive
allowing a decrease of database images to be verified by a human. However, a much larger sample of
images requires to be tested to verify these outcomes.
ß 2012 Elsevier Ireland Ltd. All rights reserved.
* Corresponding author. Present address: PEACH Unit, University of Glasgow,
Queen Mother’s Hospital, 8th Floor – Tower Block, Dalnair Street, Glasgow G3 8SJ,
UK. Tel.: +44 141 201 1988; fax: +44 141 201 6943.
E-mail addresses: Krista.Kleinberg@glasgow.ac.uk, kristakleinberg@yahoo.com
(K.F. Kleinberg), Paul.Siebert@glasgow.ac.uk (J.P. Siebert).
Contents lists available at SciVerse ScienceDirect
Forensic Science International
j ou r nal h o mepage: w ww. el sevi er . co m/ l oc at e/ f o r sc i i nt
0379-0738/$ – see front matter ß 2012 Elsevier Ireland Ltd. All rights reserved.
doi:10.1016/j.forsciint.2012.01.014
Attempts to rectify this in facial recognition pose invariant
systems described in [10,11] reported greater recognition rates
than when used without the pose transformations. Using soft
biometric traits was shown to be beneficial in improving
recognition accuracy when combined with a commercial based
face matching program [12].
Three questions should be asked of a comparison method; is it
possible to carry out the comparison objectively, is it possible to
avoid manual input, and is it applicable to checking large
databases? An identification made based on 2D images will be
more decisive if there is a way to quantify the comparison, rather
than if the identification is based solely on a subjective analysis, as
the result is a comparison that is objective with minimal bias. Once
quantification of a comparison is achieved, the process should be
automated. An automated process would decrease the error from
involving many different operators in the comparison process and
would allow large databases to be checked quickly. As a face search
could potentially be extended to full populations by reviewing
internationalised databases, i.e. Interpol [13], the need for
automation is high. According to The Ministry of Justice Statistics
(UK) bulletin, the reoffending rate for criminals in England and
Wales in 2006 was 146.1 offences per 100 offenders [14]. Although
this is a decrease of 22.9% from 2000, the numbers indicate there is
justification for a database of convicted criminal images that could
be quickly automated and checked.
We document an investigation into the comparison of
anthropometric ratios of facial landmark pairs manually located
on 2D images. The constraints in this study are that we consider
best-case scenario situations as a bench mark given that scenarios
in the field, by definition, cannot be as benign. The subject matter is
based on the analysis of comparing high quality full-face frontal
video and photographic images of individuals of a similar ethnic
background with neutral expressions.
This investigation expanded previous research carried out by
Kleinberg, Vanezis and Burton [15] and was conducted to test the
hypothesis: ‘‘Using a comparison of anthropometric facial ratios, it
is possible to discriminate between individuals of two samples.’’
The objective of this study was to derive measurements between
specific landmarks on the face in both print and video media and
incorporate them into a feature vector to use in statistical analysis
to determine if identifications of an individual can be made based
on these measurements. Knowledge of the type of information
gathered in this study may help in future to rank potential suspects
for human identification verification. However, in order to
establish that two faces were the same and use this identification
method to identify positively rather than eliminate suspects, it
would be necessary to show that the probability of a false match in
the rest of the population at random was of an acceptably low
probability [16].
To investigate the hypothesis in this study, we seek to address
the following questions:
Of the proposed images, can similar faces be separated from
dissimilar faces within a single sample using vector compar-
isons?
How distinguishable are individual faces in the samples? Is it
possible to distinguish true positive faces from true negative
faces using vector comparisons where the statistics from two
samples are known?
Using a small sample of re-landmarked images, how signifi-
cant is the error contribution in re-landmarked images and
what is the operator induced measurement spread under ideal
conditions?
Given a specific example and set of comparisons with the
database, what constitutes a manageable subsample, worthy of
further manual verification?
2. Materials
A total of 199 images of Caucasian male police volunteers were available which
had been used previously in research conducted by Bruce et al. [17]. The 199 images
comprised 80 different video still faces (Sample 1) and 119 different photographic
faces (Sample 2). According to Bruce et al. [17], ‘‘The image quality on the videos
was high-equivalent to what would be produced by a good amateur photographer
trying to reveal a good likeness of someone on a home videotape.’’ The photographic
images in Sample 2 included the same 80 faces depicted in the video cohort, and an
additional 39 new faces not included as video stills. The photographs were of
policemen, both retired and presently working and except for photographs, which
have already been published elsewhere are, for this reason, unable to be exhibited
in this paper. However, an example of each type of image is provided in Fig. 1. Both
sets of images, taken on the same day, were displayed from the frontal viewpoint,
showing features from the neck up, in what appeared to be the format of police
identification photographs. In this study the identity of the subjects in the video
images was known and could be cross referenced with the corresponding
photographic images. This means that identifications made on the basis of facial
anthropometry could be designated as true or false. One positive feature of these
video images was that because they were recorded on the same day as the
photographs, the study images did not have any of the possible facial changes which
can occur due to time factors such as weight loss/gain, increase in age or presence of
facial hair.
3. Methodology
Given a set of landmarks there is a need to be able to quantify the landmarks
numerically such that they can be used to compare faces. Ideally, the measure
should be invariant to in-plane translations and rotations and be tolerant to a
degree of out-of-plane rotation in order to accommodate the variability inherent
when posing a subject for full frontal image capture. Thirty-eight landmarks (Table
1), ten unilateral and 14 bilateral, were chosen for inclusion in the anthropometric
study and are shown in Fig. 2. Careful consideration was given to the selection of
landmarks that were used is this study. Anthropometric research by Farkas [18],
Purkait [19], Fieller [20], Evison [21], and facial recognition research by Craw et al.
[22] and Okada et al. [23] were consulted when choosing the landmarks that were
included in the present study. When choosing a landmark it was important that it
was one that could be placed consistently. It had to be a point where an operator
performing the comparison would be able to locate it in the same place within an
acceptable error. According to Fieller [20], the criteria used to determine a
successful/reliable landmark are: observer knowledge, consistency of landmark
Fig. 1. High resolution video image (a) and selection of ten database photographs (b).
K.F. Kleinberg, J.P. Siebert / Forensic Science International 219 (2012) 248–258 249
placement, discriminatory power, and landmark visible in majority of cases.
Excluded landmarks were eliminated on the basis of their inability to be located on
photographs.
Although the number of possible linear measurements increases combinatorially
with the number of landmarks, not all are reliable or pertinent to the research
undertaken for this study. A total of 73 linear measurements (21 unilateral, 26
bilateral) were chosen for this study. The majority of these were chosen by
consulting the literature [18,22,24]. Two of these measurements used in a previous
study [25], ex-n and ex-sto, were chosen because they utilise landmarks that were
considered to be less affected by facial expression than others and also because they
would be visible even if the subject was wearing a hat. Three bilateral
measurements were unique to the present study.
From these landmarks and linear measurements, a total of 59 ratios (also
unilateral and bilateral) were selected for comparison of images (Table 2). The
linear measurements that make up the ratios are shown in Fig. 3. A ratio was
derived by dividing the smaller linear measurement (numerator) by the larger
linear measurement (denominator). The ratios were chosen to achieve a balance of
the horizontal and vertical regions of the face. Intuitively, it is expected that longer
lines between landmarks located on different sections of the face would make a
more reliable proportion than two short lines in the same section of the face. This is
because small variations in landmark placement making up short lines would result
in large changes in proportions, which may not accurately portray true variations
between individuals. The ratios utilised in this research were deliberately chosen to
include linear measurements between landmarks in different sections of the face
and others that covered a small section of the face, such as the length vs. the width
of the eye. As it is more common to use absolute measurements in anthropometric
comparisons [18,19,26,27] rather than ratios, there was less guidance with respect
to which ratios would be more reliable or more relevant than others in the present
study. Halberstein used a combination of up to twelve face and body ratios when
comparing a photograph to a live subject, and three of these ratios were used [9].
These ratios were ear length/facial height (sa-sba/n-gn), nasal height/ear length (n-
sn/sa-sba) and nasal width/nasal height (al-al/n-sn). The remainder of the ratios
that were used by Halberstein were not incorporated into this research because
they either included facial landmarks that were not chosen for the present study or
Table 1
Landmarks and their definitions used in this study [18,24].
1. Glabella (g): the most prominent midline point between the eyebrows.
2. Nasion (n): the point in the midline of both the nasal root and the
nasofrontal suture. This point is always above the line that connects the
two inner canthi. A canthus is the angle at either end of the fissure
between the eyelids.
3. Exocanthion (ex): the point at the outer commissure of the eye fissure. A
commissure is the site of union of corresponding parts and a fissure is any
cleft or groove, in this case of the eye [bilateral].
4. Endocanthion (en): the point at the inner commissure of the eye fissure
[bilateral].
5. Palpebrale superius (ps): highest point in the midportion of the free
margin of each upper eyelid. The free margin portion of the eyelid is the
unattached edge [bilateral].
6. Palpebrale inferius (pi): the lowest point in the midportion of the free
margin of each lower eyelid [bilateral].
7. Orbitale (or): the lowest point on the margin of the orbit. The orbit is the
bony cavity that contains the eyeball [bilateral].
8. Superaurle (sa): the highest point of the free margin of the auricle. The
auricle is the portion of the external ear that is not contained within the
head [bilateral].
9. Subaurale (sba): the lowest point on the free margin of the ear lobe
[bilateral].
10. Postaurale (pa): the most posterior point on the free margin of the ear
helix. The helix refers to the coiled structure of the ear. [bilateral].
11. Otobasion inferius (obi): the lowest point of attachment of the external
ear to the head [bilateral].
12. Alare (al): the most lateral point on each nostril contour [bilateral].
13. Subnasale (sn): the midpoint of the angle at the columella (fleshy, lower
margin) base where the lower border of the nasal septum and the surface
of the upper lip meet.
14. Pronasale (prn): the most protruded point of the nasal tip.
15. Subalare (sbal): the point on the lower margin of the base of the nasal
ala where the ala disappears into the upper lip skin [bilateral].
16. Stomion (sto): the imaginary point at the crossing of the vertical facial
midline and the horizontal labial (lip) fissure between gently closed lips,
with teeth shut in the natural position.
17. Crista philtri landmark (cph): the point on the elevated margin of the
philtrum just above the vermilion line. The philtrum is the vertical
groove in the median portion of the upper lip and vermilion refers to the
exposed
red portion of the upper or lower lip [bilateral].
18. Cheilion (ch): the point located at each labial commissure [bilateral].
19. Labiale inferius (li): the midpoint of the vermilion border of the lower lip.
20. Labiale superius (ls): the midpoint of the vermilion border of the upper
lip.
21. Gonion (go): the most lateral point at the angle of the mandible. The
mandible is the bone of the lower jaw [bilateral].
22. Sublabiale (sl): determines the lower border of the lower lip or the upper
border of the chin.
23. Pogonion (pg): the most anterior midpoint of the chin.
24. Gnathion (gn): the lowest point in the midline on the lower border of the
chin.
Fig. 2. Facial landmarks and their location.
Table 2
Ratios used in this study.
go-go/n-gn sn-sto/sto-sl sn-gn/n-sto li-sl/sn-ls
n-prn/g-pg sbal-sn/sn-prn [bilateral] gn-go/n-gn [bilateral] sl-gn/sto-gn
al-al/ex-ex ex-go/go-go [bilateral] al-al/n-sn n-sn/n-sto
sa-sba/n-gn [bilateral] n-gn/n-sto n-sn/sa-sba [bilateral] en-al/ex-ch [bilateral]
ex-ex/go-go obi-ch/g-sa [bilateral] ex-n/ex-sto [bilateral] sbal-ls/n-al [bilateral]
ex-n/n-sto [bilateral] pi-al/sa-ex [bilateral] ex-sto/n-sto [bilateral] ex-obi/ex-ch [bilateral]
en-ex/ps-pi [bilateral] ex-al/ch-gn [bilateral] en-en/ex-ex ch-ls/n-prn [bilateral]
pi-or/en-ex [bilateral] al-ls/ch-gn [bilateral] sa-sba/pa-obi [bilateral] ch-li/ex-ch [bilateral]
cph-cph/sn-ls ex-sto/rt ex-lt ch [bilateral] ls-sto/ch-ch sn-gn/ex-gn [bilateral]
sto-li/ch-ch
K.F. Kleinberg, J.P. Siebert / Forensic Science International 219 (2012) 248–258 250
because they were body ratios, such as shoulder width, leg or shoe lengths. Two
ratios (n-sn/n-sto, n-gn/n-sto) were used by Catterick for his research [28]. The
remainder of the ratios chosen were unique to this study. In order to continue with a
best case scenario situation, one volunteer, with previous experience in placing
landmarks on 2D images, placed the 38 landmarks on all 199 images using the
measurement programme produced in-house, Facial Identification Centre Version
0.32
ß
Forensic Medicine and Science Glasgow University.
The group of 59 ratios is treated as a 59 dimensional vector and this has been
evaluated as a means of comparing all faces. In this study, the feature vector is the
series of 59 ratios derived from chosen linear measurements between facial
landmarks. The alternative for comparing ratios between landmarks is to compare
the raw distances between the landmarks. Comparing raw distances can be
accomplished using the Procrustes [29] alignment techniques, and although
outside of the scope of the current project, may be used in future studies. The
advantage of using ratios is that they are both scale and rotation invariant and also
to a slight degree auto-corrective (in terms of errors added during landmarking). In
addition, ratios exhibit a degree of invariance to the effects of out-of-plane rotations
for small angles (when the effects of such rotations are sufficiently small to
approximate a 2D affine transformation on the imaging plane).
Three equations were used to test the comparison of a feature vector from one
sample against another; mean absolute difference, Euclidean distance and Cosine u
distance. The first two equations compare the length of the difference vector and
the third equation compares the angle between the vectors. The three equations are
as follows:
3.1. The mean absolute difference between ratio vectors
Eq. (1) determines the distance that separates one face from another by taking
the absolute value of one face ratio vector subtracted from the same ratio of a
second face. This is carried out for each ratio element in the feature vector. The
summation of this feature vector is then divided by the total number of elements
(59 ratios in this case). A difference of 0 between two faces establishes that those
two faces have identical facial ratio vectors. The smaller the difference in facial
ratios is indicative of a smaller difference between faces. A disadvantage of using
this equation is that the maximum difference between faces is not bounded:
Mean abs diff ¼
X
n¼N
n¼1
F
1
ðnÞ À F
2
ðnÞ j j
N
(1)
3.2. The Euclidean distance between ratios
The Euclidean distance (Eq. (2)) also measures the distance between two multi-
dimensional vectors. This is the square root of the sum of the squares of the
elements, in this case ratios. A difference of 0 between two faces establishes that
those two faces have identical facial ratio vectors. The smaller the difference in
facial ratios is indicative of a smaller difference between faces. A disadvantage of
using this equation is that the maximum difference between faces is not bounded:
Euclidean distance ¼
ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
X
n¼N
n¼1
F
1
ðnÞ À F
2
ðnÞ ð Þ
2
v
u
u
t
(2)
3.3. The Cosine u distance
The Cosine u distance equation (Eq. (3)) is a similarity measurement and is used
to measure the angle between two vectors. A cosine difference of 1.0 between two
faces establishes that those two faces have identical vectors of facial ratios. An
advantage of using this equation is that the range of values is bound from À1.0 to
+1.0 and useful comparisons are ranged from zero to one. A difference of zero is
indicative of a face that shows no correlation whereas a result of 0.5 is achieved by
random chance. Any negative result shows the face comparison produces an inverse
correlation:
Cos u ¼
X
n¼N
n¼1
F
1
ðnÞ Ã F
2
ðnÞ
F
1
k k F
2
k k
(3)
A comparison between two faces was deemed a true positive match (TP) if the
match was a correct match between the video image and photograph of the same
subject. A true negative match (TN) was one that excludes the faces and which was a
correct exclusion because it involved a video image and a photograph of two
different subjects. A false positive match (FP) was one which was an incorrect match
between a video image and a photograph of two different subjects and a false
negative match (FN) was one which is excluded but which was an incorrect
exclusion because it involved the video image and photograph of the same subject.
To answer the questions laid forth in Section 1, the three equations were applied
in the following four scenarios to test the comparison of Sample 1 faces to Sample 2
faces; within sample comparisons, between sample comparisons, error in landmark
placement, and the potential sample of photographs subject to manual verification.
4. Results
4.1. Within sample comparisons
To test if similar faces were separable from dissimilar faces
within a single sample the equations were applied so that every face
in a single sample was compared to itself and everyother face within
this sample. Each sample contained only one image of each face and
for this reason all that could be determined was the true negativity of
this collection of different faces. Therefore, no estimate of the degree
to which two same faces (true positives) would match when
captured at different times could be made from this data. The same
tests were carried out on Sample 1 (video) and then separately on
Sample 2 (photographs). Testing all combinations of pairs of faces
within each sample was important because it compared faces
acquired under the same capture conditions, allowing the tests to
ascertain if it were possible to discriminate between different (true
negative) faces. Therefore, in this experiment the primary source of
variability between faces should be attributable to differences in the
measured facial landmark ratios, i.e. generated by genuine face
shape differences, whilst the statistics of the remaining sources of
variability remain constant; same media, same operator placing
landmarks and same facial pose.
The similarity or dissimilarity between the faces in a single
sample is cross-checked by comparing the distributions of the
similarity statistics of Sample 1 to those of Sample 2. A Sample 1 to
Sample 2 cross-comparison of the statistics produced by matching
faces within their own samples can be used to in future to predict
how discriminable faces are when making comparisons between
these two samples. If both samples exhibit similar statistics, this
would be indicative that it is possible to distinguish faces between
Fig. 3. Linear measurements that created the ratios utilised in this study.
K.F. Kleinberg, J.P. Siebert / Forensic Science International 219 (2012) 248–258 251
samples because any difference between faces would be a result of
the true difference in faces rather than a result of the different
media recording each of the two samples of images. Results are
summarised in Fig. 4a–c and are illustrated by superimposing the
normal distribution curves and similarity density histograms of
the two samples. In Table 3a the standard deviation scaled
difference between the Sample 1 and Sample 2 means indicates the
difference in statistics between media, the cosine distance by far
exhibiting the greatest difference.
In order to equalise the absolute ranges of the feature vector
values in each sample and address the observed difference in the
statistics of the comparisons of Sample 1 and Sample 2, the
equations were completed using the application of Z-normalised
ratio values, illustrated in Fig. 4a–c. Each element, F(n) in the
measurement vector F is expressed by a population of measure-
ment ratios within a sample, Z-normalisation potentially enhances
range of variation (and accentuates any differences) about the ratio
mean of this sub-population of ratios, allowing small differences in
the data to become more apparent. This was accomplished by
dividing the mean subtracted element by the sample standard
deviation for the particular ratio (Eq. (4)). Z-normalisation can be
applied to all three of the equations:
Z-normalized element ¼ F
Z
ðnÞ ¼
FðnÞ À m
FðnÞ
s
FðnÞ
(4)
Therefore, by applying Z-normalisation it becomes possible to
force the distributions into a standardised range. Taking the Z-
normalised cosine distance as an example, the result of this
process, driving the means and standard deviations together and
reducing difference between the variance-scaled means is
illustrated in Table 3b.
4.2. Between sample comparisons
Results from conducting the equations were illustrated using
distribution histograms, separating TP faces from TN faces, Fig. 5a–
c. These normal histogram distribution curves of TP faces and TN
faces were superimposed to determine if it was possible to
distinguish between faces in the two groups. The amount of
overlap shows the possibility of achieving either a FP or FN face
match, also known as the rate of misclassification. The smaller the
area, the smaller the chances of obtaining a FP or FN face match.
In the graphs, TP face matches are represented by the dotted
lines and the solid lines represent TN face matches. In order to
ensure equal numbers of faces in the two samples, the 39 faces in
Sample 2 that were not in Sample 1 were not included in this
Table 3b
Mean, standard deviation, and standard deviation scaled difference between the Sample 1 and Sample 2 means of Z-normalised: cosine distance for Sample 1 and Sample 2.
Comparison method Sample Sample mean m Sample standard deviation (SD)
m
Sample
SD
Sample
m
Sample 1
SD
Sample 1
À
m
Sample 2
SD
Sample 2

N, number of samples
Z-normalised Cos(u) 1 À0.0113 0.2460 À0.04594 0.01474 3160
Z-normalised Cos(u) 2 À0.0077 0.2468 À0.0312 7021
Table 3a
Mean, standard deviation, and standard deviation scaled difference between the Sample 1 and Sample 2 means of unnormalised: mean absolute distance (MAD), Euclidean
distance and cosine distance for Sample 1 and Sample 2.
Comparison method Sample Sample mean m Sample standard deviation (SD)
m
Sample
SD
Sample
m
Sample 1
SD
Sample 1
À
m
Sample 2
SD
Sample 2

N, number of samples
MAD 1 0.08354 0.02316 3.607 0.561 3160
MAD 2 0.08507 0.02041 4.168 7021
Euclidean distance 1 1.015 0.4121 2.463 1.042 3160
Euclidean distance 2 1.149 0.3278 3.505 7021
Cos(u) 1 0.9859 0.01314 75.03 74.955 3160
Cos(u) 2 0.9887 0.006592 149.985 7021
Fig. 4. (a–c) Summary of the conditions imposed and results achieved in the within
sample comparisons of faces. Histograms and superimposed mean and standard
deviation of unnormalised: mean absolute distance (a), Euclidean distance (b) and
cosine distance (c) within sample comparisons for Sample 1 (dotted lower line) and
Sample 2 (solid upper line).
K.F. Kleinberg, J.P. Siebert / Forensic Science International 219 (2012) 248–258 252
analysis and every face in Sample 1 was compared to every face in
Sample 2.
The mean absolute difference, the Euclidean distance and the
Cosine u distance equations (all distance measures Z-normalised)
were applied in the between sample comparisons. Superimposed
normal histogram distribution curves of TP and TN face matches
were used to illustrate the discrimination between the two groups.
In general, a slightly narrower distribution was seen for the TP
faces. This was most likely because the distribution contained only
TP matches and therefore the data should be centred on a smaller
range of values. The amount of overlap between the TP and TN face
matches correlated to the possibility of achieving either a FP or FN
face match.
Superimposing the normal curves to demonstrate the separa-
tion between TP and TN face matches, the Cosine u distance (Z-
normalised) equation produced the smallest amount of overlap
and of the three equations conducted was determined to be best
equation to test the discrimination between faces of two samples.
Examination of the superimposed curves showed approximately a
30% chance of the best match between compared faces corre-
sponding to a correct identification. The TP distribution is very
small and difficult to see on the graphs. However, it is still possible
to see the 0.7 threshold emerging for the Cosine u distance with
careful observation of Fig. 5c.
Table 4 illustrates that following Z-normalisation, the cosine
distance provides the greatest separation of TP comparisons from
the TN comparisons, based on the difference between the TP and
TN standard deviation scaled means, respectively. The conclusion
made from this investigation was that the cosine distance equation
was the best predictor of face discrimination tested thus far and
was the sole equation used to test the error in landmark placement
and to determine the sample of images from the database that
could be narrowed down for further verification by an operator.
In pattern matching based on the cosine distance between two
unit vectors, the returned measure can be interpreted as a match
probability. Whilst a cosine distance of 1 indicates a 100%
probability of the compared vectors being the same, and 0
indicates zero probability, a distance of 0.5 indicates the 50%
‘‘chance’’ level of correlation between compared vectors. The mean
of the TP distribution barely reaches this 50% level, although this of
course indicates that approximately half of the TP comparisons will
at least exceed a chance match value. A standard deviation of
Æ2.37 about the TP distribution mean of 0.48 indicates that over
17.5% of the TP matches will exceed a 70% chance of producing a best
closest match for the database tested.
4.3. Error in landmark placement
A small inter-operator study was carried out, to assess the
influence of landmark placement conducted by multiple operators.
It has been reported that landmark placement, tested on 3D images
in a clinical setting, reveals that average operator error can vary
widely [30]. Therefore the effect of landmark placement error is
important to test because although landmark placement on images
in the two samples used in this study was conducted by a single
operator, this would not likely occur in practice.
Facial landmarks were placed on a total of six video images,
chosen at random, six times each by five different operators. One
operator had previous experience in using the equipment and
Table 4
Summary of the conditions imposed and results achieved in the between sample TP and TN face comparisons. Mean, standard deviation, and standard deviation scaled
difference between TP and TN comparisons for Z-normalised vectors: mean absolute distance, Euclidean distance and cosine distance TP and TN data sets.
Comparison method
(all Z-normalised)
Sample Sample mean m Sample standard
deviation (SD)
m
Sample
SD
Sample
m
TP
SD
TP
À
m
TN
SD
TN

N, number
of samples
MAD TP 0.7771 0.2234 3.479 0.815 80
MAD TN 1.1053 0.2574 4.294 6320
Euclidean distance TP 7.594 2.258 3.3632 0.9660 80
Euclidean distance TN 10.572 2.442 4.3292 6320
Cos(u) TP 0.4822 0.2035 2.370 2.3950 80
Cos(u) TN À0.0061 0.2389 À0.02553 6320
Fig. 5. (a–c) Summary of the conditions imposed and results achieved in the
between sample TP and TN face comparisons. Results are illustrated by the
superimposed normal histogram curves showing the amount of overlap in TP
(dotted lower line) and TN faces (solid upper line). Mean absolute distance (a),
Euclidean distance (b) and cosine distance (c).
K.F. Kleinberg, J.P. Siebert / Forensic Science International 219 (2012) 248–258 253
knowledge of the landmarks; landmark locations were studied
using the definitions provided in [18] and [24]. The remaining
operators had no experience in using the equipment and no
previous knowledge of anthropometric landmarks. The inexperi-
enced operators were given a list of landmark definitions (Table 1)
adapted from the literature [18,24] as well as a single photocopy of
an enlarged male face (A4 sized), front facing, with previously
placed landmarks to use as a guide. The same equipment was used
by all operators and each operator conducted their landmark
placement of images in a single day. Using the Cosine u distance
equation, comparisons of re-landmarked images were analysed
first from the single experienced operator and second, from all
operators (Fig. 6a and b).
The Cosine u distance (Z-normalised) equation was used to
compare the re-landmarked images because, when applied in the
comparison of faces between samples, it was found to be the
equation in which the statistics of the TP and TN populations were
the most separated. Each face in the subset sample was compared
to every other face in the subset sample and resulting data was
illustrated as superimposed normal histogram curves of TP and TN
face matches. It was hypothesised that conducting an inter-
operator test, using high resolution research material, but
completed by inexperienced operators, would produce a greater
amount of variation than from an experienced operator and this
hypothesis was tested and found to hold.
The effect that inexperienced operators had on the separation
rate of TP and TN faces was compared to that of an experienced
operator and is summarised in Fig. 6a and b and Table 5. Compared
to that of the experienced operator, the effect of landmark
placement by inexperienced operators can clearly be seen in the
separation rates of TP and TN face matches: the mean for TP
comparisons collapses from 0.8 for the experienced operator to
0.44 for the mix of experienced and in-experienced operators.
The TP vs. TN standard deviation scaled means separation is more
than double for the experienced operator compared to that of the
mix of operators. A similar observation can be made by inspecting
the superimposed histograms for the TP and TN comparisons for
the experienced operator vs. the mix of experienced and
inexperienced operators. Although a bimodal result is generated
by the experienced operator for both TP and TN, a much greater
separation of the TP–TN distributions is observed. However, for all
operators a typical averaged picture emerges and the effect of the
‘‘expert’’ can be seen as a small additional bump at the top of the TP
distribution.
The most important point witnessed in Fig. 6a and b was to
observe the strong effect that the experienced operator had in
creating a larger separation of TP and TN face matches. As multiple
experienced operators were not tested, it cannot be stated that this
difference in separation rates between the experienced operator
and all operators was due to the experience of the operators or
instead, the effect that will naturally occur with multiple
operators. This could be tested by conducting a study using a
pool of experienced operators. A further study analysing the
distribution achieved from the re-landmarked images of each
operator after applying the Cosine u distance (Z-normalised)
equation could determine if any of the inexperienced operators
also achieved the same strong separation rate as the experienced
operator. An inexperienced operator producing a similar degree of
separation to the experienced operator would signify that the large
separation rate produced from all operators was caused by the
inclusion of multiple operators rather than their experience.
However, from the literature in [31], it can be predicted that the
spread from a single inexperienced operator would be larger than
an experienced operator.
4.4. Potential sample of photographs subject to manual verification
The amount of overlap between the distributions of TP and TN
faces illustrates an approximation of the misclassification rate (see
Fig. 5a–c). However, given the task of comparing a suspect’s image
to a large database of identity photographs, the ability to decrease
the number of possible face matches could potentially save
significant numbers of investigation hours. This smaller sample of
suspect photographs could then be more closely scrutinised by an
expert. For this analysis, each face in Sample 1 was compared to
Fig. 6. (a and b) Superimposed normal curve histograms illustrating TP (dotted
lower line) and TN (solid upper line) face comparisons of the Cosine u (Z-
normalised) distance equations in six re-landmarked images from Sample 1 using
one experienced operator (a) and multiple operators (b).
Table 5
Mean, standard deviation, and standard deviation scaled difference between TP and TN comparisons in landmark placement error study: mean absolute distance, Euclidean
distance and cosine distance for TP and TN data sets illustrating TP and TN face comparisons in six re-landmarked images from Sample 1 using one experienced operator and
multiple operators.
Comparison method: cosine distance (all Z-normalised) Sample Sample mean m Sample standard
deviation (SD)
m
Sample
SD
Sample
m
TP
SD
TP
À
m
TN
SD
TN

N, number
of samples
One experienced operator TP 0.8029 0.1489 5.392 5.8224 90
One experienced operator TN À0.1666 0.3871 À0.4304 540
Multiple operators: experienced and inexperienced TP 0.4409 0.2655 1.6606 2.01351 2610
Multiple operators: experienced and inexperienced TN À0.08657 0.2453 À0.3529 13,500
K.F. Kleinberg, J.P. Siebert / Forensic Science International 219 (2012) 248–258 254
each face in Sample 2 for a total of 80 comparisons. The Cosine u (Z-
normalised) distance equation was used and the resulting values
were placed in descending order, noting the rank of the TP. The best
match was defined as the match value that returned a Cosine u
value that was highest or closest to 1.0. This was used to determine
within a confidence range given a best match value how many
additional faces in the database would need to be verified before
the true positive match was found.
Best match values were placed in intervals of 0.1. The mean
rank of the TP, SD, and 2SD confidence interval for each match
interval was found and results shown in Table 6. In this instance
the confidence interval says that for within a given confidence
range, how many database images should be looked at in total.
Results in Table 6 indicate that given match values of 0.7 the best
match from the database is also likely to be the TP face. This result
is consistent with the observed degree of overlap between the TP
and TN distributions shown in Fig 5. A match threshold of 0.7 is not
an unreasonably high value to set, given that a distance of 0.5
indicates the 50% ‘‘chance’’ level of correlation between compared
vectors. A larger sample of images should be tested to determine if
results consistent with those presented here are produced.
5. Discussion
Using high resolution photographic research material, the
object of the study was to assess if a facial anthropometric feature
vector could be utilised to distinguish between individuals of a
similar age group, ancestry and sex. Given a database of subjects,
knowledge of the type of information gathered in this study may
help in future to narrow down the number of possible suspects in
an investigation. The technique presented here entailed analysing
vector comparisons to differentiate between images of two
samples. The feature vector was utilised in three types of equations
testing the differences between faces in the samples. Normal-
isation was applied to the ratio values as a way to equalise the
feature vector values in each sample and account (to some degree)
for the statistics that different camera parameters would produce.
Z-normalisation enhances any differences between means and
makes the interpretation of the data more straightforward
allowing small differences in the data can to be more simply
seen. We found that the face matching technology investigated in
this study can assist in a database search; however, it does not
provide an unequivocal means of confirming facial identifications
suitable to use in court. Therefore, the focus of future work should
concentrate on the potential for this approach to extract facial
information improving the search of databases and leaving
humans as the ultimate authenticator.
The first step to answering the objectives laid forth in the
introduction was to evaluate each sample of images to determine if
once the equations were applied, any differences could be seen
between the two samples. Testing faces against those found in the
same sample is important because it allows the equations to
ascertain if there are any differences between faces which fall
under the same conditions. This means that other than the
possibility of slight changes in facial expression the facial ratios
will be the only changeable variable between faces as all other
variables remain constant; same media, same operator placing
landmarks and same facial pose.
Once samples were looked at individually, a between sample
comparison was conducted to determine how distinguishable the
faces were in the two samples. Once the respective equations were
conducted, superimposed normal histogram distribution curves of
true positive faces and true negative faces were used to illustrate
the discrimination of the two groups. In general, a narrower
distribution was seen for the true positive faces. This was because
as the distribution contained only true positive matches, the data
should be centred on a smaller range of values. The amount of
overlap correlated to the possibility of achieving either a false
positive or false negative face match.
Although other research used the squared Euclidean distance to
measure the likeness between pairs of faces [32], we found by
superimposing the normal curves to demonstrate the separation
between true positive and true negative faces, the Cosine u
distance (Z-normalised) equation produced the least amount of
overlap between true positive faces and true negative faces when
statistics of the two samples were known. The match values of true
negative faces in the superimposed histogram normal curves begin
to trail off at 0.7, indicating that although it is still possible to
achieve a true negative identification above this value, it is likely
that a returned match score of below 0.7 will result in a true
negative face after closer examination. Although this result
occurred in this study, it may not be replicated with a larger test
database. The investigations undertaken in this study to determine
if it is possible to discriminate between individuals of two samples
using a multi dimensional facial feature vector found that the
Cosine u distance was the best discriminator this but could further
be improved upon by administering a more comprehensive
statistical analysis.
A small inter-operator study was carried out, to assess the
influence of landmark placement conducted by multiple operators.
This is important to test because although landmark placement on
all images used in the comparative process of this study was
conducted by a single operator, this would not likely be the case in
the real world. Landmark placement has been tested by other
researchers on 3D images in a clinical setting and it was suggested
that average operator error varies widely [30]. Using a digital
sliding calliper to measure photographs, researchers carried out an
intra observer study to test reliability of measurements and results
showed a low reliability in measurements of ls-sto and n-sn [33].
The current analysis was conducted with one experienced operator
but the remaining operators were inexperienced It would be
beneficial to analyse this data further in an inter-operator study
using experienced operators located in different graphical regions
because this scenario would be more likely as a police procedure.
Experience was shown to be a benefiting factor when the inter-
operator variation in taking standard skeletal measurements was
Table 6
Interval showing two standard deviations of how many images in the database should be manually investigated.
Interval of best
match values
n (number of best
matches in the interval)
Mean of TP rank SD of TP rank Min of TP rank Max of TP rank Number of images to
manually investigate
in database (mean + 2SD)
0.90–0.99 0 N/A N/A N/A N/A N/A
0.80–0.89 2 1 0 1 1 1
0.70–0.79 7 1 0 1 1 1
0.60–0.69 23 3.6 7.6 1 37 19
0.50–0.59 30 8.9 13.5 1 65 36
0.40–0.49 16 16.7 28.5 1 110 74
0.30–0.39 2 2 1.4 1 3 5
K.F. Kleinberg, J.P. Siebert / Forensic Science International 219 (2012) 248–258 255
tested with a panel of experienced forensic anthropologists and
found to be minimal [31]. For evidence interpretation in the court
of law, any variation that occurs in the data as a result of multiple
operators placing landmarks should be small.
Results achieved by the experienced operator illustrated a
strong separation rate of true positive and true negative faces;
however, once the data from the inexperienced operators were
included, the distribution no longer depicted the strong separation.
Locating exact landmark location on a photograph is difficult, even
for experienced operators, and could account for the wide variation
in measurement in this study. A further study analysing the
distribution achieved from the re-landmarked images of each
operator after applying the Cosine u distance (Z-normalised)
equation could determine if any of the inexperienced operators
also achieved the same strong separation rate as the experienced
operator. An inexperienced operator producing a similar degree of
separation to the experienced operator would signify that the
small separation rate produced from all operators was caused by
the inclusion of multiple operators rather than their experience.
However, from the literature [31], it can be predicted that the
spread from a single inexperienced operator would be larger than
an experienced operator.
Graphing the TP rank vs. the best match value can aid to
determine an approximation of how many images from the
database would need to be manually examined to find the TP face,
assuming the TP face was in the database. Once the unknown
image has been tested against all database images and the best
match noted, the graph could be consulted to find where the TP
was ranked and this gives an estimate of how many images should
be given additional verification. Both the figures showing TP rank
against TP match values and best match values indicate that if a
match occurs at 0.7 and above, then it may likely be a TP match and
is worthy of further examination. Values below 0.7 still produce TP
matches; however, there are also poor TP ranks in the same area.
This is an indication that a larger set of images should be tested in
order to get a more accurate graph. The Z-normalised Cosine u
equation used to obtain the match values can also be referred to as
a statistical correlation or match probability and thus a match
value of 0.5 corresponds to chance. Therefore it is reasonable to set
0.7 as a ‘‘good match’’ threshold. This was confirmed by binning the
matches into intervals. Finding the confidence interval based upon
the mean and standard deviations of the rank of TP match scores
for a given interval agreed with match scores of 0.7 and above were
considered good scores for returning a TP face. The conclusion from
studying this small sample of images is that if a match value is
returned above 0.7, there is a good chance that the best match will
be a TP, however, a larger sample of images in both databases
should be tested to determine if consistent results are produced as
well as giving a more accurate approximation of images.
The rate of misclassification tested thus far takes into account
the TP match between Sample 1 and Sample 2 plus any matches
deemed closer. However, in cases where the TP face was the best
match, it does not inform the researcher how close the next best
match was to the TP match. A TP match will hold more weight as
evidence when the next best match proves to be of significant
distance away. Future work would determine the distance
between the TP match and the next best match by finding the
log likelihood ratio between matches. This is the log of the ratio of
the second best match to the best match. The full analysis would
include finding the log likelihood ratio between each match based
upon a descending order of matches that would show their relation
to each other. A poor log likelihood ratio could be indicative of the
necessity for more images to be tested. Any thoroughly tested and
validated method of identification can only improve the adminis-
tration of justice. The basic principle of matching evidence from an
unknown suspect to a database of known individuals, as the
methods in this study were designed, could be used for police
casework as investigations are likely to be conducted in the same
manner. However, the method tested in this study of utilizing
anthropometric ratios to compare faces is not yet at the stage
where it would benefit prosecutor’s cases in the judicial system. At
present, identifying individuals through a comparison of anthro-
pometric measurements is restricted but can possibly be of benefit
if used in conjunction with evidence found at the scene and may
act as a form of corroborating evidence that when combined with
other evidence would serve to strengthen the identification. Using
a substantially different methodology, research reported by Davis
et al. [32], broadly support the same conclusions as described in
our paper. Although the chosen landmarks in the two studies were
similar, there were differences in the way these landmarks were
utilised, for example, different distances between landmarks
measured, different proportions analysed and the inclusion of
angular measurements.
Several limitations were encountered over the course of this
study. First, the number of images for comparison did not provide
the ideal sample size in relation to the number of elements tested.
The method was tested on images of a similar physiognomy, as
occurs in practice, and these were the images available that suited
the criteria.
The landmarks, linear measurements, and to a lesser extent
ratios, were chosen for this study based on their use in previous
research. The number of landmarks chosen was not in question as
it was believed that enough landmarks were selected to gain an
acceptable representation of the face. The quantity of ratios was
selected for this same reason. However, even though the number of
ratios tested was considerably smaller than the possibilities based
on the number of landmarks, it is not known if all 59 ratios were
paramount to the analysis. Incorporating the 1,772,892 ratio
possibilities into a feature vector is something that could be tested
in further study by standard statistical methods such as a PCA and
factor analysis. It would then be necessary to determine which of
those would be beneficial and which would hinder the comparison
process. Depending on the pose of the image, it would be suitable
to know which ratios were valuable.
Another limitation of the study was that a more comprehensive
statistical analysis could have further improved the data analysis. A
common precursor to further statistical investigation is a
multivariate technique called a Procrustes analysis. A Procrustes
analysis would match landmarks or shapes from two sets of data
removing the variation of translation, rotation and scaling in the
data so that the data becomes a single frame of reference.
Statistical tests to compare samples such as Students t-test [34] or
Hotelling’s, which is the multivariate equivalent of the t-test [35],
could be applied to the data investigated in this study; however,
the results achieved thus far can be interpreted and determined
from looking at the data directly. Additional analysis may focus on
creating receiver operating characteristics (ROC) graphs [36]. This
technique separates classifiers based on their performance; in this
case true positive and false positive face matches. This is helpful to
visualise the overall efficacy of the comparison method because it
allows the operator to appraise how many false positives matches
are likely to occur with true positives matches.
This study tested the outcome of comparing two sets of 2D
images in a best-case scenario. Video images were of a high
resolution compared to that of typical grainy surveillance video
often found and all faces were positioned to the front with neutral
expressions. Best-case scenario images were utilised to set a
benchmark; if identifications could not be made on these images
then there was less hope for identification based on images of a
lower quality. Two-dimensional images were used because they
are more accessible compared to 3D images and can be directly
obtained from surveillance video without the liability of
K.F. Kleinberg, J.P. Siebert / Forensic Science International 219 (2012) 248–258 256
potentially changing data through manipulation into a 3D image.
Although not utilised in this study, the creation of 3D images has
its place and is important to the field of facial recognition to
address the fundamental problem of rotation variation amongst
images. A great deal of effort by researchers has been spent on
creating a 3D facial image [7,8,37,38], but 3D images are not
without their problems as well. Discouraging results were seen
when comparing a 2D image with that of a 3D laser scan image
by comparing the locations of the X and Y coordinates of a
maximum of seven landmarks [39]. The goal of this was to
manipulate the head positioning to mirror that of the unknown
individual [39]. The set of comparisons was small and combined
with the fact that only seven landmarks were used could have
influenced the poor outcome. In reality, it may not be feasible to
take a 3D image scan of a possible suspect as the subject will
probably not agree, but using photographs taken of the suspect
at several different angles to create a 3D image of the suspect
would be possible [7,8].
Another problem effect which may affect the comparison
between individuals from 2D images, apart from matching head
positions, is distortion. The effect of distortion on photographs may
affect the ability to compare images taken at different times with
different camera parameters. The focal length of the camera lens
and the subject distance from the camera are factors that
contribute to distortion, however, lens distortion can be corrected
by calibration. Farkas found that the greatest effect from distortion
was shortening of the upper third of the face and that one reason
the nasion and stomion landmarks proved to be accurate was
because they were on the same focusing plane [18].
In a photogrammetric analysis, the worst-case scenario is when
nothing is known about the cameras that captured the images. In
this type of circumstance there is a strong danger of generating any
comparison to ‘fit’, whereas compelling comparisons can be made
from calibrated images. The camera parameters were unknown for
both sets of images utilised in the present study. In practice it is
likely that camera parameters will be known for at least one set of
images. In a photogrammetric analysis, Lee et al. found it necessary
to be aware of the distortion parameter of a camera lens in addition
to obtaining a sufficient number of calibration points in order to
effectively measure the height of a person standing in a fixed
location [40]. Camera parameters of video images obtained from
surveillance cameras can be acquired using basic photogrammetry
skills. In an ideal comparison, nothing would deviate between the
two cameras parameters, or at most only the focal length would be
varied.
A significant problem with distinguishing between two 2D
images is that the face is a complex 3D structure and any
comparison protocol should therefore be designed with that in
mind. Future research should concentrate on creating a 3D
reconstruction of the suspect from their police identity photograph
and this 3D image could then be matched to the facial position of
the individual in the video image. Creating a 3D image should
demonstrate a unique fit factor and superimposing a 2D image
(individual on video) onto a 3D image (suspect photograph) will
clearly exhibit the distribution of the fitting error vs. the pose angle
and individuality of the person. It is possible to extract 3D
landmarks from video by matching two or more frames and
applying photogrammetry [41], however, facial expression can be
a confounding factor.
The general conclusion derived from the investigations
undertaken in this study was that these tests do not offer a
significant and infallible method of discriminating between
individuals of two samples. At best they may offer corroborating
evidence and could be used to narrow down a list of suspects as
long as other evidence was available. However, it should be noted
as digital camera technology improves, the potential for capturing
high quality imagery is becoming more relevant and favours this
approach. The cases tested in this research were done as a best case
scenario; facial poses in both samples were faced frontal, images
from both samples were taken on the same day, landmarks on
images from both samples were placed by the same operator and
the resolution of video was high. These ‘best case’ samples would
most likely not be available to forensic scientists but the
advantages for further testing on worst case scenarios are non-
existent if the discriminatory power is not sufficient with the best
cases.
Acknowledgements
We would like to thank A. Mike Burton in the School of
Psychology at the University of Aberdeen for providing the police
photographs used in this research.
References
[1] D.L. Lewis, Surveillance video in law enforcement, J. Forensic Ident. 54 (2004)
547–559.
[2] S.A. Cole (Ed.), Suspect Identities: A History of Fingerprinting and Criminal
Identification, Harvard University Press, Cambridge, MA, 2002 , pp. 32–59,
140–146.
[3] A.A. Moenssens, Fingerprint Techniques, Chilton Book Company, Philadelphia,
1971.
[4] N.L. Rogers, et al., The belated autopsy and identification of an eighteenth century
naval hero-The saga of John Paul Jones, J. Forensic Sci. 49 (2004) 1036–1049.
[5] ‘‘R (on the application of Taj) v Chief Immigration Officer, Midlands Enforcement
Unit, Queen’s Bench Division (Administration Court), CO/1084/99, 29 January
2001,’’ ed, 2001.
[6] ‘‘Crim. L.R. 1999, SEP,’’ ed, 1999, pp. 750–751.
[7] J.P. Siebert, S.J. Marshall, Human body 3D imaging by speckle texture projection
photogrammetry, Sensor Rev. 20 (2000) 218–226.
[8] C.W. Urquhart, J.P. Siebert, Towards Real-time Dynamic Close Range Photogram-
metry, Presented at the SPIE Videometrics II, Boston, USA, 1993.
[9] R.A. Halberstein, The application of anthropometric indices in forensic photogra-
phy: three case studies, J. Forensic Sci. 46 (2001) 1438–1441.
[10] Q. Chen, W. Cham, 3D model based pose invariant face recognition froma single
frontal view, Electron. Lett. Comp. Vis. Image Anal. 6 (2007) 13–26.
[11] F.J. Huang, et al., Pose invariant face recognition, in: Fourth IEEE International
Conference on Automatic Face and Gesture Recognition Grenoble, France, 2000.
[12] U. Park, A.K. Jain, Face matching and retrieval using soft biometrics, Inf. Foren. Sec.
IEEE Trans. 5 (2010) 406–415.
[13] Databases. Available: 8 June http://www.interpol.int/Public/ICPO/FactSheets/
GI04.pdf.
[14] ‘‘Reoffending of adults: results from the 2006 cohort England and Wales,’’
Ministry of Justice Statistics bulletin, 4 September 2008.
[15] K.F. Kleinberg, et al., Failure of anthropometry as a facial identification technique
using high-quality photographs, J. Forensic Sci. 52 (2007) 779–783.
[16] K.V. Mardia, et al., On statistical problems with face identification from photo-
graphs, J. Appl. Stat. 23 (1996) 655–675.
[17] V. Bruce, et al., Verification of face identities from images captured on video, J.
Exp. Psychol. Appl. 5 (1999) 339–360.
[18] L.G. Farkas, Anthropometry of the Head and Face, 2nd ed., Raven Press, Ltd., New
York, 1994.
[19] R. Purkait, Anthropometric landmarks: how reliable are they? Anthropometric
landmarks, Med. Leg. Update 4 (2004) 133–140.
[20] N. Fieller, Statistical facial identification. 2006 [cited 2007, 12 September];
available from: http://nickfieller.staff.shef.ac.uk/seminars/faces04-10-06.pdf.
[21] M. Evison, R. Bruegge, The magna database: a database of threedimensional facial
images for research in human identification and recognition, Forensic Sci. Com-
mun. (April) (2008).
[22] I. Craw, et al., How should we represent faces for automatic recognition, IEEE
Trans. Pattern Anal. Machine Intelligence 21 (1999) 725–735.
[23] K. Okada, C.V.D. Malsburg, S. Akamatsu, A pose-invariant face recognition system
using linear pcmap model, in: Proceedings of IEICE Workshop of Human Infor-
mation Processing, Okinawa, 1999.
[24] J.C. Kolar, E.M. Salter, Craniofacial Anthropometry Practical Measurement of the
Head and Face for Clinical, Surgical and Research Use, Charles C Thomas,
Springfield, IL, 1997.
[25] K.F. Kleinberg, P. Vanezis, Variation in proportion indices and angles between
selected facial landmarks with rotation in the Frankfort plane, Med. Sci. Law 47
(2007) 107–116.
[26] G. Porter, G. Doran, An anatomical and photographic technique for forensic facial
identification, Forensic Sci. Int. 114 (2000) 97–105.
[27] M. Yoshino, et al., Computer-assisted facial image identification systemusing a 3-
D physiognomic range finder, Forensic Sci. Int. 109 (2000) 225–237.
[28] T. Catterick, Facial measurements as an aid to recognition, Forensic Sci. Int. 56
(1992) 23–27.
K.F. Kleinberg, J.P. Siebert / Forensic Science International 219 (2012) 248–258 257
[29] I.L. Dryden, K.V. Mardia, Statistical Shape Analysis, Wiley-Blackwell, West Sussex,
1998.
[30] A. Ayoub, et al., Validation of a vision-based, three-dimensional facial imaging
system, Cleft Palate: Cran. J. 40 (2003) 523–529.
[31] B.J Adams, J.E. Byrd, Interobserver variation of selected postcranial skeletal
measurements, J. Forensic Sci. 47 (2002) 1193–1202.
[32] J.P. Davis, T. Valentine, R.E. Davis, Computer assisted photo-anthropometric
analyses of full-face and profile facial images, Forensic Sci. Int. 200 (2010)
165–176.
[33] M. Roelofse, et al., Photo identification: facial metrical and morphological features
in South African males, Forensic Sci. Int. 177 (2008) 168–175.
[34] B. Murphy, R.D. Morrison, Introduction to Environmental Forensics, Academic
Press, 2007.
[35] D. Sheskin, Handbook of Parametric Nonparametric Statistical Procedures, Chap-
man Hall/CRC, 2007.
[36] T. Fawcett, An introduction to ROC analysis, Pattern Recogn. Lett. 27 (2006)
861–874.
[37] D. DeCarlo, et al., An anthropometric face model using variational techniques, in:
Proceedings of the 25th Annual Conference on Computer Graphics and Interactive
Techniques, 1998, pp. 67–74.
[38] C. Zhang, S.F. Cohen, 3-D face structure extraction and recognition from images
using 3-D morphing and distance mapping, IEEE Trans. Image Proc. 11 (2002)
1249–1259.
[39] M.I.M. Goos, et al., 2D/3D image (facial) comparison using camera matching,
Forensic Sci. Int. 163 (2006) 10–17.
[40] J. Lee, et al., Efficient height measurement method of surveillance camera image,
Forensic Sci. Int. 177 (2008) 17–23.
[41] H.C. Longuet-Higgins, A computer algorithmfor reconstructing a scene fromtwo
projections, Nature 293 (1981) 133–135.
K.F. Kleinberg, J.P. Siebert / Forensic Science International 219 (2012) 248–258 258