Fundamentals of Biometric System Design
by S. N. Yanushkevich

Chapter 3
Biometric Methods and Techniques: Part I – Statistics

• Equal error rate (EER)
• Computing type I errors (FRR)
• Computing type II errors (FAR)
• System performance evaluation
S.N. Yanushkevich, Fundamentals of Biometric System Design 2
Preface
The key methodology of measurement in biometric technology is engineering statistics.
The key methodology of biometric data processing is signal processing and pattern recognition.
The key performance metrics in biometrics are related to matching rates (false match rate, false non-match rate, and failure-to-enroll rate).
The crucial point of biometric system design is the measurement of biometric data. This is
mandatory knowledge for all biometric design teams. In the engineering environment, the
data is always a sample1 selected from some population. For example, the calculation
of reliability parameters of biometric data, such as the confidence interval and the
sample size, is a typical problem of experimental study, reliability design, and quality
control. Engineering statistics provides various tools for assessing the reliability of biometric data,
in particular:
◮ Techniques for estimation of mean, variance, and correlation,
◮ Techniques for computing confidence intervals,
◮ Techniques for hypothesis testing, and
◮ Techniques for computing type I and type II errors.
In a biometric system, decision making is based on statistical criteria. This is because
biometric data is characterized by high variability. Every time a user presents
biometric data, a unique template2 is generated. Depending on the type of biometric, even two immediately
successive samples of data from the same user generate entirely different templates.
Statistical techniques are used to recognize that these templates belong to the same
person. For example, a user may place the same finger on a biometric device several times,
and all generated templates will be different. To deal with this variety, statistical tools
must be used.
For processing the raw biometric data, various techniques of signal and image
processing, pattern recognition, and decision making are used, in particular:
◮ 2D discrete Fourier transform,
◮ Filtering in spatial and frequency domains using Fourier transform,
1 In a biometric system, a sample is a biometric measure submitted by the user and captured by the data
acquisition tool.
2 A template is a small file derived from the distinctive features of a user’s biometric data. A template
is used in a system to perform biometric matches. Biometric systems store and compare templates derived
from biometric data. Biometric data cannot be reconstructed from a biometric template.
◮ Classifiers, and
◮ Pattern recognition module design.
◮ Statistical methods,
◮ Methods of signal processing, and
◮ Methods of pattern recognition.
Step 7: Conduct an appropriate experiment to confirm that the proposed design solutions are both effective
and efficient with respect to given criteria.
Step 8: Draw conclusions or make recommendations based on design solutions.
The field of statistics deals with the collection, presentation, analysis, and use of
data to make decisions. Statistical techniques are used in all phases of biometric system
design, their comparison and testing, and improving existing designs. Statistical methods
are used to help us describe and understand variability. By variability, we mean that
any successive observation of a biometric system or biometric phenomenon does not
produce an identical result. Because the measurements exhibit variability, we say that
the measured parameter is a random variable. A convenient way to think of a random
variable, say X, which represents a measured quantity, is by using an appropriate model,
for example,
Random variable X = Constant µ + Noise ε.
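This model can be illustrated with a short simulation; a minimal Python sketch (the values of µ, σ, and the sample size are illustrative assumptions, not values from the text):

```python
import random
import statistics

# Model: each measurement is a constant mu plus zero-mean random noise.
MU = 50.0      # hypothetical true value of the measured quantity
SIGMA = 2.5    # hypothetical noise standard deviation
N = 1000       # number of simulated observations

random.seed(0)
# X = mu + epsilon, with epsilon ~ Normal(0, SIGMA)
samples = [MU + random.gauss(0.0, SIGMA) for _ in range(N)]

# The sample mean estimates mu; the spread is the variability that
# statistical techniques must account for.
x_bar = statistics.mean(samples)
s = statistics.stdev(samples)
print(f"sample mean = {x_bar:.2f}, sample std = {s:.2f}")
```

Successive runs with different seeds give different sample means: this is exactly the variability that makes the measured parameter a random variable.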
In the engineering environment, the data is almost always a sample that has been
selected from some population. In biometric system design, data is collected in three
ways:
◮ A retrospective study based on historical data; the engineer uses either all or one
sample of the historical process data from some period of time; for example, biometric
data from databases, data from previous experimental studies, etc.
◮ An observation study; the engineer observes the process during a period of routine
operation; for example, facial expressions, signatures, etc.
◮ A designed experiment; the engineer makes deliberate or purposeful changes in
controllable variables called factors of the system, observes the system output,
and makes a decision or an inference about which variables are responsible for the
changes that he/she observes in the system output; for example, feature extraction
from biometric data using an appropriate algorithm.
Statistical hypothesis
Many problems in biometric system design require that we decide whether to accept or
reject a statement about some parameters. The statement is called a hypothesis and the
decision-making procedure about the hypothesis is called hypothesis testing.
Statistical hypothesis
A statistical hypothesis is an assertion or conjecture concerning one or more populations.
The truth or falsity of a statistical hypothesis is never known with absolute certainty unless we
examine the entire population, which is impractical. Instead, we take a random sample from
the population of interest and use the data contained in this sample to provide evidence that
either supports the hypothesis or does not (leads to rejection of the hypothesis).
The decision procedure must be made with awareness of the probability of a wrong
conclusion. The rejection of a hypothesis implies that the sample evidence
refutes it. In other words: The rejection means that there is a small probability
of obtaining the sample information observed when, in fact, the hypothesis
is true.
The structure of hypothesis testing is formulated using the term null hypothesis. This
refers to any hypothesis we wish to test and is denoted by H0 . The rejection of H0 leads
to the acceptance of an alternative hypothesis, denoted by H1 .
Because in Example 2 the alternative hypothesis specifies values of µ that could
be either greater or less than 50, it is called a two-sided alternative hypothesis. In
some situations, we may wish to formulate a one-sided alternative hypothesis:

Null hypothesis H0 : µ = 50
One-sided alternative hypothesis H1 : µ < 50 or
One-sided alternative hypothesis H1 : µ > 50

In general, for a two-sided test:

Null hypothesis H0 : µ = a
Two-sided alternative hypothesis H1 : µ ≠ a
Suppose that a data sample of size n is tested, and that the sample mean x is observed.
The sample mean is an estimate of the true population mean µ = a. A value of the
sample mean x that falls close to the hypothesized value of µ is evidence that the
true mean µ is really a; that is, such evidence supports the null hypothesis H0 . On the
other hand, a sample mean x that is considerably different from a is evidence in support
of the alternative hypothesis H1 . Thus, the sample mean represents the test statistic.
Therefore, we reject H0 in favor of H1 if the test statistic falls in the critical region,
and fail to reject H0 otherwise. This decision procedure can lead to either of two
wrong conclusions:
Type I error or False reject rate (FRR): is defined as rejecting the null hypothesis
H0 when it is true. The type I error is also called the significance level of the test.
The probability of making a type I error is α = P(reject H0 when H0 is true).

Type II error or False accept rate (FAR): is defined as failing to reject the null hypothesis
when it is false. The probability of making a type II error is β = P(fail to reject H0 when H0 is false).
This is often hard to do, and what has evolved in much of biometric system design is
to use the value α = 0.05 in most situations, unless there is information available that
indicates that this is an inappropriate choice.
Type II error. The probability of type II error, β, is not a constant. It depends on
both the true value of the parameter and the sample size that we have selected. Because
the type II error probability β is a function of both, the sample size and extent to which
the null hypothesis H0 is false, it is customary to think of the decision not to reject H0
as a weak conclusion, unless we know that β is acceptably small. Therefore, rather
than saying we “accept H0 ” we prefer the terminology “fail to reject H0 ”.
Failing to reject H0 implies that we have not found sufficient evidence to reject H0 ,
that is, to make a strong statement. Failing to reject H0 does not necessarily mean
there is a high probability that H0 is true. It may simply mean that more data
are required to reach a strong conclusion. This can have important implications
for the formulation of hypotheses.
The power of a statistical test is the probability of rejecting the null hypothesis
H0 when the alternative hypothesis is true. The power is computed as 1 − β.
is equal to the sum of the areas that have been shaded in the tails of the normal distribution. We
may find this probability as

Probability of type I error, α = P(X < 48.5 when µ = 50) + P(X > 51.5 when µ = 50),

where the first term is the left-tail area (critical value x1 = 48.5) and the second is the right-tail area
(critical value x2 = 51.5). The z-values that correspond to the critical values 48.5 and 51.5 are calculated as follows:

z1 = (x1 − µ)/(σ/√n) = (48.5 − 50)/0.79 = −1.90 and z2 = (x2 − µ)/(σ/√n) = (51.5 − 50)/0.79 = 1.90

For the increased sample size (σ/√n = 0.625), the critical z-values become

z1 = (48.5 − 50)/0.625 = −2.40 and z2 = (51.5 − 50)/0.625 = 2.40
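The tail-area arithmetic above can be checked numerically; a sketch using the standard normal CDF (the sample size n = 10 and σ = 2.5, which give σ/√n ≈ 0.79, are inferred assumptions, not stated in the text):

```python
from math import sqrt
from statistics import NormalDist

mu0 = 50.0      # hypothesized mean (H0)
sigma = 2.5     # assumed population std: sigma/sqrt(10) ~ 0.79
n = 10          # assumed sample size
se = sigma / sqrt(n)

z = NormalDist()  # standard normal distribution
# alpha = P(Xbar < 48.5 | mu = 50) + P(Xbar > 51.5 | mu = 50)
alpha = z.cdf((48.5 - mu0) / se) + (1 - z.cdf((51.5 - mu0) / se))
print(f"type I error alpha ~ {alpha:.4f}")
```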
To calculate β, we must have a specific alternative hypothesis; that is, we must have a particular
value of µ. For example, suppose we want to reject the null hypothesis H0 : µ = 50 whenever
the mean µ is greater than 52 or less than 48. We could calculate the probability of type II error
β for the values µ = 52 and µ = 48, and use this result to tell us something about how the
test procedure would perform. Because of the symmetry of normal distribution function, it is
only necessary to evaluate one of the two cases, say, find the probability of not rejecting the null
hypothesis H0 : µ = 50 when the true mean is µ = 52.
Step 5: (continuation)
Now the type II error will be committed if the sample mean x falls between 48.5 and 51.5 (the
critical region boundaries) when µ = 52. This is the probability that 48.5 ≤ X ≤ 51.5 when the
true mean is µ = 52, or the shaded area under the normal distribution on the right, that is,
β = P(48.5 ≤ X ≤ 51.5 when µ = 52).
Fig. 3: Techniques for computing type I and type II errors (continuation of Example 4).
Step 7: (Continuation)
Conclusion: The type II error probability is much higher for the case in which the true mean is 50.5
than for the case in which the mean is 52. Of course, in many practical situations, we would not be as
concerned with making a type II error if the mean were “close” to the hypothesized value. We would be
much more interested in identifying the large differences between the true mean and the value specified
in the null hypothesis.
This β = 0.2119 is smaller than β = 0.2642, so we decrease the probability of accepting the false
hypothesis H0 by increasing the sample size.
The type II error, reduced by increasing the sample size from n = 10 to n = 16, is 0.2119
Fig. 4: Techniques for computing type I and type II errors (continuation of Example 4).
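The effect of sample size on β can be reproduced numerically; a sketch (σ = 2.5 is inferred from the standard errors 0.79 and 0.625 used in the example and should be treated as an assumption):

```python
from math import sqrt
from statistics import NormalDist

def beta(true_mu, lo, hi, sigma, n):
    """P(lo <= Xbar <= hi) when the true mean is true_mu:
    the probability of NOT rejecting H0, i.e., the type II error."""
    se = sigma / sqrt(n)
    z = NormalDist()
    return z.cdf((hi - true_mu) / se) - z.cdf((lo - true_mu) / se)

sigma = 2.5  # assumed: gives sigma/sqrt(10) ~ 0.79 and sigma/sqrt(16) = 0.625
b10 = beta(52.0, 48.5, 51.5, sigma, 10)
b16 = beta(52.0, 48.5, 51.5, sigma, 16)
print(f"beta(n=10) = {b10:.4f}, beta(n=16) = {b16:.4f}")
```

Increasing n from 10 to 16 shrinks the standard error, so the acceptance region covers less of the true (µ = 52) distribution and β drops.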
Conclusion: The sensitivity of the test for detecting the difference between a mean of 50 and 52
is 0.7357 . That is, if the true mean is really 52, this test will correctly reject H0 : µ = 50 and
“detect” this difference 73.57% of the time. If this value of power is judged to be too low, the designer
can increase either α or the sample size n.
Fig. 5: Techniques for computing type I and type II errors (continuation of Example 4).
Let 0 < α < 1; then the interval a < θ < b, computed from the selected sample, is
called a 100(1 − α)% confidence interval, the fraction 1 − α is called the degree of
confidence, and the endpoints a and b are called the lower and upper confidence
limits.

If x is the mean of a random sample of size n from a population with known variance
σ², a 100(1 − α)% confidence interval for µ is given by

x − z_{α/2} (σ/√n) < µ < x + z_{α/2} (σ/√n)    (1)

When σ is unknown and the sample is large, the interval

Confidence interval = x ± z_{α/2} (s/√n)
may be used. This is often referred to as a large-sample confidence interval. The justification
lies only in the presumption that, with a sample as large as 30 and a population
distribution that is not too skewed, s (the standard deviation of the sample) will be very close
to the true σ and, thus, the central limit theorem prevails. It should be emphasized that
this is only an approximation, whose quality improves as the sample size grows larger.
The 100(1−α)% confidence interval provides an estimate of the accuracy of our point
estimate. If µ is actually the center value of the interval, then x estimates µ without
error. However, in most cases, x will not be exactly equal to µ and the point estimate is
in error.
Theorem 2 is applicable only if we know the variance of the population from which we
are to select our sample. Lacking this information, we could take a preliminary sample of
size n ≥ 30 to provide an estimate of σ. Then, using this estimate as an approximation for
σ in Theorem 2, we could determine approximately how many observations are needed
to provide the desired degree of accuracy.
Problem formulation:
Let the hand geometry measurement result in a sample of size n = 36, with the
sample mean x = 2.6 and the population standard deviation σ = 0.3.
Calculate:
(a) the 95% and 99% confidence intervals for the mean distance between these hand
points;
(b) the accuracy of the point estimate, using Theorem 1;
(c) the sample size, if we want to be 95% confident that the error of our estimate of µ
does not exceed 0.05 (Theorem 2).
Step 1: If x is the mean of a random sample of size n from a population with known variance
σ², a 100(1 − α)% confidence interval for µ is given by

x − z_{α/2} (σ/√n) < µ < x + z_{α/2} (σ/√n)

where z_{α/2} is the z-value leaving an area of α/2 to the right.
Step 2: For α = 0.05, n = 36, x = 2.6, and σ = 0.3, the 95% confidence interval is

2.6 − (1.96)(0.3/√36) < µ < 2.6 + (1.96)(0.3/√36), that is, 2.50 < µ < 2.70

Note that the z-value leaving an area of 0.025 to the right (and, therefore, an area of 0.975 to the
left) is z_{0.05/2} = z_{0.025} = 1.96 (see the table).
Step 3: For α = 0.01, n = 36, x = 2.6, and σ = 0.3, the 99% confidence interval is

2.6 − (2.575)(0.3/√36) < µ < 2.6 + (2.575)(0.3/√36), that is, 2.47 < µ < 2.73

Note that the z-value leaving an area of 0.005 to the right (and, therefore, an area of 0.995 to the
left) is z_{0.01/2} = z_{0.005} = 2.575 (see the table).
Observation: A longer interval is required to estimate µ with a higher degree of confidence.
Decision: Based on Theorem 1, we are 95% confident that the sample mean x = 2.6 differs from
the true mean µ by an amount that is less than

z_{α/2} × σ/√n = (1.96)(0.3/√36) = 0.098

By analogy, we are 99% confident that the sample mean x = 2.6 differs from the true mean µ by
an amount that is less than

z_{α/2} × σ/√n = (2.575)(0.3/√36) = 0.13
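The confidence-interval arithmetic in this example can be verified in a few lines; a sketch using the worked values n = 36, x = 2.6, σ = 0.3:

```python
from math import sqrt
from statistics import NormalDist

n, x_bar, sigma = 36, 2.6, 0.3
se = sigma / sqrt(n)  # standard error = 0.05

margins = {}
for conf in (0.95, 0.99):
    alpha = 1 - conf
    # z-value leaving an area of alpha/2 to the right
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margins[conf] = z * se
    print(f"{conf:.0%} CI: {x_bar - margins[conf]:.2f} < mu "
          f"< {x_bar + margins[conf]:.2f}")
```

The printed intervals match the hand calculation: 2.50 < µ < 2.70 at 95% and 2.47 < µ < 2.73 at 99%.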
that quantifies the similarity between the input XQ and the template XI representations.
This similarity can be encoded by a single number.
Fig. 7: Basic definitions and terminology that are used in biometric system design.
False match rate (FMR) is the expected probability that a sample will be falsely
declared to match a single randomly-selected non-self template; that is, measure-
ments from two different persons are interpreted as if they were from the same
person.
False non-match rate (FNMR) is the expected probability that a sample will be
falsely declared not to match a template of the same measure from the same user
supplying the sample; that is, measurements from the same person are treated as
if they were from two different persons.
Equal error rate (EER) is the value defined as EER=FMR=FNMR, that is, the
point where false match and false non-match curves cross is called equal error rate
or crossover rate. The EER provides an indicator of the system’s performance: a
lower EER indicates a system with good level of sensitivity and performance.
The difference between false match/non-match rates and false accept/reject rates is illustrated
in Fig. 8.
To attack a biometric-based system, one needs to generate (or acquire) a large number
of samples of that biometric, which is much more difficult than generating a large number
of PINs/passwords. The FMR of a biometric system can be arbitrarily reduced for higher
security at the cost of increased inconvenience to the users that results from a higher
FNMR. Note that a longer PIN or password also increases the security while causing
more inconvenience in remembering and correctly typing them.
◮ False match/non-match rates are calculated per comparison: the number of false matches
(false non-matches) produced by the biometric verification system is divided by the number of
comparisons.
◮ False accept/reject rates are calculated over transactions and refer to the acceptance
or rejection of the stated hypothesis, whether positive or negative.
Fig. 8: Difference between false match/non-match rates and false accept/reject rates.
In fact, the tradeoff between the FMR and FNMR rates in a biometric system is no
different from that in any detection system, including the metal detectors already in use
at all the airports. Other negative recognition applications such as background checks
and forensic criminal identification are also expected to operate in semi-automatic mode
and their use follows a similar cost-benefit analysis.
The EER (equal error rate) is defined as the crossover point on a graph that has both
the FAR and FRR curves plotted.
The FMR (FAR) and FNMR (FRR) are related and must be balanced (Figure 9).
For example, in access control, perfect security would require denying access to everyone.
Conversely, granting access to everyone would mean no security. Obviously, neither
extreme is reasonable, and a biometric system must operate somewhere between the two.
FMR (FAR):
◮ A false match (FMR, FAR) occurs when a system incorrectly matches an identity; the FMR
(FAR) is the probability of individuals being wrongly matched.
◮ False matches may occur because there is a high degree of similarity between two
individuals’ characteristics.
◮ In a verification or positive identification system, unauthorized people can be granted
access to facilities or resources as a result of an incorrect match.
◮ In a negative identification system, the result of a false match may be to deny access.

FNMR (FRR):
◮ A false non-match (FNMR, FRR) occurs when a system incorrectly rejects a valid identity;
the FNMR (FRR) is the probability of valid individuals being wrongly not matched.
◮ False non-matches occur because there is not a sufficiently strong similarity between an
individual’s enrollment and trial templates, which could be caused by any number of
conditions. For example, an individual’s biometric data may have changed as a result of
aging or injury.
◮ In a verification or positive identification system, people can be denied access to some
facility or resource as a result of a system’s failure to make a correct match.
◮ In a negative identification system, the result of a false non-match may be that a person
is granted access to resources to which he/she should be denied.
FMR (FAR) and FNMR (FRR) are related and must, therefore, always be
assessed in tandem, and acceptable risk levels must be balanced with the
disadvantages of inconvenience.
Fig. 10: Typical operating points of different biometric applications (forensic, civilian,
and high-security) are displayed on the ROC curve.
◮ When the threshold T is set low, the FMR is high and the FNMR is low; when T is set
high, the FMR is low and the FNMR is high.
◮ For a given matcher, an operating point (a point on the ROC) is often given by specifying
the threshold T .
◮ In biometric system design, when specifying an application or a performance target, or
when comparing two matchers, the operating point is specified by choosing the FMR or
FNMR.
◮ The equal error operating point is defined by the EER. A matcher can operate with highly
unequal FMR and FNMR; in this case, the EER is an unreliable summary of system
accuracy.
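The threshold trade-off can be made concrete by sweeping T over genuine and impostor score samples and locating the crossover; a minimal sketch (the Gaussian score distributions, with means 0.7 and 0.4, are illustrative assumptions, not values from the text):

```python
import random

random.seed(1)
# Hypothetical match scores: genuine comparisons score higher than impostor ones.
genuine = [random.gauss(0.7, 0.1) for _ in range(2000)]
impostor = [random.gauss(0.4, 0.1) for _ in range(2000)]

def rates(threshold):
    # FMR: fraction of impostor scores at/above the threshold (false matches)
    fmr = sum(s >= threshold for s in impostor) / len(impostor)
    # FNMR: fraction of genuine scores below the threshold (false non-matches)
    fnmr = sum(s < threshold for s in genuine) / len(genuine)
    return fmr, fnmr

# Sweep the threshold T; the EER region is where FMR and FNMR are nearly equal.
thresholds = [i / 200 for i in range(201)]  # 0.000, 0.005, ..., 1.000
t_eer = min(thresholds, key=lambda t: abs(rates(t)[0] - rates(t)[1]))
fmr, fnmr = rates(t_eer)
print(f"T = {t_eer:.3f}: FMR = {fmr:.3f}, FNMR = {fnmr:.3f}")
```

Lowering T below t_eer raises the FMR and lowers the FNMR; raising T does the opposite, as described above.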
Problem formulation:
In biometric system design, two matchers are specified, matcher A and matcher B. These
matchers are described by their ROC curves. The figure on the left shows the corresponding
ROCs for these matchers and their operating points for some specified target FNMR. The
problem is to choose the better matcher.
Conclusion
Matcher A is better than matcher B for all possible thresholds T .
Fig. 12: Techniques for comparing two matchers using the ROC curves (Example 12).
integration would be

Confidence intervals: 1 − β = P(i ≤ k) = Σ_{i=0}^{k} b(i; n, p)    (4)

where the binomial sums are available from tables for different values of n and p.
= 0.8779

(c) P(i = 5) = b(5; 15, 0.4) = Σ_{i=0}^{5} b(i; 15, 0.4) − Σ_{i=0}^{4} b(i; 15, 0.4) = 0.4032 − 0.2173
= 0.1859
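The binomial table lookups above can be checked directly from the binomial probability mass function; a short sketch:

```python
from math import comb

def b(i, n, p):
    """Binomial probability b(i; n, p) = C(n, i) * p^i * (1 - p)^(n - i)."""
    return comb(n, i) * p**i * (1 - p)**(n - i)

# P(i = 5) for n = 15, p = 0.4, computed two ways:
direct = b(5, 15, 0.4)
# ... and as a difference of cumulative sums, as in the table-based calculation
via_cumulative = (sum(b(i, 15, 0.4) for i in range(6))
                  - sum(b(i, 15, 0.4) for i in range(5)))
print(f"b(5; 15, 0.4) = {direct:.4f}")  # ~0.1859, matching the table result
```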
The Rule of 3

Error rate p ≈ 3/N for a 95% confidence level    (5)
Error rate p ≈ 2/N for a 90% confidence level    (6)
The “Rule of 30” Doddington5 proposes the Rule of 30 for helping determine the
test size: To be 90% confident that the true error rate is within ±30% of the
observed value, we need at least 30 errors.

The Rule of 30
To be 90% confident that the true error rate is within ±30% of the observed value,
we need at least 30 errors.

5 Doddington, G.R., Przybocki, M.A., Martin, A.F., and Reynolds, D.A. The NIST speaker recognition
evaluation: Overview, methodology, systems, results, perspective. Speech Communication, 31(2–3),
225–254, 2000.
4 Problems
Problem 1: The distances Di between feature points measured in a sample of signatures are
represented by a normally distributed random variable d with mean µ and standard deviation σ,
n(d; µ, σ) (Fig. 13a):
(a) If µ = 40 and σ = 1.5, calculate the probability P (39 < d < 42)
Solution:
Step 1: (d1 − µ)/σ < z < (d2 − µ)/σ, that is, (39 − 40)/1.5 < z < (42 − 40)/1.5, or −0.67 < z < 1.33
Step 2: P (39 < d < 42) = P (−0.67 < z < 1.33) = P (z < 1.33) − P (z < −0.67) = 0.6568
Step 1: (d1 − µ)/σ < z < (d2 − µ)/σ, that is, (1.5 − 5)/1.58 < z < (4.5 − 5)/1.58, or −2.22 < z < −0.32
Step 2: P (1.5 < d < 4.5) = P (−2.22 < z < −0.32)
Step 3: P (−2.22 < z < −0.32) = P (z < −0.32) − P (z < −2.22) = 0.3745 − 0.0132 = 0.3613
Fig. 13: The distances Di between feature points measured in a signature (a) and hand
(b) are represented by a normally distributed, n(d; µ, σ), random variable d with
mean µ and standard deviation σ (Problems 1 and 2).
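Part (a) of Problem 1 can be checked without z-tables by evaluating the normal CDF directly; a short sketch:

```python
from statistics import NormalDist

# Distances modeled as d ~ Normal(mu=40, sigma=1.5), as in Problem 1(a)
d = NormalDist(mu=40, sigma=1.5)
# P(39 < d < 42) directly from the distribution of d
p = d.cdf(42) - d.cdf(39)
print(f"P(39 < d < 42) = {p:.4f}")  # ~0.6563 (the z-table gives 0.6568)
```

The small difference from the table value comes from rounding z to two decimals (1.33 and −0.67) in the hand calculation.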
Problem 2: The distances Di between feature points measured in a sample of signatures are
represented by a normally distributed, n(d; µ, σ), random variable d with the mean µ = 10 and
standard deviation σ = 1.5 (Fig. 13b). Calculate
Problem 3: The sample of distances Di between feature points measured on a retina image are
represented by a normally distributed, n(d; µ, σ), random variable d (Fig. 14a). The sample
size is n = 36 and the sample mean is d = 2.6. The standard deviation, σ, of the population is
assumed to be σ = 0.3. Calculate:
For a 90% confidence interval:

2.6 − 1.645(0.3/√36) < µ < 2.6 + 1.645(0.3/√36), that is, 2.52 < µ < 2.68

For a 95% confidence interval:

2.6 − 1.96(0.3/√36) < µ < 2.6 + 1.96(0.3/√36), that is, 2.50 < µ < 2.70

Answer: With 95% confidence, the true mean lies between 2.50 and 2.70.

For a 99% confidence interval:

d − z_{α/2}(σ/√n) < µ < d + z_{α/2}(σ/√n), where for α = 0.01, α/2 = 0.005 and z_{0.005} = 2.575:

2.6 − 2.575(0.3/√36) < µ < 2.6 + 2.575(0.3/√36), that is, 2.47 < µ < 2.73

Answer: With 99% confidence, the true mean lies between 2.47 and 2.73.

Fig. 14: The distances Di between feature points measured in a retina (a) and gait (b)
are represented by a normally distributed random variable, n(d; µ, σ) (Problems 3 and 4).
Problem 4: The sample of distances Di between feature points measured in a sample of retina
images are represented by a normally distributed, n(d; µ, σ), random variable d (Fig. 14b). The
sample size is n = 49 and the sample mean is d = 4.0. The standard deviation, σ, of the
population is assumed to be σ = 0.2. Calculate:
Problem 5: How large is the sample size considered in Problem 3, if we want to be:
(a) 90% confident that our estimate of µ is off by less than 0.05.
Solution: Using Equation 3, the sample size is

n = (z_{α/2} × σ / e)² = (1.645 × 0.3 / 0.05)² ≈ 100

(b) 95% confident that our estimate of µ is off by less than 0.05.
Solution: Using Equation 3, the sample size is

n = (z_{α/2} × σ / e)² = (1.96 × 0.3 / 0.05)² ≈ 138

(c) 99% confident that our estimate of µ is off by less than 0.05.
Solution: Using Equation 3, the sample size is

n = (z_{α/2} × σ / e)² = (2.575 × 0.3 / 0.05)² ≈ 239
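The sample-size formula in Problem 5 is a one-liner; a sketch that uses exact z-values and rounds up to the next whole observation (exact values give 98, 139, and 239, so hand calculations with rounded table values may quote somewhat different figures):

```python
from math import ceil, sqrt
from statistics import NormalDist

def sample_size(conf, sigma, e):
    """Smallest n such that z_{alpha/2} * sigma / sqrt(n) <= e."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # z-value for alpha/2 right tail
    return ceil((z * sigma / e) ** 2)

sigma, e = 0.3, 0.05  # values from Problem 3
for conf in (0.90, 0.95, 0.99):
    print(f"{conf:.0%} confidence: n = {sample_size(conf, sigma, e)}")
```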
(a) 85% confident that our estimate of µ is off by less than 0.5
(b) 90% confident that our estimate of µ is off by less than 0.5
(c) 95% confident that our estimate of µ is off by less than 0.5
(d) 99% confident that our estimate of µ is off by less than 0.5
Problem 7: Estimate the lowest error rate that can be statistically established with the following
number N of (independent identically distributed) comparisons:
(a) with 90% confidence, when zero errors are observed in 30 trials
Solution: Using Equation (6), the lowest error rate is p = 2/30 ≈ 0.07, or 7%
(b) with 90% confidence, when zero errors are observed in 100 trials
Solution: Using Equation (6), the lowest error rate is p = 2/100 = 0.02, or 2%
(c) with 95% confidence, when zero errors are observed in 30 trials
Solution: Using Equation (5), the lowest error rate is p = 3/30 = 0.1, or 10%
(d) with 95% confidence, when zero errors are observed in 100 trials
Solution: Using Equation (5), the lowest error rate is p = 3/100 = 0.03, or 3%
Problem 8: Using the Rule of 30, estimate the uncertainty of the true error rate in the following
experiments:
(a) 1 error is observed in 30 independent trials
(b) 1 error is observed in 100 independent trials
(c) 10 errors are observed in 500 independent trials
(d) 50 errors are observed in 1000 independent trials
Problem 9: Suppose that a device’s performance goal is to reach 1% false non-match rate, and
a 0.1% false match rate. Using the Rule of 30, estimate the number of genuine attempt trials
and impostor attempt trials.
Solution: 30 errors at 1% false non-match rate implies a total of 3,000 genuine attempt trials,
and 30 errors at 0.1% false match rate implies a total of 30,000 impostor attempt trials. Note
that the key assumption is that these trials are independent.
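The Rule-of-30 arithmetic in Problem 9 generalizes to any target error rate; a one-function sketch:

```python
from math import ceil

def trials_needed(target_error_rate, errors_required=30):
    """Rule of 30: at least 30 errors must be observed, so at the target
    error rate we need errors_required / rate trials (rounded up)."""
    return ceil(errors_required / target_error_rate)

print(trials_needed(0.01))   # 3000 genuine attempt trials for 1% FNMR
print(trials_needed(0.001))  # 30000 impostor attempt trials for 0.1% FMR
```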
Problem 10: The distances Di between feature points measured in 100 fingerprints are repre-
sented by a normally distributed, n(x; µ, σ), random variable x with the sample mean x = 71.8
(Fig. 15a). Assuming a population standard deviation of σ = 8.9, does this seem to indicate
that the mean of distances is greater than 70? Use a 0.1 level of significance.
Solution:
Step 1: H0 : µ = 70, H1 : µ > 70
Step 2: The critical point for α = 0.1 is z0.05 = 1.645 (from the table); the critical region is
z > 1.645
Step 3: For the input data (x = 71.8, σ = 8.9, n = 100, and µ = 70), the test statistic is
z = (71.8 − 70)/(8.9/√100) = 2.02. Since 2.02 > 1.645, we reject H0 and conclude that the
mean of distances is greater than 70.

Fig. 15: The distances Di between feature points measured in a fingerprint (a) and face
(b) are represented by a normally distributed random variable, n(d; µ, σ) (Problems 10 and 11).
Problem 11: The distances Di between feature points measured in 50 facial images are repre-
sented by a normally distributed, n(x; µ, σ), random variable x with the sample mean x = 7.8
(Fig. 15b). Assuming a population standard deviation of σ = 0.5, does this seem to indicate
that the mean of distances is greater or less than 8? Use a 0.01 level of significance.
Solution:
Step 1: H0 : µ = 8, H1 : µ ≠ 8
Step 2: The critical points for α/2 = 0.01/2 = 0.005 are z0.005 = ±2.575; the critical region is
z < −2.575 or z > 2.575
Step 3: For the input data (x = 7.8, σ = 0.5, n = 50, and µ = 8), the test statistic is
z = (7.8 − 8)/(0.5/√50) = −2.83. Since −2.83 < −2.575, we reject H0 and conclude that the
mean of distances differs from 8.
Problem 12: Evaluate the performance of a system that accepts at least 5 facial images of impos-
tors as belonging to the database of 100 persons, and rejects 10 faces of persons enrolled in the database.