
On computational Gestalt detection thresholds


Rafael Grompone von Gioi (a,b,*), Jérémie Jakubowicz (b)

a IIE, Universidad de la República, Julio Herrera y Reissig 565, CP11300 Montevideo, Uruguay
b CMLA, ENS Cachan, CNRS, UniverSud, 61 Avenue Président Wilson, F-94230 Cachan, France

* Corresponding author. E-mail addresses: grompone@cmla.ens-cachan.fr (R. Grompone von Gioi), jakubowi@cmla.ens-cachan.fr (J. Jakubowicz).

Keywords: Computational Gestalt theory; NFA; A contrario detection; Detection threshold; Binomial distribution

Abstract: The aim of this paper is to show some recent developments of computational Gestalt theory, as pioneered by Desolneux, Moisan and Morel. The new results allow one to predict the detection thresholds much more accurately. This step is unavoidable if one wants to analyze visual detection thresholds in the light of computational Gestalt theory. The paper first recalls the main elements of computational Gestalt theory. It points out a precision issue in this theory, essentially due to the use of discrete probability distributions. It then proposes to overcome this issue by using continuous probability distributions, and illustrates the approach on the meaningful alignment detector of Desolneux et al.

1. Introduction

Computer vision aims at providing a computational theory that answers the main question of vision: how to arrive at a global interpretation of a scene from the local, atomic information contained in an image? A successful theory would be one compatible with human vision, and one that lets us predict the results of phenomenological experiments.

Since 1921, a group of German psychologists led by Max Wertheimer worked on a scientific attempt to state the laws of visual perception (Wertheimer, 1923). According to Gestalt theory, "grouping" is the main process in visual perception. Whenever points (or previously formed visual objects) have one or several characteristics in common, they are grouped and form a new, larger visual object, a Gestalt. Gestalt theory, however, is mainly phenomenological; thus, predictions are limited to special cases.

At the beginning of the 1970s, as computers became able to deal with images with some efficiency, a new discipline emerged at the crossing point between Artificial Intelligence and Robotics: Computer Vision. The aim was to create a mathematical and computational theory of visual perception. Surprisingly, there was little interaction with Gestalt theory at the beginning, even though both disciplines attempted to answer the same questions (Marr, 1982).

In 2000, Desolneux, Moisan and Morel introduced a computational Gestalt theory (Desolneux et al., 2000). They translated the Wertheimer program into a mathematical theory involving the bases of computer vision: image sampling and image information measurements. This theory predicts perception thresholds which can be computed on every image and usually give a clear-cut decision between what is seeable as a geometric structure (Gestalt) in the image and what is not.

In that original publication they elaborated the theory for the case of alignment detection in digital images and clearly indicated how their method could scale up to a quite general framework. Since then, their work has led to various detection algorithms commonly gathered under the labels "a contrario method" or "Helmholtz principle". For instance, one can find an edge detector (Desolneux et al., 2001), a vanishing point detector (Almansa et al., 2003), a stereo matching pairs detector (Moisan and Stival, 2004), two cluster detectors (Desolneux et al., 2003b; Cao et al., 2007), a land classifier (Robin et al., 2005), a shape detector (Musé et al., 2006), a movement detector (Veit et al., 2006), a mode detector for histograms (Delon et al., 2007), a digital elevation model from stereo pairs (Igual et al., 2007), and a refined alignment detector (Grompone von Gioi et al., 2008).

The theory is based on what these authors call the Helmholtz principle. This principle states an obvious specification for every low-level feature detection algorithm: it should not detect any features (or only a few) in white noise images. Formulated this way, the principle should also be attributed to Attneave (1954). Let us quote Desolneux et al. (2008) for a definition: "According to this [Attneave–Helmholtz] principle, an observed geometric structure is perceptually 'meaningful' if the expectation of its occurrences (in other terms, its number of false alarms (NFA)) is small in a random image." This principle can be turned into a method for detecting structures after some precise definitions of "small" and "random image" have been given. The precise definition of "random image"
depends on the sought structure, and we will not elaborate more on that, but rather refer to Desolneux et al. (2008), which contains many examples and developments. What interests us in this paper is the definition of "small".

Detections made on a random image are false detections, as they arise by chance in non-structured data. We will call NFA(ε) the number of detections made on a random image by an a contrario method when its only parameter is set to the value ε. Actually, these methods are designed in such a way that the parameter ε controls the number of false detections; more formally, the method definition guarantees that NFA(ε) ≤ ε.

Desolneux et al. (2000) claim that the method is "parameterless" because the dependency on ε is very weak. They add: "[...] this definition leads to a parameter free method, compatible with phenomenology". Relying on this weak dependency, they suggest setting ε to the a priori value 1. Good detection results support this choice.

Desolneux et al. set up some psycho-visual experiments to see if human perception qualitatively matched their framework. Synthetic images containing or not containing alignments on a random background were displayed to subjects, who were asked "Is there an exceptional alignment?" For details on the protocol and the results see Desolneux et al. (2003a). It turned out that good matches are obtained with the particular value ε = 10². The agreement of the observed detection curves with the predicted ones is encouraging, while more work is needed on the exact threshold value. (Footnote 1: A second set of experiments was done asking "Can you see a square in this image?" In that case the detection curve matched for the threshold value ε = 10²⁰. However, the experimental conditions were inappropriate and the subjects were unable to perceive the full resolution of the images. The result is not reliable enough.)

The aim of this paper is to show some recent developments that allow one to predict the detection thresholds much more accurately. The theory of Desolneux et al. guarantees that NFA(ε) ≤ ε. However, it was implicitly assumed that NFA(ε) ≈ ε. In this paper we will show that, due to the discrete nature of the probability distributions involved, NFA(ε) is noticeably smaller than the a priori threshold ε. Its behavior can be characterized by a lower bound giving

ε/C(ε) < NFA(ε) ≤ ε

It turns out that the ratio NFA(ε)/ε is usually less than a hundredth in most of the aforementioned algorithms. Arguably, this quantization discrepancy explains why the choice ε = 1 works well in some applications, and it leads back to the 10² of the psycho-visual experiments. This rather simple fact does not seem to have been emphasized before.

In this paper we also introduce a modification of the theory of Desolneux et al. that uses continuous distributions, hence making NFA(ε) stick to ε. We will illustrate the new theory in the case of meaningful alignments. The results are comparable to previous ones and the choice ε = 1 is still well adapted. However, the detection thresholds are handled in a cleaner way that should lead to better comparisons with psycho-visual experiments.

This paper is organized as follows. Section 2 presents the a contrario decision formalism, which is illustrated on a simple example, taken from Delon et al. (2007), in Section 3. Section 4 establishes a lower bound for the ratio NFA(ε)/ε for binomial distributions, and more generally for log-concave distributions. Section 5 shows numerical experiments performed on several classical a contrario detectors: the meaningful alignment detector, the histogram mode detector, the vanishing point detector and the multisegment detector. Its purpose is to compare their NFA(ε) to the computations of Section 4 and to confirm that it is indeed smaller than ε. Section 6 introduces the alignment detection problem and Section 7 shows a formulation of that problem with continuous distributions. Some experiments confirm that NFA(ε) then sticks to its threshold ε. Section 8 concludes.

2. The Attneave–Helmholtz principle

Let us now sketch the general quantitative picture behind the Attneave–Helmholtz principle (a concrete example is given in the next section). Assume one is looking for some given structures in some data. The sought structures could be segments in a digital image, excesses of points in a supposedly uniform point cloud, occurrences of a given word in a string, etc. The structures can be located at various positions. They are indexed by a set denoted I. When dealing with segments in a digital image, I = {(p1, p2)} where the p_i range through all points in the image. When dealing with point clouds, there are many choices for I: a set of circles, rectangles, polygons.

To each possible structure location, i.e., to each i ∈ I, is associated a number t_i which encodes how good the fit is between the structure at location i and the data. It can be binary (the structure is present or it is not), integer (the number of points falling in the given area), or real. The larger the value of t_i, the better the fit should be.

Let us assume that there is an underlying model in which we would not like to detect any structure belonging to I, or only very few of them (hence the name a contrario). This model is referred to as the null hypothesis or H0. It can be Gaussian white noise in the case of segments, a Poisson point process in the case of point clouds, independent and uniformly distributed letters for occurrences of words in a string.

Under the H0 hypothesis, the t_i become random variables which we denote by T_i. To each i ∈ I, there is a natural associated one-sided test: T_i ≥ a_i for some threshold a_i. When T_i ≥ a_i one says that the shape at location i has been detected. When the data is actually sampled from the H0 model, each rejection i (T_i ≥ a_i) is called a false detection, false alarm or false positive.

Given an overall and unique threshold ε > 0, the Attneave–Helmholtz principle says that the a_i should be adjusted so as to ensure that the expected number of false alarms under H0 is less than ε. Since there are many a_i and a single ε, there are many ways to achieve this adjustment. A common adjustment is the so-called "Bonferroni correction" (see e.g., Hochberg and Tamhane, 1987). It consists in dividing up ε in equal parts over all i ∈ I:

a_i = min{ x : P_H0[T_i ≥ x] ≤ ε/#I }

where #I stands for the number of elements of the set I.

For compatibility with previous papers we use a somewhat confusing notation where two quantities are named in relation to the idea of Number of False Alarms (NFA). First we have

NFA_i = #I · P_H0[T_i ≥ t_i]

where P_H0[T_i ≥ t_i] is the p-value associated to the observation t_i. This quantity is associated to the location i and the observation t_i. The detection rule becomes NFA_i ≤ ε. NFA_i measures the meaningfulness of the observation at location i. Whenever it is a small number, it means that the observation at location i would have had very few chances to appear in the a contrario random model. Conversely, a large NFA_i means that the observation at location i is very likely to appear randomly, and thus it is not meaningful. The threshold upon which an observation is meaningful or not is precisely ε, the threshold on the allowed expected number of false alarms under the H0 hypothesis.

The second definition is

NFA(ε) = E_H0[ Σ_{i∈I} 1{NFA_i ≤ ε} ]
This second quantity is associated to the model H0 and the previously computed thresholds, and it does not rely on any observed data to be computed: it is the expected number of detections on H0 with the parameter ε. Note that NFA_i and NFA(ε) are different concepts.

Fig. 1. Left: histogram of a sample of size 256 generated according to a Gaussian distribution (H0 hypothesis). Right: histogram of a sample of size 256 generated according to a mixture of two Gaussian distributions with parameters (μ1 = 0, σ1² = 1) and (μ2 = 0.5, σ2² = 0.64). In this case it is difficult to visually distinguish Gaussianity from non-Gaussianity.
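To make the two definitions concrete, here is a minimal Python sketch of the per-location quantity NFA_i for a family of binomial tests (the function name and the scipy dependency are our own choices, not part of the paper):

```python
import numpy as np
from scipy.stats import binom

def nfa_scores(t_obs, n, p, num_tests):
    """NFA_i = #I * P_H0[T_i >= t_i] for a family of binomial tests.

    t_obs:     observed counts t_i, one per tested location
    n, p:      binomial parameters of the T_i under H0
    num_tests: #I, the total number of tests performed
    """
    # binom.sf(t - 1, n, p) equals P[T >= t] (sf itself is P[T > t])
    p_values = binom.sf(np.asarray(t_obs) - 1, n, p)
    return num_tests * p_values

# Detection rule: location i is meaningful when nfa_scores(...)[i] <= eps,
# where eps bounds the expected number of false alarms under H0.
```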
3. An a contrario detection example

Let us elaborate on Section 2 and illustrate the a contrario method in the light of a simple example. This example is taken from Delon et al. (2007), where a general test – designed to check the adequacy of a histogram to any given distribution – is built upon the a contrario method. In Delon et al. (2007), this test itself is used as a building block to construct a mode detector for histograms that we are going to deal with later on. The example we work on in this section is a simplified version of the adequacy test of Delon et al.

Let us assume we want to decide whether or not a given sample has been drawn from the standard centered Gaussian distribution (null hypothesis H0), see Fig. 1. Of course, this central question has already been addressed many times in the literature and there exist some efficient algorithms to tackle it, the choice of which mainly depends on the sample size; see for instance the popular Kolmogorov–Smirnov (e.g., Shao, 2003) and Shapiro–Wilk tests (cf. Shapiro and Wilk, 1965), among others. We shall see that the a contrario method is able to address more than the mere question of the sample's Gaussianity. Indeed, when the sample is non-Gaussian, the a contrario method tells where it fails to be Gaussian.

If one wants to use the a contrario method in this case, one has to choose a family of tests, independent or not under the null hypothesis (and usually not), that can help spot the non-Gaussianity of the sample. If the sample is denoted by x1, …, xn and the real line is cut into m bins I1, …, Im, it is natural to count the number of points falling into each union of consecutive bins I_[a,b] = I_a ∪ I_{a+1} ∪ ⋯ ∪ I_b. It is important not to restrict ourselves to individual bins, since there can be long-range small deviations from Gaussianity that may not be captured by single-bin statistics.

3.1. Notation

We denote by T_[a,b](x1, …, xn) the number of points of the sample falling into I_[a,b]:

T_[a,b](x1, …, xn) = Σ_{i=1..n} 1{x_i ∈ I_[a,b]}

and we denote by P_[a,b] the p-value of T_[a,b] under the null hypothesis:

P_[a,b](x1, …, xn) = P_H0[ T_[a,b](X1, …, Xn) ≥ T_[a,b](x1, …, xn) ]

where the null hypothesis means that X1, …, Xn are i.i.d. Gaussian (with expectation 0 and variance 1). We denote by NFA_[a,b] the quantity

NFA_[a,b](x1, …, xn) = (m(m+1)/2) · P_[a,b](x1, …, xn)

The factor m(m+1)/2 corresponds to the number of intervals, and thus to the number of tests performed, since to each interval is associated a single test. Thus NFA_[a,b](x1, …, xn) is the p-value of T_[a,b](x1, …, xn), renormalized to take into account the fact that many tests were performed. A small value of NFA_[a,b] means that, assuming the null hypothesis, the value of T_[a,b] is very unlikely. If ε is a fixed threshold, the expected number of intervals [a,b] such that NFA_[a,b] is less than ε under H0 is itself less than ε. Formally:

NFA(ε) = E[ Σ_{[a,b]} 1{NFA_[a,b](X1, …, Xn) ≤ ε} ] ≤ ε

The left-hand expectation is the expected number of false alarms (or false positives) and is denoted by NFA(ε) without any subscript because it is associated to the method and not to any particular interval. The dependence on the sample (x1, …, xn) is often omitted in the notation, giving T_[a,b], P_[a,b] and NFA_[a,b].

3.2. The non-Gaussian subinterval detector

According to the definitions, T_[a,b] follows a binomial distribution of parameters n and p = P[X ∈ I_[a,b]], where X is a standard Gaussian random variable. Hence,

P_[a,b] = Σ_{k ≥ T_[a,b]} (n choose k) p^k (1−p)^{n−k}

and

NFA_[a,b] = (m(m+1)/2) · Σ_{k ≥ T_[a,b]} (n choose k) p^k (1−p)^{n−k}

The algorithm is simple: for each I_[a,b], compute NFA_[a,b]; if NFA_[a,b] ≤ ε, reject the H0 hypothesis for I_[a,b] (see Algorithm 1).
Algorithm 1: Non-Gaussian subinterval detector
  input:  A sample x1, …, xn
  output: A list of non-Gaussian subintervals
  1  Reduce and center the sample;
  2  Divide the real axis into m bins;
  3  Compute p_a = P[X ∈ I_a] for all a ∈ {1, …, m}, X standard Gaussian;
  4  Set ε = 1;
  5  foreach union of bins [a,b] do
  6      compute NFA_[a,b];
  7      if NFA_[a,b] ≤ ε then
  8          add [a,b] to the output list
  9      end
  10 end
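A runnable version of Algorithm 1 takes only a few lines; the following sketch (ours) follows the steps above, with scipy providing the binomial tail:

```python
import numpy as np
from scipy.stats import binom, norm

def non_gaussian_subintervals(x, m=22, eps=1.0):
    """Sketch of Algorithm 1: flag bin unions [a,b] whose point count is
    unlikely under the standard Gaussian hypothesis (NFA_[a,b] <= eps)."""
    x = (x - x.mean()) / x.std()                               # 1: reduce and center
    edges = np.r_[-np.inf, np.linspace(-5, 5, m - 1), np.inf]  # 2: m bins
    n = len(x)
    n_tests = m * (m + 1) / 2              # one test per interval [a,b]
    detections = []
    for a in range(m):
        for b in range(a, m):
            t = np.sum((x >= edges[a]) & (x < edges[b + 1]))   # T_[a,b]
            p = norm.cdf(edges[b + 1]) - norm.cdf(edges[a])    # P[X in I_[a,b]]
            nfa = n_tests * binom.sf(t - 1, n, p)              # #tests * p-value
            if nfa <= eps:
                detections.append((a, b, nfa))
    return detections
```

With m = 22 the bins reproduce the setting used below: twenty bins of width 0.5 between −5 and 5, plus the two infinite tails.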
Let us check numerically that, under the Gaussian assumption, the expected number of false alarms of the previous algorithm is indeed less than ε. To apply the previous algorithm one first needs to choose the bins and ε (we always choose ε = 1 if not explicitly stated). For the bins, we chose [−5 + k/2, −5 + (k+1)/2) for k ∈ [0, 19], along with (−∞, −5) and (5, +∞). We generated 1000 samples of size n for n = 10, 100 and 1000 according to the standard Gaussian and counted the average number of false alarms we got for each sample, NFA(ε). The results are shown in Fig. 2.

Fig. 2. Average number of false alarms computed on 1000 independent samples with bins of size 0.5 between −5 and +5. One can notice that this average number is indeed less than one. One can also notice that this number seems to increase with the sample size. The standard deviation of the number of false alarms is also computed. It is significantly larger than the corresponding average. The following sections are going to explain these phenomena.

One can see that one is indeed an upper bound for the average number of false alarms whatever n. The values of NFA(ε) are smaller when n is small and seem to increase with n. One can also notice that the standard deviations are quite large compared to the averages.
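The experiment of Fig. 2 can be reproduced with a small Monte Carlo loop reusing the detector sketched above (sample sizes and trial count follow the text; the scaffolding is ours):

```python
import numpy as np

rng = np.random.default_rng(0)

def average_nfa(n, trials=1000, eps=1.0):
    """Monte Carlo estimate of NFA(eps): mean (and std) of the number of
    false alarms per standard-Gaussian sample of size n."""
    counts = [len(non_gaussian_subintervals(rng.standard_normal(n), eps=eps))
              for _ in range(trials)]
    return np.mean(counts), np.std(counts)

for n in (10, 100, 1000):
    mean, std = average_nfa(n)
    print(f"n={n:5d}  NFA(1) ~ {mean:.3f}  (std {std:.3f})")
```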
3.3. Testing Gaussianity

The first motivation for this section was not to find the non-Gaussian subintervals but to test whether or not the whole given sample was Gaussian. There is an obvious way to go from subintervals to the whole sample: declare that the sample is non-Gaussian if it contains a non-Gaussian subinterval.

We kept the previous experiment settings (same bins, same number of samples) to compute the approximate level of this test according to the sample size. Let us recall that the level of a test is the probability of mistakenly rejecting the H0 hypothesis, whereas the power of a test is the probability of correctly accepting an alternative hypothesis H1. For the power to be defined, an H1 hypothesis must have been specified, which is not the case in the a contrario framework. The results are shown in Fig. 3. One can notice that the level values are, expectedly, less than the average number of false alarms. Indeed, one sample can lead to many false alarms (as many as the detected non-Gaussian intervals). The previous standard deviation figures have already hinted at the fact that the false alarms often come in clusters. The tests are far from being independent, since intervals overlap. So when a random fluctuation leads to an excess of points in an interval, all intervals – either contained in it, or containing it – are more likely to trigger a false alarm too.

Fig. 3. Gaussianity test level. The level seems to increase with the sample size. Obviously, it is smaller than the average number of false alarms for the same sample size because one sample can lead to many false alarms (since each non-Gaussian subinterval leads to one false alarm).

We also made one experiment to compute an approximation of this test's power and compared it to the popular Kolmogorov–Smirnov test. The H1 model used was a mixture of two Gaussians. One Gaussian had parameters (μ1 = 0, σ1² = 1) and the other one had parameters (μ2 = 0.5, σ2² = 0.64). The mixture was Y = X·Z1 + (1 − X)·Z2, where the Zi denote the two mentioned Gaussians and X is a Bernoulli random variable with parameter 0.5 (fair toss). See Fig. 1.

We used the same settings as in the previous experiments and adjusted the level of the KS test so that it corresponds to the one computed in Fig. 3. The comparison results are presented in Fig. 4. One can see that when n = 10 the power is barely greater than the level 0.05, which means that the samples are hardly distinguishable from Gaussian ones. One can also notice, quite expectedly, that the larger the sample, the better the power. The figures also show that the KS test performs better than this a contrario detector for testing Gaussianity. However, as already pointed out, the a contrario detector conveys more information than a simple test like KS. It is able to detect each non-Gaussian interval, consequently locating where the sample fails to be Gaussian.

Fig. 4. Power comparison between the a contrario non-Gaussianity detector and the KS test, both tests' levels being equal. The H1 hypothesis is a mixture of two Gaussian distributions with parameters (μ1 = 0, σ1² = 1) and (μ2 = 0.5, σ2² = 0.64). It appears that KS is more powerful than the a contrario test for discriminating Gaussianity, the levels being equal.

The purpose of this example was to show that the 1-threshold on NFA_i can lead to substantially less than one false alarm (in expectation) under the null hypothesis (Gaussian in the previous example). For instance, for a sample size equal to 10, the average number of false alarms after 1000 experiments is only 0.15. And it is even smaller for the global Gaussianity test: again for a sample size equal to 10, the level drops to 0.05, which is a common level. With an NFA(ε) really equal to one, the previous test would have been totally useless: the null hypothesis would be rejected far too often. The Gaussianity test of this section is mainly a toy example, considering that we dropped the non-Gaussian subinterval location information, summing up all the detected subintervals into a binary cue: the non-Gaussianity of the whole sample. As will appear in Section 5, the a contrario method is primarily interested in the detections themselves rather than in globally accepting or rejecting the H0 hypothesis.
In the following section, we will try to estimate the ratio NFA(ε)/ε. It has already been mentioned that the following inequality holds true:

NFA(ε) ≤ ε

We now look for a lower bound:

NFA(ε) ≥ ε/C

with C greater than one and possibly depending on ε.

4. NFA(ε) lower bound

4.1. The discrete nature of the underlying tests

It is the discrete nature of the binomial distribution that is responsible for the gap between NFA(ε) and ε. Since what follows only concerns the null hypothesis, we assume that the null hypothesis holds without explicitly stating it anymore.

We recall the setting of the previous section in a more abstract way. A family of tests T_i with i ∈ I is given. Each T_i follows a binomial distribution with parameters n_i and p_i (in the example of Section 3, i is the bin interval [a,b], p_i is the probability that a standard Gaussian belongs to this interval and n_i is constantly equal to the sample length n). NFA_i denotes the p-value of the test T_i, times the total number of tests #I. NFA(ε) denotes the expectation of the number of false alarms.

Let us recall the following elementary fact.

Proposition 1. If X denotes a random variable, and S_X its associated survival function (i.e., S_X(x) = P[X ≥ x], Elandt-Johnson and Johnson (1980)), then

P[S_X(X) ≤ u] ≤ u

Moreover, if S_X is continuous then the previous inequality is in fact an equality.

This proposition shows that if each T_i has an absolutely continuous distribution, then rejecting each test T_i when NFA_i ≤ ε leads to NFA(ε) = ε. Indeed,

NFA(ε) = E[ Σ_{i∈I} 1{NFA_i ≤ ε} ] = Σ_{i∈I} P[NFA_i ≤ ε] = Σ_{i∈I} P[S_{T_i}(T_i) ≤ ε/#I] = Σ_{i∈I} ε/#I = ε

The key quantity to estimate is thence

P[S_X(X) ≤ u] / u

When T_i follows a binomial distribution this ratio may be less than one. More specifically, P[S_{T_i}(T_i) ≤ u] is a stepwise constant function of u. This function jumps down each time u crosses a value S_{T_i}(k) with k such that P[T_i = k] > 0. Since the S_X(k) are decreasing in k whatever the probability distribution of X, given u it is always possible to find k such that S_{T_i}(k+1) ≤ u < S_{T_i}(k). For this particular choice of u and k, one has P[S_{T_i}(T_i) ≤ u] = S_{T_i}(k+1). See Fig. 5 for an illustration.

Fig. 5. When T_i follows a discrete distribution, P[S_{T_i}(T_i) ≤ u] can be strictly less than u. Indeed, u ↦ P[S_{T_i}(T_i) ≤ u] is a stepwise constant function, as T_i has a discrete probability distribution.

Controlling the ratio P[S_{T_i}(T_i) ≤ u]/u then boils down to estimating S_{T_i}(k+1)/S_{T_i}(k).
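Before turning to the hazard function, the staircase effect is easy to check numerically; a short sketch (our own, with scipy) comparing P[S_T(T) ≤ u] with u for a binomial test:

```python
import numpy as np
from scipy.stats import binom

n, p = 10, 1.0 / 16.0
k = np.arange(n + 1)
surv = binom.sf(k - 1, n, p)          # S(k) = P[T >= k]
prob = binom.pmf(k, n, p)             # P[T = k]

for u in (0.5, 0.1, 0.01, 0.001):
    # P[S(T) <= u] sums the mass of the values k whose survival is <= u
    lhs = prob[surv <= u].sum()
    print(f"u={u:6.3f}   P[S(T)<=u]={lhs:.5f}   ratio={lhs/u:.3f}")
# For a continuous T the ratio would be exactly 1; here it drops below 1.
```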
4.2. Hazard function

The previous subsection showed that the ratios

S_X(i+1) / S_X(i)

are quantities of interest. S_X denotes the survival function S_X(x) = P[X ≥ x]; X follows a binomial distribution and i is an integer. For all i,

S_X(i+1)/S_X(i) = 1 − P[X = i]/P[X ≥ i] = 1 − h_X(i)

where h_X denotes what is called the "hazard function" in survival analysis (Elandt-Johnson and Johnson, 1980). The following propositions hold true.

Proposition 2. If X follows a binomial distribution, then h_X is non-decreasing.

This result is elementary and its proof is given in Appendix A. It can be embedded into a more general framework, which helps to understand what really matters in order to extend the conclusion of this section to more general distributions than binomial ones. If X had been absolutely continuous, the ratio P[X = i]/P[X ≥ i] would have taken the form f(x)/F(x), with F the survival function and f = −F′ the density. Thence, showing that h_X is increasing would amount to showing that log F is concave. Such distributions are called log-concave (Bagnoli and Bergstrom, 2005) (actually, log-concavity of a distribution usually refers to its p.d.f., not its c.d.f., but the log-concavity of the c.d.f. is a consequence of the log-concavity of the p.d.f.). The theory of log-concave distributions extends to the discrete case provided the derivative operator is replaced by the difference operator Δ : (u_n) ∈ R^N ↦ (u_n − u_{n−1}) ∈ R^N. For instance, a sequence u is said to be concave when Δ²(u) ≤ 0 (please refer to Stanley (1989) for more on log-concavity).

What Proposition 2 shows is that the smallest non-null ratio S_X(i+1)/S_X(i) is reached by the largest i such that S_X(i+1) is non-null.

Proposition 3. Assume the family of tests satisfies:

(1) each test T_i follows a binomial distribution with parameters n_i and p_i,
(2) T_i is rejected when NFA_i ≤ ε,
(3) j_{i,ε} = max{ j ∈ [0, n_i] : S_{T_i}(j) > ε/#I }.

Then

(1/#I) Σ_{i∈I} (1 − h_{T_i}(j_{i,ε})) · ε < NFA(ε) ≤ ε

Using Propositions 2 and 3, we get the following two corollaries. The first corollary is specific to binomial distributions, while the second, weaker, could easily be generalized to log-concave distributions.
Corollary 4. Let

C = #I / Σ_{i∈I : p_i^{n_i} ≤ ε/#I} [ ((1−p_i)/p_i) · (j_{i,ε}+1)/(n_i − j_{i,ε}) + 1 ]^{−1}

With the same assumptions as in Proposition 3,

ε/C < NFA(ε) ≤ ε

Corollary 5. Let

C = #I / Σ_{i∈I : p_i^{n_i} ≤ ε/#I} p_i / (n_i(1−p_i) + p_i)

With the same assumptions as in Proposition 3,

ε/C < NFA(ε) ≤ ε
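Both bounds can be evaluated for the introductory example before comparing them with simulations; a sketch under the same setting (22 standard-Gaussian bins, #I = m(m+1)/2; helper names are ours):

```python
import numpy as np
from scipy.stats import binom, norm

def lower_bounds(n, eps=1.0, m=22):
    """Corollary 5 bound (crude, hazard at n-1) and Proposition 3 bound
    (hazard at j_{i,eps}) for the non-Gaussian subinterval detector."""
    edges = np.r_[-np.inf, np.linspace(-5, 5, m - 1), np.inf]
    n_tests = m * (m + 1) // 2
    crude, sharp = 0.0, 0.0
    for a in range(m):
        for b in range(a, m):
            p = norm.cdf(edges[b + 1]) - norm.cdf(edges[a])
            if p ** n > eps / n_tests:       # j_{i,eps} = n_i: contributes 0
                continue
            crude += p / (n * (1 - p) + p)   # 1 - h(n_i - 1)
            k = np.arange(n + 2)
            surv = binom.sf(k - 1, n, p)     # S(j) = P[T >= j]
            j = np.max(k[surv > eps / n_tests])      # j_{i,eps}
            h = binom.pmf(j, n, p) / surv[j]         # hazard at j_{i,eps}
            sharp += 1.0 - h
    return crude * eps / n_tests, sharp * eps / n_tests

for n in (10, 100, 1000):
    print(n, lower_bounds(n))
```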
5. Experiments

In this section, we compare the lower bound we have just computed with the NFA obtained by simulation. The proof of Corollary 5 contains a rather crude step: bounding h_{T_i}(j_{i,ε}) from above by h_{T_i}(n_i − 1). The experiments of this section are going to tell how crude this step really is.

First we go back to the introductory example. Then, we investigate known algorithms: the histogram mode detector, the vanishing points detector, and the multisegment detector. The analysis of the alignment detector is postponed to the next section. The goal of these experiments is to show that these detectors have an NFA(ε) noticeably less than one when ε = 1.

5.1. Analysis of the first example

Let us compare the lower bound given by Corollary 5 to NFA(ε). The table of Fig. 6 sums up the results. One can see that the lower bound is indeed always smaller than NFA(ε). But in this case it is not a very good approximation, in the sense that it decreases with the sample size whereas NFA(ε) increases.

Fig. 6. Comparison of the lower bound of Corollary 5 and NFA(ε). One can see that the simulations are coherent with the computations, in the sense that NFA(ε) is bigger than its lower bound. The larger the sample size, the further the lower bound diverges from NFA(ε). In particular, one can see that the computed lower bound is unable to grasp the growth of NFA(ε) with the sample size.

Let us try the lower bound obtained in Proposition 3, which does not use the fact that h is an increasing function, i.e., the fact that binomial distributions are log-concave. Fig. 7 sums up the results. This lower bound gives better results and is still smaller than NFA(ε) (which is a sanity check). It is increasing with the sample size, just as NFA(ε) itself, and seems to be closer to NFA(ε) when the sample size gets larger.

Fig. 7. Lower bound computed from Proposition 3. One can see that this bound is better than the previous one. In particular, it is able to capture the increasing nature of NFA(ε) with the sample size.

The fact that this bound increases, and the crude one does not, gives a clue as to why NFA(ε) is an increasing function of the sample size. Indeed, the only difference between the two bounds lies in the place where the hazard function is evaluated. The crude bound always assumes the worst case: j_{i,ε} = n − 1, where n is the sample size. This suggests that the most important effect, when n gets larger, is to bring j_{i,ε} closer to the center of the T_i distribution, where the hazard function gets smaller. More precisely, the following proposition is straightforward to show:

Proposition 6. h(j_{i,ε}) tends to 0 when n tends to +∞.

Proof. From the central limit theorem, it is clear that

(n choose k) p^k (1−p)^{n−k}

converges to 0 when n goes to +∞, uniformly in k. So, denoting by A_{n,p} the maximum

A_{n,p} = max_k (n choose k) p^k (1−p)^{n−k}

it follows that A_{n,p} goes to 0 when n goes to +∞. But, by definition,

h(j_{i,ε}) ≤ A_{n,p} / (ε − A_{n,p})   □

This explains why, when the sample size gets larger, NFA(ε) keeps increasing. It also predicts that, for very large n, NFA(ε) is going to tend to ε. Going back to the table of Fig. 7, it appears that the first entry is not satisfying. The NFA(ε) value was about 0.15 in our experiments and the predicted lower bound is only 0.02. Even if the lower bound is indeed smaller than the corresponding NFA(ε), it is not a very good approximation of it. There are at least two reasons that could explain the bad quality of approximation of the lower bound for small ε. First, going back to the approximations we made, we estimated the worst ratio F(j_{i,ε})/F(j_{i,ε}+1). But nothing prevents the threshold ε/#I from falling in the middle of the interval (F(j_{i,ε}+1), F(j_{i,ε})), hence giving a less pessimistic ratio than expected. This is especially true when F(j_{i,ε}+1) and F(j_{i,ε}) are far from each other, which is the case when ε is small. Second, there is a stronger variance effect for small values of ε, as in Fig. 2, where the ratio std/mean becomes large when the mean is small.

However, the lower bound has a good level of approximation for bigger n. Moreover, it is increasing with n, just as NFA(ε) itself.

Let us conclude this subsection with a more qualitative analysis. What this analysis showed is that the difference between NFA(ε) and ε is a quantization effect due to the fact that the underlying statistics are discrete. The geometric distribution is characterized by the fact that P[X = x]/P[X ≥ x] is constant: it is the well-known "memoryless" property of exponential/geometric distributions. Now, there are some distributions for which the probability of exceeding a large enough threshold x is more or less equivalent to that of being equal to it. In terms of survival analysis, this reflects an "aging" effect (and it is precisely the meaning of log-concavity). For this kind of distributions, to which the binomial distribution belongs, variations between P[X ≥ x] and P[X ≥ x+1] may be important as x becomes large. And this quantization effect vanishes as the binomial distribution starts looking like a continuous one, i.e., when n grows to +∞.
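This memoryless-versus-aging contrast can be checked in a few lines (a numerical aside of ours, not part of the original paper):

```python
import numpy as np
from scipy.stats import binom, geom

n, p = 50, 0.1
for k in (2, 5, 10, 15):
    hb = binom.pmf(k, n, p) / binom.sf(k - 1, n, p)   # binomial hazard
    hg = geom.pmf(k, p) / geom.sf(k - 1, p)           # geometric hazard
    print(f"k={k:2d}  binomial h={hb:.4f}  geometric h={hg:.4f}")
# The geometric hazard is constant (= p); the binomial hazard grows with k.
```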

5.2. Other algorithms based on NFA computations

Without analyzing the following algorithms in detail, we want to emphasize that most of them share the property of having an actual NFA(ε) notably less than ε, at least when ε = 1. However, their analysis is more involved than the previous ones, because the distributions of the tests either depend on the data or are more complex than the binomial.
5.2.1. Histogram mode detector

The Delon et al. histogram mode detector is designed to detect peaks in histograms in a completely non-supervised way. It is based on the Grenander estimator (Grenander, 1956) and on a non-parametric adequacy test generalizing the example of Section 3 (see Delon et al., 2007). What makes it very different from the previous examples is that the test family is not given a priori but adapts to the data.

Even if we did not carry out its quantitative analysis, it is still possible to compare the NFA(ε) of this detector to the a priori threshold ε. We generated samples of size n drawn from a uniform distribution in [0, 100] and we detected peaks in their frequency histogram (see Fig. 8). Each detected peak is a false alarm. We tried various histogram sizes n = 16, 64, 256 and 1024. The results of this experiment are shown in Fig. 9. The ratios NFA(ε)/ε have an order of magnitude of 10⁻³ for the standard value ε = 1. Then, they increase rapidly until ε reaches 10², where they attain an order of magnitude of 10⁻¹. They are increasing with the sample size n.

Fig. 8. Examples of histograms from the H0 hypothesis for various sizes. All shown histograms are drawn from uniform and identically distributed samples. According to the Attneave–Helmholtz principle, no modes should be detected.

Fig. 9. Average number of false alarms for histograms drawn from a uniform distribution in [0, M] over 1000 histograms of size n. (a) n = 16. (b) n = 64. (c) n = 256. (d) n = 1024. The ratio NFA(ε)/ε is increasing with the size n of the sample.

5.2.2. Vanishing points detector

The vanishing points detector is built on the alignment detector (Almansa et al., 2003). It takes the detected alignments as input and tests whether some segments meet in the same region of space. Fig. 10 shows an example of this algorithm when used on a natural image. As in the histogram mode detector, the tests depend on the data (they depend on the detected alignments). As in the histogram mode detector, this makes the analysis difficult, but it is still possible to compare NFA(ε) to its threshold ε for white noise images.

We simulated 1000 white noise images of size 64 × 64 and 128 × 128 and computed their vanishing points using the detector of Almansa et al. Fig. 11 shows the results we obtained. The ratio NFA(ε)/ε depends very little on the image size, and it is always much less than one. It still corresponds to a bell-shaped curve. Again, for the standard ε = 1 setting, NFA(ε) is much smaller than one.

5.2.3. Multisegment detector

The last algorithm we study is a refinement of the alignment detector that is designed to detect collinear alignments called multisegments (Grompone von Gioi et al., 2008). The tests do not depend on the data, but the distributions they follow are more complicated than binomial distributions. The analytical analysis of the multisegment detector is therefore difficult to write down.
Fig. 10. Vanishing points. Top left: a building image. Top right: first vanishing point. Bottom left: second vanishing point. Bottom right: third vanishing point.

As for the two previous algorithms, we will restrict ourselves to generating samples from a null hypothesis and comparing NFA(ε) to its threshold ε.

The core of the multisegment detector segments binary sequences into subintervals. Thus, we simulated 1000 binary sequences of size n from an i.i.d. Bernoulli model and counted the number of subintervals detected. Each detected subinterval counts for one false alarm. The results of this simulation are shown in Fig. 12. It appears that, again, for the standard ε = 1 choice, NFA(ε) is between 100 and 1000 times smaller than ε, depending on the sequence size. The ratio NFA(ε)/ε seems to decrease with the sample size and to increase with ε. The effect is due to the fact that the multisegment detector outputs the best multisegment for each binary sequence, and when that one is meaningful there are many other meaningful configurations masked by the best one. The situation is similar to the passage from the non-Gaussian interval detector to the global Gaussianity test.

Fig. 11. Ratios NFA(ε)/ε associated to the vanishing point detector, obtained through 1000 images of size n × n with n = 128.

Fig. 12. Evaluation of the ratio NFA(ε)/ε by simulation. One thousand samples distributed according to a Bernoulli distribution with parameter p = 1/16 are tested using the multisegment detector. The ratio NFA(ε)/ε is increasing with ε and decreasing with the sample size. Its values for the standard ε = 1 choice are small compared to one.

6. Alignment detection according to Desolneux et al.

Let us now turn to the analysis of the alignment detector of Desolneux et al. The goal is again to compare NFA(ε) to ε.

6.1. Preliminaries

In order to link what follows to the general formalism of Section 2, we need to define the a contrario model and the way digital segments are defined (index set I, cues t_i, H0 and NFA_i).
Fig. 13. Left: one segment shown over the level-line orientation field (orthogonal to the gradient orientation field). Right: the number of aligned points up to an angular tolerance θ is counted for each segment. The segment shown has four aligned points among seven.

The set I of all possible segments is simply the set of all couples of pixels, a segment being indexed by its ends. Formally, if C stands for the image grid [1, N]² ∩ N², then I = C². The quality measure t_i of segment i is measured through the level lines: the segment is regularly sampled and at each point in the sample one measures the angle between the level line at this point and the segment itself. Intuitively, if segment i is noticeable in the image, most of the image points belonging to it should have a level line that roughly follows the segment direction. The way Desolneux et al. measure the quality t_i of segment i is then simply to count the number of aligned points in segment i (a point is said to be p-aligned with a segment i if the level line at its position has the same orientation as the segment i up to an angular precision p). Fig. 13 illustrates the way t_i is computed. For a segment i of length n_i, this gives a binary sequence (x_1, …, x_{n_i}) where each x_k tells whether or not the k-th point is p-aligned with i (let us say 0 means p-aligned and 1 not p-aligned). If all the x_k are marked as aligned, the segment is a perfect segment with maximum quality measure t_i = Σ_k 1{x_k = 0}.

Given this measurement sequence (x_1, …, x_{n_i}) associated to segment i, one should decide whether it is a valid detection for a given image size. The general Attneave–Helmholtz principle (see Section 2) suggests to use the NFA criterion

NFA_i = N⁴ · P_H0[T_i ≥ t_i]

for an N × N image. The chosen H0 is a Gaussian white noise; it has the property that each T_i follows a binomial distribution of parameters n_i and p. The null hypothesis corresponds to a situation where no information is actually conveyed in the image. Fig. 14 shows the result of the alignment detector when run on a white noise image (size 256 × 256) and on a natural image.

Fig. 14. Upper left: a white noise image. Upper right: no detected alignments. Bottom left: a building image. Bottom right: the detected alignments. The same detection thresholds, derived from the Attneave–Helmholtz principle, are used in both images. In the white noise image no segments are detected, as specified by the Attneave–Helmholtz principle, while for images that present linear structures many segments are detected.

Notice that, although each test follows a binomial distribution, the situation is not exactly the same as in the example of Section 3. In the case of alignments, the parameter p is constant and the parameter n depends on i, whereas in Section 3 it is the opposite. The set I is usually much bigger than in the previous case, since it is of size N⁴ for an image with N × N pixels.
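In code, testing one candidate segment then reduces to a single binomial tail; a sketch (our naming; the level-line orientation extraction itself is not shown):

```python
from scipy.stats import binom

def alignment_nfa(aligned, n_points, N, p=1.0 / 16.0):
    """NFA of one segment in an N x N image: N^4 tests, binomial tail.

    aligned:  number of p-aligned points observed on the segment (t_i)
    n_points: number of sampled points on the segment (n_i)
    """
    return N ** 4 * binom.sf(aligned - 1, n_points, p)

# For N = 512 and p = 1/16, a perfect 9-point segment gives
# alignment_nfa(9, 9, 512) == 1.0 up to rounding (borderline meaningful),
# while a perfect 8-point segment gives 16 > 1: this matches the smallest
# meaningful length n0 = 9 derived in Section 6.2 below.
```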
6.2. Experiments

The first experiment consists in simulating images of size 512 × 512 according to the null hypothesis H0 and then comparing the NFA(ε) they lead to for various a priori thresholds ε. Fig. 15 sums up the results. As predicted by the theory, NFA(ε) is less than ε. In this case it is approximately a hundred times smaller than ε, and the ratio NFA(ε)/ε seems to be increasing slowly with ε.

Fig. 15. Comparison between ε and NFA(ε) for 1000 images of size 512 × 512 drawn from a Gaussian white noise distribution. More than NFA(ε) itself, it is the ratio NFA(ε)/ε that is the most interesting quantity for us. Once again, as predicted by the theory, this ratio is less than one, with an order of magnitude of 10⁻². It seems that the ratio is slowly increasing with ε.

It seems natural to compare the ratio of Fig. 15 with the value 1/C predicted by Corollary 4 of Section 4. To estimate C one must have some knowledge about ((1−p)/p) · (j_{i,ε}+1)/(n_i − j_{i,ε}). Because it is a burden to compute j_{i,ε} numerically (as opposed to what was done for the introductory example analysis) – the size of I being prohibitively large – we use a concavity argument that can be found, among many other results, in Desolneux et al. (2008).

Proposition 7 (Desolneux et al. (2008)). Using the same notations as in Section 4, let us assume that j_{i,ε} < n_i. Then

∀ i, i′ : n_{i′} ≥ n_i ⟹ j_{i′,ε}/n_{i′} ≤ j_{i,ε}/n_i

This proposition shows that, if n0 is such that

n0 = max{ n_i : j_{i,ε} = n_i − 1 }

then

((1−p)/p) · (j_{i,ε}+1)/(n_i − j_{i,ε}) ≤ ((1−p)/p) · n0

and Corollary 4 becomes

ε/C ≤ NFA(ε) ≤ ε

where 1/C = 1/( ((1−p)/p) · n0 + 1 ).

For an N × N image, the smallest meaningful segment (such that j_{i,ε} = n_i − 1) has length

n0 = ⌈ log(ε/N⁴) / log p ⌉

For example, when N = 512 and p = 1/16, n0 = 9.

We can now compare the previous ratios of Fig. 15 with 1/C (see Fig. 16). Apart from ε = 1 and ε = 10, the computed lower bound is indeed a lower bound. One can observe that the computed lower bound and NFA(ε) are of the same order of magnitude. Moreover, the lower bound follows the same trend as NFA(ε). Our interpretation of the anomalies observed when ε = 1 or ε = 10 is linked to facts already pointed out when we were dealing with Gaussianity detection. First, the lower bound is derived from a worst-case analysis which may be too pessimistic for small ε. Second, the variance is larger for small NFA(ε), which makes the figures less reliable.

Fig. 16. Comparison between NFA(ε)/ε and the factor 1/C in the lower bound derived from Corollary 4. Except for ε = 1 and ε = 10, the computed lower bound is indeed a lower bound. Orders of magnitude are quite faithful.

A natural question is to determine the dependence of 1/C on the two parameters N (the image size) and p. Using the expression

1/C = 1 / ( ((1−p)/p) · ⌈log(ε/N⁴)/log p⌉ + 1 )

it appears that when N grows, 1/C decreases slowly (due to the log term). When p decreases, the term log(1/p) tends to make 1/C increase, while the term (1−p)/p tends to make it decrease, the latter having a more important effect because of the log in the former. So 1/C decreases when N increases or when p decreases. For instance, when N = 256, p = 1/16 and ε = 1, one has 1/C = 0.009 instead of 0.007 when N = 512. And when N = 512 but p = 1/32, still with ε = 1, 1/C = 0.004. Simulations for images of size 64 × 64, 128 × 128 and 256 × 256 tend to confirm the analysis, except for small values of ε, for which the previous caveats hold (see Fig. 17).

Fig. 17. Average number of false alarms for the alignment detector under the null hypothesis over 1000 white noise images. (a) Image size 64 × 64, p = 1/16. (b) Image size 128 × 128. (c) Image size 256 × 256. NFA(ε)/ε is decreasing with the image size.
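These values of n0 and 1/C are straightforward to recompute; a short sketch (ours) of the formulas above:

```python
import math

def inv_C(N, p, eps=1.0):
    """n0 and 1/C from Proposition 7: 1/C = 1 / ((1-p)/p * n0 + 1).
    The small tolerance guards the ceiling against floating-point ties."""
    n0 = math.ceil(math.log(eps / N ** 4) / math.log(p) - 1e-9)
    return n0, 1.0 / ((1.0 - p) / p * n0 + 1.0)

print(inv_C(512, 1 / 16))   # (9, ~0.007)
print(inv_C(256, 1 / 16))   # (8, ~0.008)
print(inv_C(512, 1 / 32))   # (8, ~0.004)
```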
7. NFA with continuous distributions

We saw in the previous sections that the binomial distribution was responsible for the discrepancy between NFA(ε) and ε. This discrepancy should not be considered harmful to the detector itself, because the detector is robust to the choice of ε, as shown in Fig. 18. But when trying to compare human vision thresholds with the ones used by the algorithm, this discrepancy can cause a significant loss of accuracy. We saw in the previous section that the ratio NFA(ε)/ε could be as small as 10⁻³.

Fig. 18. Robustness of the alignment detector of Desolneux et al. to ε. (a) Original image. (b) Detected segments with ε = 10⁻². (c) Detected segments with ε = 1. (d) Detected segments with ε = 100.

It is thence natural to try using continuous statistics instead of binomial ones. Instead of dealing with binary sequences (x_1, …, x_{n_i}) as in the alignment detector of Desolneux et al., where
x_i coded the fact that a point could be p-aligned or not, we keep the level-line angle information as is. Thus, the continuous formulation does not have the angular precision parameter p. We normalize the angles between 0 and 1, so x_1, …, x_{n_i} take values in [0, 1]. The space of possible configurations is a hypercube of dimension n and side 1, see Fig. 19.

Fig. 19. Hypercube of possible configurations.

The perfect segment still has the configuration (0, 0, …, 0). The worst configuration is (1, 1, …, 1). But now, instead of restricting ourselves to the vertices of the hypercube as in the Desolneux et al. detector, we consider all the points inside. In between, a frontier must be defined to decide which configurations are valid and which are not.

There are several possible choices for the frontier family, but once a frontier family is chosen, the decision criterion is given by the Attneave–Helmholtz principle. The Gaussian white noise assumption leads to an H0 model where the level-line angles are independent. As a result, under H0 the hypercube of configurations carries the uniform distribution. The p-value of an event is obtained by integrating the uniform distribution on the hypercube under the corresponding frontier. According to the Attneave–Helmholtz principle, if there are #I candidate line segments in the image, then the probability of accepting a configuration under the H0 hypothesis (which corresponds to the volume delimited by the frontier, since the cube is equipped with the uniform probability measure under the H0 assumption) should be ε/#I, where ε is the NFA threshold.

The selection of the frontier family needs some discussion. A simple idea is to use the family x_1 + x_2 + ⋯ + x_{n_i} = a, but this is not a good choice since it forces all points to have a small angle x_i for the segment to become meaningful. Among all possible frontier families we chose the simple one determined by x_1 · x_2 ⋯ x_{n_i} = a (see Fig. 20). It corresponds to t_i = Σ_k f(x_k) for f = −log. This kind of formulation, but with a different class of f functions, was already proposed by Igual et al. (2007).

Fig. 20. One frontier of the family x_1 x_2 ⋯ x_n = a shown in the hypercube of possible configurations.

This frontier family has some drawbacks. If one point happens to be perfectly aligned, x_i = 0, the segment will be meaningful regardless of the angle direction of the other points. This problem is not new: in the binomial formulation, if the value of p is chosen small enough, one aligned point will also lead to a meaningful segment. On the other hand, x_i = 0 should never happen, except on synthetic images. Unfortunately, due to the discretization procedures this value can appear in practice. This exception can be easily, if not elegantly, handled by noting that there is a minimum attainable precision in the measurement, so x_i can never be zero.

Following Section 2, we need to compute

NFA_i = #I · P[T_i ≥ t_i]

As X_k is uniformly distributed under H0, −log X_k follows an exponential distribution under H0, and T_i a Gamma distribution with parameters (n_i, 1). There is a simple closed form for the associated tail probability P[T_i ≥ x]:

P[T_i ≥ x] = exp(−x) · e_{n−1}(x)

where e_k(z) = 1 + z + z²/2 + ⋯ + z^k/k!.

Proof. The density of a Gamma distribution of parameters (n, 1) has the form t^{n−1} e^{−t}/(n−1)!. Hence

P[T_i ≥ x] = (1/(n−1)!) ∫_x^{+∞} t^{n−1} e^{−t} dt

Integration by parts leads to the sought formula. □

So NFA_i has the closed form

NFA_i = #I · exp(−t_i) · e_{n−1}(t_i)

This gives a simple decision criterion: segment i with configuration (x_1, …, x_{n_i}) is meaningful if and only if

#I · x_1 ⋯ x_{n_i} · e_{n−1}(−log(x_1 ⋯ x_{n_i})) ≤ ε
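The closed form turns the continuous criterion into a few lines of code; a sketch (our naming), using the Gamma tail, which equals exp(−t)·e_{n−1}(t):

```python
import numpy as np
from scipy.stats import gamma

def continuous_nfa(x, num_tests):
    """Continuous NFA of one segment.

    x:         normalized level-line angles in (0, 1], one per segment point
    num_tests: #I, e.g. N**4 candidate segments in an N x N image
    """
    x = np.asarray(x, dtype=float)
    t = -np.log(x).sum()                 # t_i = sum_k -log x_k
    # P[T >= t] for T ~ Gamma(n, 1) equals exp(-t) * e_{n-1}(t);
    # gamma.sf computes the same tail without overflow in e_{n-1}.
    return num_tests * gamma.sf(t, a=len(x))

# Example, in a 512 x 512 image (#I = 512**4): thirty points with small
# angles, continuous_nfa([0.05] * 30, 512 ** 4), give an NFA far below 1
# (a detection), while thirty uniform-looking angles give an NFA above 1.
```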
7.1. Experiments with continuous NFA

The first experiment set shows that the continuous formulation of the NFA effectively controls the number of false detections, see Fig. 21. The table shows the number of detections obtained on white noise images for different ε values. One can see how NFA(ε) approaches ε.
Fig. 21. Average number of false alarms for the alignment detector using the continuous NFA, under the null hypothesis, over 1000 white noise images. (a) Image size 32 × 32. (b) Image size 64 × 64.

The continuous NFA formulation is also useful in practical detection algorithms. Here we show some experiments with the alignment detector in digital images.

Fig. 22 shows comparative detections on an image using both formulations. This is an exhaustive experiment and all the segments with NFA_i < 1 are shown for both formulations. As one can see, both detections are similar.

Fig. 22. Left: original image. Middle: segments found using the classical NFA criterion in an exhaustive search. Right: segments found using the continuous NFA criterion in an exhaustive search.

Fig. 23 shows another set of comparative experiments, done using a heuristic to accelerate the search and the exclusion principle (see Desolneux et al., 2008). Again, the results are different but very similar.

Fig. 23. Left: original image. Middle: segments found using the classical NFA criterion in an exhaustive search. Right: segments found using the continuous NFA criterion in an exhaustive search.
8. Conclusion

In this paper, we saw that, due to quantization, the threshold ε put on the NFA_i of some a contrario algorithms might be far from being attained. It seems that this simple fact was not stated before. We also provided some estimations of the ratio NFA(ε)/ε when the tests follow binomial distributions. These estimations appeared to be quite faithful to the experimental data we simulated. To be precise enough to estimate perception thresholds, we showed that the first step is to replace every discrete distribution involved in the NFA computation by a continuous one. We showed and analyzed a way to do so when dealing with alignment detection.

Acknowledgements

We thank Jean-Michel Morel and Gregory Randall for valuable conversations and suggestions. The research was partially financed by the ALFA project CVFA II-0366-FA and the Direction Générale de l'Armement.

Appendix A

Proof of Proposition 2. Using the theory of log-concave distributions, the proof of this result is straightforward: the Bernoulli distribution is log-concave. Log-concavity is stable by convolution, so the binomial distribution is also log-concave. A log-concave distribution has an increasing hazard function. But for the binomial distribution, this property is completely elementary.

Let us recall that h_X denotes the hazard function, i.e.,

h_X(i) = P[X = i] / P[X ≥ i]

for all i such that P[X ≥ i] > 0. Here, we will assume that X has a binomial distribution with parameters p and n and simply denote h_X by h. The following equality is easy to derive: for i ∈ [1, n],

h(i−1) = f( ((1−p)/p) · (i/(n−i+1)) · h(i) )     (1)

where f : x ∈ R⁺ ↦ x/(x+1). Since h(n) = 1, one has h(n−1) ≤ h(n). Since f is increasing and i/(n−i+1) is an increasing function of i, the proof is completed using equality (1) by induction. □

A.1. Proof of Proposition 3

Proof. X denotes a binomial random variable with parameters n and p, and S_X denotes its survival function S_X(x) = P[X ≥ x]. (T_i) denotes the family of tests and ε a positive number. Let us denote by j_{i,ε} the largest integer such that

#I · S_{T_i}(j_{i,ε}) > ε     (2)

Thus,

∀ j ∈ [0, n] : #I · S_{T_i}(j) ≤ ε ⟺ j ≥ j_{i,ε} + 1     (3)

The equivalence (3) implies that

P[NFA_i ≤ ε] = S_{T_i}(j_{i,ε} + 1)     (4)

Together, Eqs. (2) and (4) lead to

P[NFA_i ≤ ε] > (ε/#I) · S_{T_i}(j_{i,ε}+1) / S_{T_i}(j_{i,ε})

which in turn gives

P[NFA_i ≤ ε] > (1 − h_{T_i}(j_{i,ε})) · ε/#I

where h_{T_i} is the hazard function associated to T_i. Taking the sum over I concludes the proof:

NFA(ε) > (1/#I) Σ_{i∈I} (1 − h_{T_i}(j_{i,ε})) · ε     □

A.2. Proof of Corollary 4

Proof. Eq. (1), the fact that f is an increasing function and the fact that 0 ≤ h ≤ 1 give, when j_{i,ε} ≤ n_i − 1,

h(j_{i,ε}) ≤ f( ((1−p_i)/p_i) · (j_{i,ε}+1)/(n_i − j_{i,ε}) )

The wished result is then a consequence of Proposition 3 and of the fact that when j_{i,ε} = n_i, h(j_{i,ε}) = 1 (so 1 − h(j_{i,ε}) vanishes). □

A.3. Proof of Corollary 5

Proof. Proposition 3 gives

(1/#I) Σ_{i∈I} (1 − h_{T_i}(j_{i,ε})) · ε < NFA(ε)

and Proposition 2 shows that, when j_{i,ε} < n_i,

1 − h_{T_i}(j_{i,ε}) ≥ 1 − h_{T_i}(n_i − 1) = p_i^{n_i} / ( n_i(1−p_i)p_i^{n_i−1} + p_i^{n_i} ) = p_i / ( n_i(1−p_i) + p_i )

Summing over i ends the proof, since when j_{i,ε} = n_i, 1 − h(j_{i,ε}) = 0. □
References

Almansa, A., Desolneux, A., Vamech, S., 2003. Vanishing point detection without any a priori information. IEEE Transactions on Pattern Analysis and Machine Intelligence.
Attneave, F., 1954. Informational aspects of visual perception. Psychological Review 61, 183–193.
Bagnoli, M., Bergstrom, T., 2005. Log-concave probability and its applications. Economic Theory 26 (2), 445–469.
Cao, F., Delon, J., Desolneux, A., Musé, P., Sur, F., 2007. A unified framework for detecting groups and application to shape recognition. Journal of Mathematical Imaging and Vision 27 (2), 91–119.
Delon, J., Desolneux, A., Lisani, J.-L., Petro, A.-B., 2007. A non parametric approach for histogram segmentation. IEEE Transactions on Image Processing 16 (1), 253–261.
Desolneux, A., Moisan, L., Morel, J.-M., 2000. Meaningful alignments. International Journal of Computer Vision 40 (1), 7–23.
Desolneux, A., Moisan, L., Morel, J.-M., 2001. Edge detection by Helmholtz principle. Journal of Mathematical Imaging and Vision 14 (3), 271–284.
Desolneux, A., Moisan, L., Morel, J.-M., 2003a. Computational gestalts and perception thresholds. Journal of Physiology-Paris 97, 311–324.
Desolneux, A., Moisan, L., Morel, J.-M., 2003b. A grouping principle and four applications. IEEE Transactions on Pattern Analysis and Machine Intelligence.
Desolneux, A., Moisan, L., Morel, J.-M., 2008. From Gestalt Theory to Image Analysis: A Probabilistic Approach. Interdisciplinary Applied Mathematics, vol. 34. Springer.
Elandt-Johnson, R., Johnson, N., 1980. Survival Models and Data Analysis. John Wiley & Sons.
Grenander, U., 1956. On the theory of mortality measurement, part II. Skandinavisk Aktuarietidskrift 39, 125–153.
Grompone von Gioi, R., Jakubowicz, J., Morel, J.-M., Randall, G., 2008. On straight line segment detection. Journal of Mathematical Imaging and Vision 32 (3), 313–347.
Hochberg, Y., Tamhane, A., 1987. Multiple Comparison Procedures. John Wiley & Sons, New York.
Igual, L., Preciozzi, J., Garrido, L., Almansa, A., Caselles, V., Rougé, B., 2007. Automatic low baseline stereo in urban areas. Inverse Problems and Imaging 1 (2), 319–348.
Marr, D., 1982. Vision. Freeman and Co.
Moisan, L., Stival, B., 2004. A probabilistic criterion to detect rigid point matches between two images and estimate the fundamental matrix. International Journal of Computer Vision 57 (3), 201–218.
Musé, P., Sur, F., Cao, F., Gousseau, Y., Morel, J.-M., 2006. An a contrario decision method for shape elements recognition. International Journal of Computer Vision 69 (3), 295–315.
Robin, A., Moisan, L., Mascle-Le Hégarat, S., 2005. A multiscale multitemporal land cover classification method using a Bayesian approach. In: Bruzzone, L. (Ed.), Image and Signal Processing for Remote Sensing XI, vol. 5982. SPIE.
Shao, J., 2003. Mathematical Statistics. Springer.
Shapiro, S.S., Wilk, M.B., 1965. An analysis of variance test for normality. Biometrika 52 (3), 591–611.
Stanley, R.P., 1989. Log-concave and unimodal sequences in algebra, combinatorics, and geometry. Annals of the New York Academy of Sciences 576 (1), 500–535.
Veit, T., Cao, F., Bouthemy, P., 2006. An a contrario decision framework for region-based motion detection. International Journal of Computer Vision 68 (2), 163–178.
Wertheimer, M., 1923. Untersuchungen zur Lehre von der Gestalt. II. Psychologische Forschung 4, 301–350.
