Royal Statistical Society and Wiley are collaborating with JSTOR to digitize, preserve and
extend access to Journal of the Royal Statistical Society. Series B (Methodological)
By MERVYN J. SILVAPULLE
Australian National University, Canberra
SUMMARY
Necessary and sufficient conditions are given for the existence of maximum likelihood
estimators of the linear regression parameter in binomial response (this includes Logit and
Probit) models.
1. INTRODUCTION
THE question of existence of maximum likelihood estimators (mle) for Logit models arose in an
analysis of the relationship of psychiatric "caseness" to scores on a psychiatric screening
questionnaire. Tennant (1977) administered the GHQ (General Health Questionnaire,
Goldberg, 1972) to 120 patients attending a General Practitioner's surgery, and also gave each
one a standardized psychiatric interview. From the interview, patients were classified as
Psychiatric Case/Non-case. In a secondary analysis of Tennant's data, Duncan-Jones and
Henderson (1978) fitted a Logit regression of "caseness" on GHQ Score, and obtained a good fit
with the model
$$\operatorname{Logit}\{\operatorname{Prob}(\text{Case})\} = \beta_1 + \beta_2 x, \qquad (1)$$
where $x$ = GHQ Score and $\operatorname{Logit}(t) = \log\{t/(1 - t)\}$, for the full set of data. In a more detailed
analysis, Duncan-Jones encountered problems in attempting to fit a separate Logit regression
for males (Table 1), though the data for females gave a satisfactory fit. For illustration, we have
used the 12-item version of the GHQ Score. The same problem arose in these data when using
the 30-item version.
TABLE 1
Number of patients classified by the GHQ score and the outcome of a standardized psychiatric
interview (Case/Non-case)
GHQ score            0   1   2   3   4   5   6   7   8   9  10  11  12  Total
Males    Cases       0   0   1   0   1   3   0   2   0   0   1   0   0      8
         Non-cases  18   8   1   0   0   0   0   0   0   0   0   0   0     27
Females  Cases       2   2   4   3   2   3   1   1   3   1   0   0   0     22
         Non-cases  42  14   5   1   1   0   0   0   0   0   0   0   0     63
where $G$ is a distribution function, $\beta = (\beta_1, \ldots, \beta_p)$ and $x_i\beta$ is the usual inner product
$x_{i1}\beta_1 + \cdots + x_{ip}\beta_p$. For convenience, we shall assume that $y_i = 1$, $i = 1, \ldots, r$ and $y_i = 0$,
$i = r + 1, \ldots, n$ for some $0 < r < n$. Writing $l(\beta)$ for $-\log$ (likelihood), we have
$$l(\beta) = -\sum_{i=1}^{r} \log G(x_i\beta) - \sum_{i=r+1}^{n} \log\{1 - G(x_i\beta)\}. \qquad (3)$$
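As a minimal sketch of (3), assuming the Logit case $G(t) = 1/(1 + e^{-t})$ and $x_i = (1, \text{GHQ Score}_i)$, the function below evaluates $l(\beta)$ from counts grouped by score. The names `log_logistic` and `neg_log_lik` and the grouped-data layout are illustrative choices, not part of the original analysis.

```python
import math

def log_logistic(t):
    """Numerically stable log G(t) for the logistic G(t) = 1 / (1 + exp(-t))."""
    # log G(t) = -log(1 + exp(-t)); branch on the sign of t to avoid overflow.
    if t >= 0:
        return -math.log1p(math.exp(-t))
    return t - math.log1p(math.exp(t))

def neg_log_lik(beta, scores, cases, non_cases):
    """l(beta) of (3) for the Logit model with x_i = (1, score_i), grouped by score."""
    total = 0.0
    for x, n_case, n_non in zip(scores, cases, non_cases):
        eta = beta[0] + beta[1] * x
        total -= n_case * log_logistic(eta)   # -log G(x_i beta) for each Case
        total -= n_non * log_logistic(-eta)   # -log{1 - G(x_i beta)}; 1 - G(t) = G(-t)
    return total

# Female counts from Table 1, GHQ scores 0..12.
scores = list(range(13))
female_cases = [2, 2, 4, 3, 2, 3, 1, 1, 3, 1, 0, 0, 0]
female_non_cases = [42, 14, 5, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# At beta = (0, 0) each of the 85 observations contributes log 2.
print(neg_log_lik((0.0, 0.0), scores, female_cases, female_non_cases))
```

Working with $\log G$ directly, rather than forming $G$ and then taking logarithms, keeps the evaluation finite for large $|x_i\beta|$.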
(i) The mle $\hat\beta$ of $\beta$ exists and the minimum set $\{\hat\beta\}$ is bounded only when $\Pi$ is satisfied.
(ii) Suppose that $l(\beta)$ is a proper closed convex function on $R^p$. Then the mle $\hat\beta$ exists and the
minimum set $\{\hat\beta\}$ is bounded if and only if $\Pi$ is satisfied.
(iii) Suppose that $-\log G$ and $-\log(1 - G)$ are convex and $x_{i1} = 1$ for every $i$. Then $\hat\beta$ exists
and the minimum set $\{\hat\beta\}$ is bounded if and only if $S \cap F \neq \emptyset$. Let us further assume that
$G$ is strictly increasing at every $t$ satisfying $0 < G(t) < 1$. Then $\hat\beta$ is uniquely defined if and
only if $S \cap F \neq \emptyset$.
As an application, let us consider the Logit (G = Logistic) and Probit (G = Normal) models.
Maximum likelihood estimation in these important models is discussed in Cox (1970) and
Finney (1971) respectively. It may be verified directly by evaluating the second derivatives that
$-\log G$ and $-\log\{1 - G\}$ are convex. Therefore, assuming that the response model includes a
constant term, it follows from part (iii) of the above theorem that the mle $\hat\beta$ is uniquely defined if
and only if $S \cap F \neq \emptyset$.
McFadden (1976) refers to the cases $G$ = Cauchy and $G$ = Uniform (0, 1). When $G$ is Cauchy,
$-\log G$ and $-\log\{1 - G\}$ are not convex. Therefore, $l(\beta)$ is not convex in general and it may
have multiple minima. It seems that this problem will arise whenever $G$ has tails heavier than
those of the Logistic distribution. For instance, suppose that $G(t) \sim |t|^{-q}$ as $t \to -\infty$ for some $q > 0$.
Then $(d^2/dt^2)\{-\log G(t)\} \sim -q\,t^{-2}$ for large negative $t$. Therefore, $-\log G$ is not convex. Now,
let us consider the case when $G$ = Uniform (0, 1) and $x_{i1} = 1$ for every $i$. Clearly, $-\log G$ and
$-\log\{1 - G\}$ are convex and $G$ is strictly increasing on (0, 1). Therefore, by part (iii) of the above
theorem, we conclude that the mle $\hat\beta$ exists uniquely if and only if $S \cap F \neq \emptyset$.
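The tail argument can be checked numerically. The sketch below, which assumes the standard Cauchy distribution function $G(t) = 1/2 + \arctan(t)/\pi$ (so that $q = 1$), approximates the second derivative of $-\log G$ by a central difference and compares it with the Logistic case. The helper names and the step size `h` are illustrative choices.

```python
import math

def cauchy_cdf(t):
    """Standard Cauchy distribution function."""
    return 0.5 + math.atan(t) / math.pi

def logistic_cdf(t):
    """Standard Logistic distribution function."""
    return 1.0 / (1.0 + math.exp(-t))

def neg_log(G):
    """Return the function t -> -log G(t)."""
    return lambda t: -math.log(G(t))

def second_derivative(f, t, h=1e-2):
    """Central second-difference approximation to f''(t)."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / (h * h)

for t in (-5.0, -10.0):
    d2_cauchy = second_derivative(neg_log(cauchy_cdf), t)
    d2_logistic = second_derivative(neg_log(logistic_cdf), t)
    # Negative curvature means -log G is not convex near t.
    print(t, d2_cauchy, d2_logistic)
```

For the Cauchy case the approximated curvature is negative at both points, of the order of $-1/t^2$, while for the Logistic case it is positive.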
For most practical purposes the third part of the above theorem is sufficient. To give a simple
illustration of the general ideas involved in the main theorem, let us consider the set of data in
Table 1 for males. The model in consideration is (1) which is the same as (2) with $G$ the Logistic
distribution and $x_i = (1, (\text{GHQ Score})_i)$. The convex cones $S$ and $F$ are shown in Fig. 1. Since $F$ is
open (relative to the vector space spanned by all the $x_i$'s corresponding to Non-cases) it is the
cone which lies between (not including) $OA$ and $OB$. Similarly, $S$ does not include $OB$ and $OC$.
Clearly $S$ and $F$ are disjoint and are separated by the vector (1, 2). Hence, by the theorem, $\hat\beta$ does
not exist. Note that the vector $e = (-2, 1)$, which is orthogonal to (1, 2), is such that $x_i e \le 0$ for
Non-cases and $x_i e \ge 0$ for Cases, and so, from (3), $l(\beta + ke)$ is decreasing in $k$ for any $\beta$.
[Fig. 1. The convex cones $S$ and $F$ in the $(x_1, x_2)$ plane, with the points (1, 0) and (1, 2) marked.]
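The monotone decrease of $l(\beta + ke)$ along $e = (-2, 1)$ can be seen directly from the male counts in Table 1. The following is a minimal sketch for the Logit model. The starting point $\beta = (0, 0)$, the grid of $k$ values and the function names are arbitrary illustrative choices.

```python
import math

def log_logistic(t):
    """Numerically stable log G(t) for the logistic G(t) = 1 / (1 + exp(-t))."""
    if t >= 0:
        return -math.log1p(math.exp(-t))
    return t - math.log1p(math.exp(t))

def neg_log_lik(beta, scores, cases, non_cases):
    """l(beta) of (3) for the Logit model with x_i = (1, score_i), grouped by score."""
    total = 0.0
    for x, n_case, n_non in zip(scores, cases, non_cases):
        eta = beta[0] + beta[1] * x
        total -= n_case * log_logistic(eta)
        total -= n_non * log_logistic(-eta)
    return total

# Male counts from Table 1, GHQ scores 0..12.
scores = list(range(13))
male_cases = [0, 0, 1, 0, 1, 3, 0, 2, 0, 0, 1, 0, 0]
male_non_cases = [18, 8, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

beta = (0.0, 0.0)   # arbitrary starting point (assumption)
e = (-2.0, 1.0)     # the direction named in the text

values = []
for k in range(0, 21, 5):
    shifted = (beta[0] + k * e[0], beta[1] + k * e[1])
    values.append(neg_log_lik(shifted, scores, male_cases, male_non_cases))

# l(beta + k e) keeps decreasing and never attains its infimum.
print(values)
```

Along this direction only the two observations with GHQ = 2 (one Case, one Non-case) have $x_i e = 0$, so the values decrease towards $2 \log 2$ without reaching it.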
By contrast, if there is an additional observation which is either a Non-case with GHQ > 2 or
a Case with GHQ < 2, then $S \cap F$ is no longer empty, and so, by the theorem, $\hat\beta$ exists. In this event
there is no vector separating $S$ and $F$ and thus no vector corresponding to $e$ above. Therefore, (3)
implies that for any $\beta$ and $e$, $l(\beta + ke)$ increases in $k$ for large $k$, that is $l(\beta)$ increases in any
direction eventually. Now, since $l(\beta)$ is convex it is intuitively clear that it must have a minimum.
A figure similar to Fig. 1 for the females shows that $S \cap F \neq \emptyset$ and the mle $\hat\beta$ exists. Let us remark
here that if all the Non-cases correspond to $x_i = (1, 2)$ then $F$ is the line $OB$, not the empty set.
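For the two-parameter model with $x_i = (1, \text{GHQ Score}_i)$, the condition $S \cap F \neq \emptyset$ can be checked without drawing the cones. The sketch below rests on a reduction that is our reading rather than a statement in the text: each cone is generated by points $(1, g)$, so $S$ and $F$ intersect exactly when the relative interiors of the score ranges of Cases and Non-cases intersect. The function names are hypothetical, and the sketch assumes that the scores are not all identical.

```python
def in_relative_interior(lo, hi, t):
    """Is t in the relative interior of [lo, hi]? A single point is its own interior."""
    if lo == hi:
        return t == lo
    return lo < t < hi

def cones_intersect(case_scores, non_case_scores):
    """Check S and F for overlap when x_i = (1, score_i), via the score ranges."""
    c_lo, c_hi = min(case_scores), max(case_scores)
    n_lo, n_hi = min(non_case_scores), max(non_case_scores)
    if c_lo == c_hi:
        return in_relative_interior(n_lo, n_hi, c_lo)
    if n_lo == n_hi:
        return in_relative_interior(c_lo, c_hi, n_lo)
    # Two open intervals overlap iff the larger left end is below the smaller right end.
    return max(c_lo, n_lo) < min(c_hi, n_hi)

# GHQ scores with non-zero counts in Table 1.
male_cases = [2, 4, 5, 7, 10]
male_non_cases = [0, 1, 2]
female_cases = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
female_non_cases = [0, 1, 2, 3, 4]

print(cones_intersect(male_cases, male_non_cases))         # males
print(cones_intersect(female_cases, female_non_cases))     # females
print(cones_intersect(male_cases, male_non_cases + [3]))   # extra male Non-case, GHQ = 3
print(cones_intersect(male_cases, [2, 2, 2]))              # all Non-cases at GHQ = 2
```

The last call corresponds to the closing remark: when every Non-case has GHQ = 2, $F$ is the line $OB$, which $S$ does not include, so the two cones remain disjoint.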
ACKNOWLEDGEMENT
I am grateful to Professor C. R. Heathcote, Mr P. Duncan-Jones and the referees for their
useful comments. Also, I am grateful to Dr C. Tennant for allowing me to use his data; the full
data have not been published previously and the data in Table 1 have been made available by
Mr P. Duncan-Jones.
APPENDIX
In this Appendix we shall explain most of the technical terms (not in their full generalities) in
Convex Analysis that are used in this paper. For precise definitions the reader is referred to
Rockafellar (1972). However we believe that this Appendix should be sufficient to understand
the essential points.
A convex cone $C$ in $R^p$ is a convex set such that $kx \in C$ whenever $x \in C$ and $k > 0$. The convex
cone $S_0$ generated by $x_1, \ldots, x_r$ is $\{k_1 x_1 + \cdots + k_r x_r \mid k_i \ge 0\}$. The relative interior $S$ of $S_0$ is the
interior of $S_0$ with respect to $\{x - y \mid x, y \in S_0\}$, which is the sub-vector space spanned by $S_0$.
Let $C_1$ and $C_2$ be non-empty sets in $R^p$ and $H = \{x \mid xe = 0\}$ be a hyperplane. The two closed
halfspaces associated with $H$ are defined as $\{x \mid xe \ge 0\}$ and $\{x \mid xe \le 0\}$. We say that $H$ separates
$C_1$ and $C_2$ if $C_1$ is contained in one closed halfspace and $C_2$ is contained in the other. In
addition, if $C_1 \cup C_2$ is not contained in $H$, then $H$ is said to separate $C_1$ and $C_2$ properly.
Let $f$ be a convex function on $R^p$. We say that $f$ is proper if it is finite on a non-empty convex
set $C$ and takes the value $+\infty$ outside $C$. A proper convex function is closed if it is lower semi-continuous.
The minimum set $\{\hat{x}\}$ of $f$ is $\{\hat{x} \in R^p \mid f(\hat{x}) = \inf f(y)\}$, where the infimum is taken over
$y \in R^p$. A direction of recession of a proper closed convex function $f$ is a unit vector $e \in R^p$ for
which there exists $x \in R^p$ such that $f(x) < \infty$ and $f(x + ke)$ is a non-increasing function of $k$ for
large $k$.
REFERENCES
COX, D. R. (1970). Analysis of Binary Data. London: Chapman and Hall.
DUNCAN-JONES, P. and HENDERSON, A. S. (1978). The use of a two-phase design in a population survey. Social
Psychiatry, 13, 231-237.
FINNEY, D. J. (1971). Probit Analysis, 3rd ed. Cambridge: Cambridge University Press.
GOLDBERG, D. P. (1972). The Detection of Psychiatric Illness by Questionnaire. (Institute of Psychiatry Maudsley
Monographs, No. 21) London: Oxford University Press.
McFADDEN, D. (1976). Quantal choice analysis, a survey. Ann. Econ. Soc. Meas., 4, 363-390.
ROCKAFELLAR, R. T. (1972). Convex Analysis. Princeton, N.J.: Princeton University Press.
TENNANT, C. (1977). The general health questionnaire: a valid index of psychological impairment in Australian
populations. Med. J. Aust., 2, 392-394.