Probit and Logit

Welcome
ASSIGNMENT ON PROBIT AND LOGIT
Ag. Stat . 534 : Statistical Methods for Crop Protection - II
SPEAKER
SUBMITTED TO
Arjunsinh Rathava
Reg. No. - 1010121001 Dr. A. D. Kalola
PhD (2nd Sem.) Professor and Head
Dep. of Plant Pathology Department of Agricultural
BACA, AAU, Anand Statistics, AAU, Anand
3
Background
 The idea of probit analysis was originally published in Science by
Chester Ittner Bliss in 1934.
 He worked as an entomologist for the Connecticut agricultural
experiment station and was primarily concerned with finding an
Chester Ittner Bliss
effective pesticide to control insects that fed on grape leaves
(Greenberg 1980).
 By plotting the response of the insects to various concentrations of
pesticides, he could visually see that each pesticide affected the insects
at different concentrations, i.e. one was more effective than the other.
 However, he didn’t have a statistically sound method to compare this
difference.
4  The most logical approach would be to fit a regression of the response
versus the concentration or dose and compare between the different

pesticides.
 The relationship of response to dose was sigmoid in nature and at the time
regression was only used on linear data.
 Therefore, Bliss developed the idea of transforming the sigmoid dose-
response curve to a straight line. In 1952, a professor of statistics at the
University of Edinburgh by the name of David Finney took Bliss’ idea and
wrote a book called Probit Analysis (Finney 1952).
 Today, probit analysis is still the preferred statistical method in
understanding dose-response relationships.
5
 Probit analysis is a type of regression used to analyze binomial

response variables. It transforms the sigmoid dose-response curve
to a straight line that can then be analyzed by regression either
through least squares or maximum likelihood.
 Probit analysis can be conducted by one of three techniques:
1. Using tables to estimate the probits and fitting the relationship by
eye
2. Hand calculating the probits, regression coefficient, and
confidence intervals
3. Having a statistical package such as SPSS.
 The Basics Probit Analysis is a specialized regression model of binomial
6 response variables.
 Remember that regression is a method of fitting a line to your data to compare
the relationship of the response variable or dependent variable (Y) to the
independent variable (X).
Y=a+bX+e
Where,
a = y-intercept
b = the slope of the line
e = error term
 Also remember that a binomial response variable refers to a

response variable with only two outcomes.
For example:
• Flipping a coin: Heads or tails
• Testing beauty products: Rash/no rash
• The effectiveness or toxicity of pesticides: Death/no death
7 Applications
 Probit analysis is used to analyze many kinds of dose-response or
binomial response experiments in a variety of fields.
 It is commonly used in toxicology to determine the relative toxicity of
chemicals to living organisms.
 This is done by testing the response of an organism under various
concentrations of each of the chemicals and then comparing the
concentrations at which one encounters a response.
 As discussed above, the response is always binomial (e.g. death/no death)
and the relationship between the response and the various concentrations
is always sigmoid.
 Probit analysis acts as a transformation from sigmoid to linear and then
runs a regression on the relationship.
 Once a regression is run, the researcher can use the output of the probit
analysis to compare the amount of chemical required to create the same
response in each of the various chemicals.
8 Advantages of logit model
 Transformation of a dependent dichotomous dependent

variable into continuous variable
 Results - easily interpretable
 Simple to analyse method
 It gives parameter estimates- asymptotically consistent,
efficient and normal, so that the analogue by the
regression t-test can be applied.
9 Limitation
 As in case of logit probility model, the disturbance term

in logit model heteroscedasticity and therefore we
should go for weighted least squares.
 As in many other regression, there may be problem of
multicollinearity if the explanatory variable are related
among themselves
10 How does probit analysis work?
How to get from dose-response curve to an LC50?
 There are step by step guide to using probit analysis

with various methods.
 The easiest by far is to use a statistical package such as
SPSS, SAS, R or S, but it is good to see the history of
the methodology to get a thorough understanding of the
material.
Step 1: Convert % mortality to probits
(short for probability unit)
11
 Method A:
 Determine probits by looking up those corresponding to the % responded in
Finney’s table (Finney 1952):
 For example, for a 17% response, the corresponding probit would be

4.05. Additionally, for a 50% response (LC50), the corresponding probit
would be 5.00.
Method B: Hand calculations (Finney and Stevens 1948):

12
 The probit Y, of the proportion P is defined by:

 The standard method of analysis makes use of the maximum and minimum
working probits:
And the range 1/Z where

Method C: Computer software such as SPSS, SAS, R, or S convert the

per cent responded to probits automatically.
Step 2: Take the log of the concentrations
13
 This can either be done by hand if doing hand
calculations, or specify this action in the computer
program of choice.
 For example, after clicking Analyze, Regression, Probit,

choose the log of your choice to transform:
Step 3: Graph the probits versus the log of the
14 concentrations and fit a line of regression.
Note:
 Both least squares and maximum likelihood are acceptable techniques to
fitting the regression, but maximum likelihood is preferred because it
gives more precise estimation of necessary parameters for correct
evaluation of the results (Finney 1952).
Method A:
 Hand fit the line by eye that minimizes the space between the line and the
data (i.e. least squares).
 Although this method can be surprisingly accurate, calculating a
regression by hand or using computer program is obviously more
precise.
 In addition, hand calculations and computer programs can provide
confidence intervals.
Method B:
Hand calculate the linear regression by using the following
15
method (Finney and Stevens 1948):
 First set the proportion responding to be equal to p = r/n and the complement equal to q = 1-
p.
 The probits of a set value of p should be approximately linearly related to x, the measure of
the stimulus and a line fitted by eye may be used to give a corresponding set of expected
probits, Y.
 The working probit corresponding to each proportion is next calculated from either of the
following equations:
 Next a set of expected probits is then derived from the weighted linear regression equation
of working probits on x, each y being assigned a weight, nw, where the weighting
coefficient, w, is defined as:
 The process is repeated with the new set of Y values. The iteration converges to give you a
linear regression.
16 Method C: Use a computer program. SPSS uses maximum
likelihood to estimate the linear regression.
To run the probit analysis in SPSS, follow the following simple
steps:
 Simply input a minimum of three columns into the Data Editor
 Number of individuals per container that responded
 Total of individuals per container
 Concentrations
Screen 1:
After you columns are set, simply go to analyze, regression,
17
probit SPSS Data software:
 On screen, a_mort is the number of individuals that, a_total is the

total number of individuals per container and a_conc are the
concentrations.
 Row 7 in the following example is data from the control where 0 out
of 10 responded at a concentration of 0.
18 Screen 2:
Click
Click
Click
Screen 3:
19
 Then set your number responded column as the “Response Frequency”,

the total number per container as the “Total Observed”, and the
concentrations as the “Covariates”. Don’t forget to select the log base 10 to
transform your concentrations.
 Click on the Options…button to select the Statistics,
Natural Response Rate and the criteria for iterations.
20
Click on the Continue button.
21
 Click the OK button in the Probit Analysis dialog box

to run the analysis. The output will be displayed in a new
SPSS Viewer window.
Step 4: Find the LC50
Method A: Using your hand drawn graph, either created by eye or by
calculating the regression by hand, find the probit of 5 in
the y-axis, then move down to the x-axis and find the log
of the concentration associated with it. Then take the
inverse of the log, You have the LC50.

Method B: The LC50 is determined by searching the probit list for a probit of 5.00
and then taking the inverse log of the concentration it is associated
with.
Step 5: Determine the 95% confidence intervals:
23
 Method A: Hand calculate using the following equation:
 The standard error is approximately: ± 1/b √(Snw)
b = estimate of the slope of the line
Snw = summation of nw
w = weighted coefficient from table III = Z²/PQ Finney,
1952
 Method B: SPSS and other computer programs calculate this

automatically
Notes of Interest for Probit Analysis
24
 Probit analysis assumes that the relationship between number
responding (not per cent response) and concentration is normally
distributed.
 If data are not normally distributed, logit (more on this below) is

preferred.
 Must correct data if there is more than 10% mortality in the control
The method is to use the Schneider-Orelli’s (1947)
25 formula
% Responded – % Responded in Control
Corrected = ----------------------------------------------------------- x 100
100% - Responded % in Control
For example:
 Let’s say we have 20% mortality in the control and correcting the mortality
rate for the concentration where 60% occurred.
 Plug in the mortality rates into the equation above and We got mortality of
50% instead of the original 60%.
60% – 20%
Corrected = ---------------- x 100 = 50%
100% – 20%
26 Example: By using software
 How effective is a new pesticide at killing ants and what is an
appropriate concentration to use?
 You might perform an experiment in which you expose samples of

ants to different concentrations of the pesticide and then record the
number of ants killed and the number of ants exposed.
 Applying probit analysis to these data, you can determine the

strength of the relationship between concentration and killing and
you can determine what the appropriate concentration of pesticide
would be if you wanted to be sure to kill, say, 95% of exposed ants.
27 Logit Analysis
History
 The logit model was introduced by Joseph Berkson in 1944, who
coined the term.
 The term was borrowed by analogy from the very similar probit
Joseph Berkson model developed by Chester Ittner Bliss in 1934.
 G. A. Barnard in 1949 coined the commonly used term log-odds; the

log-odds of an event is the logit of the probability of the event.
 The logit function is the inverse of the sigmoidal "logistic" function
28
used in mathematics, especially in statistics.
 Log-odds and logit are synonyms.
 Logit Function
 This is called the logit function logit (Y) = log[O(Y)] = log[y/(1-y)]
Why would we want to do this?
 At first, this was computationally easier than working with normal
distributions.
 Now, it still has some nice properties that we’ll investigate next
time with multinomial dependent variables.
 The density function associated with it is very close to a standard
normal distribution
Definition
29  The logit of a number p between 0 and 1 is given by the formula:
 The base of the logarithm function used is of little importance given

here, as long as it is greater than 1, but the natural logarithm with base
e is the one most often used.
 The "logistic" function of any number is given by the inverse-logit:
 If p is a probability then p/(1 − p) is the corresponding odds and the

logit of the probability is the logarithm of the odds; similarly the
difference between the logits of two probabilities is the logarithm of
the odds ratio (R), thus providing a short hand for writing the correct
combination of odds ratios only by adding and subtracting:
30 Uses and properties
 The logit in logistic regression is a special case of a link
function in a generalized linear model: it is the canonical
link function for the binomial distribution.
 The logit function is the negative of the derivative of the
binary entropy function.
 The logit is also central to the probabilistic model for
measurement, which has applications in psychological
and educational assessment, among other areas.
 The inverse-logit function is also sometimes referred to as
the expit function.
31 Sample size: Logit model and Probit model
 The minimum number of cases per independent variable is 10

(Hosmer and Lemeshow, Applied Logistic Regression )
 For example If we are using 8 independent variables, then

minimum sample size should be = 8 x 10= 80
Logit vs Probit
32
Logit Probit
Slightly flatter tails The conditional probability Pi
approaches 0 or 1 at a faster
rate
Basis of logit model is standard Basis of probit model is
logistic distribution standard normal distribution
Variance = Π2 / 3 Variance = 1
Simple mathematics Sophisticated mathematics
 Both give same result, preference of the method depends on

the researcher choice but logit regression is mostly preferred
33 Conclusion
 Probit Analysis is a type of regression used with binomial response variables.
It is very similar to logit, but is preferred when data are normally distributed.
 Most common outcome of a dose-response experiment in which probit analysis
is used is the LC50 / LD50.
 Probit analysis can be done by eye, through hand calculations or by using a
statistical program.
 Researcher should be well aware of the different model and used according to
the defined research problem.
 Logit and probit model are being extensively used in health science, behavioral
and social sciences.
 Models are extensively used in social research when dependent variable is
dichotomous.
34
Thank
you…
35 References
Finney, D. J., Ed. (1952). Probit Analysis. Cambridge, England, Cambridge
University Press.
Finney, D. J. and W. L. Stevens (1948). "A table for the calculation of working
probits and weights in probit analysis." Biometrika 35(1-2): 191-201.
Greenberg, B. G. (1980). "Chester I. Bliss, 1899-1979." International Statistical
Review / Revue Internationale de Statistique 8(1): 135-136.
Hahn, E. D. and R. Soyer. (date unknown). "Probit and Logit Models: Differences
in a Multivariate Realm." Retrieved May 28, 2008, from
http://home.gwu.edu/~soyer/mv1h.pdf.

Probit and Logit

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Probit and Logit

Uploaded by

Copyright:

Available Formats

Welcome

ASSIGNMENT ON PROBIT AND LOGIT

Ag. Stat . 534 : Statistical Methods for Crop Protection - II

versus the concentration or dose and compare between the different

 Probit analysis is a type of regression used to analyze binomial

 Also remember that a binomial response variable refers to a

 Transformation of a dependent dichotomous dependent

 As in case of logit probility model, the disturbance term

 There are step by step guide to using probit analysis

 For example, for a 17% response, the corresponding probit would be

And the range 1/Z where

Method C: Computer software such as SPSS, SAS, R, or S convert the

 For example, after clicking Analyze, Regression, Probit,

 Simply input a minimum of three columns into the Data Editor

 Number of individuals per container that responded

 Total of individuals per container

 On screen, a_mort is the number of individuals that, a_total is the

 Then set your number responded column as the “Response Frequency”,

 Click the OK button in the Probit Analysis dialog box

 Method B: SPSS and other computer programs calculate this

 If data are not normally distributed, logit (more on this below) is

 You might perform an experiment in which you expose samples of

 Applying probit analysis to these data, you can determine the

Joseph Berkson model developed by Chester Ittner Bliss in 1934.

 G. A. Barnard in 1949 coined the commonly used term log-odds; the

 The base of the logarithm function used is of little importance given

 If p is a probability then p/(1 − p) is the corresponding odds and the

 The minimum number of cases per independent variable is 10

 For example If we are using 8 independent variables, then

 Both give same result, preference of the method depends on

You might also like