STATISTICS AND RESEARCH DESIGN

Logistic regression: Part 1


Nikolaos Pandis, Associate Editor of Statistics and Research Design
Bern, Switzerland, and Corfu, Greece

Department of Orthodontics and Dentofacial Orthopedics, School of Dental Medicine/Medical Faculty, University of Bern, Bern, Switzerland; private practice, Corfu, Greece.
Am J Orthod Dentofacial Orthop 2017;151:824-5
0889-5406/$36.00
© 2017 by the American Association of Orthodontists. All rights reserved.
http://dx.doi.org/10.1016/j.ajodo.2017.01.017

In the article discussing the chi-square test,1 I used a clinical trial scenario with the objective of assessing the clinical alignment efficiency of 2 types of wires. These wires (A and B) were used for 6 months in 2 patient groups, and the outcome recorded was binary: reaching complete alignment (success) or not reaching complete alignment (failure).

Table I shows the tabulation of alignment successes and failures for each wire after 6 months of treatment and the calculation of risks and odds of success, and risk and odds ratios of alignment success vs failure.

The chi-square test showed no evidence of a difference in the success of alignment after 6 months between the 2 wire groups; the P value was 0.36.

The same result can be obtained using a special type of regression analysis called logistic regression, which is used when the outcome is binary (alignment: yes/no). Remember that linear regression is used when the outcome is continuous (eg, millimeters of crowding alleviation). In logistic regression, we can get effect estimates, P values, and confidence intervals directly from the regression output. The effect estimates are handled as log odds ratios (log OR, where log is the natural logarithm), because they have appropriate mathematical properties (they can range from \(-\infty\) to \(+\infty\)). We can convert the log ORs to odds ratios (ORs), which are more interpretable, by exponentiating them (OR = exp[log OR]). Logistic regression has a similar form to the linear regression model (\(y = a + bx\)) in the sense that the components are linearly related on the logarithmic scale (when using log ORs). However, in logistic regression, the response or dependent variable y is the log odds \(\log(p/(1-p))\), which is called the logit:

\[ \log\left(\frac{p}{1-p}\right) = a + b \cdot x \qquad \text{(equation 1)} \]

where a is the intercept (constant), b is the regression coefficient of x, and x is the categorical predictor, with 2 categories in our example (wire A or wire B).

Specifically, in the above equation that pertains to the logistic regression model, a is the log odds of reaching alignment in patients in the control group, which we assume here is the group with wire B (reference). In the above equation, b is the log OR of reaching alignment in patients fitted with wire A vs patients fitted with wire B.

In a bit more detail, we have groups A and B, and the risk (proportion) of the event for A is \(p_A\), whereas the risk of the event for B is \(p_B\). The odds of the event would be \(p_A/(1-p_A)\) for the wire A group and \(p_B/(1-p_B)\) for the wire B group, and their natural logarithms would be \(\log(p_A/(1-p_A)) = \text{logit}(p_A)\) and \(\log(p_B/(1-p_B)) = \text{logit}(p_B)\), respectively. Then the OR of the event in group A compared with group B would be

\[ \frac{p_A/(1-p_A)}{p_B/(1-p_B)} \qquad \text{(equation 2)} \]

and the logarithm of the OR would be

\[ \log\left(\frac{p_A/(1-p_A)}{p_B/(1-p_B)}\right) = \log\left(\frac{p_A}{1-p_A}\right) - \log\left(\frac{p_B}{1-p_B}\right) = \text{logit}(p_A) - \text{logit}(p_B) \qquad \text{(equation 3)} \]

If we use the values 0 and 1 for wires B and A, respectively, and after appropriate substitutions in equation 1 and using equations 2 and 3, we arrive at the following.

For wire B (x = 0):

\[ \log\left(\frac{p}{1-p}\right) = a + b \cdot x = a + b \cdot 0 = a \quad \text{or} \quad \log\left(\frac{p}{1-p}\right) = \log\left(\frac{p_B}{1-p_B}\right) \]

and for wire A (x = 1):

\[ \log\left(\frac{p}{1-p}\right) = a + b \cdot x = a + b \cdot 1 = a + b \]

and

\[ b = \log\left(\frac{p}{1-p}\right) - a = \log\left(\frac{p_A}{1-p_A}\right) - \log\left(\frac{p_B}{1-p_B}\right) = \log\left(\frac{p_A/(1-p_A)}{p_B/(1-p_B)}\right) \]

Remember that b is the log OR, our estimate, of reaching alignment in patients fitted with wire A vs patients fitted with wire B; after exponentiation, this gives the OR.
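The relationship between the intercept a, the coefficient b, and the OR can be checked numerically. The short Python sketch below is only an illustration under stated assumptions: it uses the counts reported in Table I (23 of 31 aligned with wire A, 19 of 30 with wire B), and the helper name `logit` is introduced here for convenience and is not part of the article.

```python
import math


def logit(p):
    """Natural log of the odds p / (1 - p)."""
    return math.log(p / (1.0 - p))


# Counts taken from Table I of the article.
success_a, total_a = 23, 31   # wire A, coded x = 1
success_b, total_b = 19, 30   # wire B, coded x = 0 (reference)

p_a = success_a / total_a     # risk of alignment with wire A
p_b = success_b / total_b     # risk of alignment with wire B

a = logit(p_b)                # intercept: log odds in the reference group
b = logit(p_a) - logit(p_b)   # coefficient: log OR for A vs B (equation 3)
odds_ratio = math.exp(b)      # exponentiating the log OR gives the OR

print(f"a = {a:.4f}, b = {b:.4f}, OR = {odds_ratio:.2f}")
```

Because the sketch keeps the odds unrounded, b comes out near 0.5095 rather than the 0.5096688 that results from the rounded odds 2.88 and 1.73; both values exponentiate to about 1.66.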

Table I. Tabulation of alignment success and failure after 6 months of treatment by wire type and calculation of risk and odds of success, and risk and odds ratios of alignment success vs failure

                 Wire type
Alignment        A          B          Total
Yes              a = 23     b = 19     42
No               c = 8      d = 11     19
Total            31         30         61

How many aligned with A?
    Risk = 23/31 = 0.74
    Odds = 23/8 = 2.88
How many aligned with B?
    Risk = 19/30 = 0.63
    Odds = 19/11 = 1.73
Risk ratio
    0.74/0.63 = 1.17
Odds ratio (OR)
    2.88/1.73 = 1.66

We can easily arrive at an estimate of b by substituting the odds from Table I:

\[ \log\left(\frac{p_A/(1-p_A)}{p_B/(1-p_B)}\right) = \log\left(\frac{2.88}{1.73}\right) = 0.5096688 \]

The value 0.5096688 after exponentiation becomes 1.66, which is the OR of reaching alignment in group A vs group B.

The calculation of the 95% confidence interval (CI) for this OR is as follows.2 An approximation for the 95% CI for an odds ratio can be calculated as follows:

lower CI bound = OR/EF
upper CI bound = OR × EF

EF is the error factor, calculated as follows:

\[ EF = \exp\left(1.96 \times \sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}}\right) \]

where a, b, c, and d refer to the number of events and nonevents of alignment in wire groups A and B (Table I).

Using the example above, we can calculate the 95% CI as follows:

\[ EF = \exp\left(1.96 \times \sqrt{\frac{1}{23} + \frac{1}{19} + \frac{1}{8} + \frac{1}{11}}\right) = 2.99 \]

lower value: OR/EF = 1.66/2.99 = 0.56
upper value: OR × EF = 1.66 × 2.99 = 4.96

We will give a logistic regression example using the same data set that produced the tabulations in Table I.

Table II. Logistic regression output for the effect of wire type on alignment success after 6 months of treatment

Predictor        OR         95% CI        P value
Wire type
    B            Reference
    A            1.66       0.56, 4.96    0.36

The application of logistic regression relies on the same assumptions that we use to apply univariable and multivariable linear regression. Similar to multivariable linear regression, logistic regression gives us the flexibility to include more than 1 predictor (compared with chi-square), both categorical and continuous, as well as interaction terms. Additionally, we can obtain the estimates in the form of ORs (or log ORs) and their corresponding 95% CIs, and not just a P value.

Table II gives the output after fitting a logistic regression model to assess the association between wire type and alignment success. The dependent variable is binary: success or failure of alignment, and the independent variable is the wire type with 2 levels/groups/categories: wire A and wire B.

We can see in Table II that we get results similar to those of the chi-square test, but here we also automatically get the effect estimate (OR), the 95% CI, and the P value. The interpretation of the OR is as follows: the odds of reaching alignment are 1.66 times as high, or 66% higher, with wire A compared with wire B. It is incorrect to say probability of success because probability is equivalent to risk, but here we are dealing with odds. Refer also to previous articles to refresh your memory on the differences between risk, odds, risk ratio, and odds ratio.3,4 The 95% CI for the OR ranges from 0.56 to 4.96. Here we are working with ratios; hence, when the 95% CI includes the value of 1, we infer that there is no evidence of a difference between the wires in terms of alignment success. If the odds of alignment were the same in both wires, such as 0.70 and 0.70, then the OR would be equal to 1. When we are working with differences, the value of zero indicates no difference. For logistic regression, the value of zero is the value of no difference if we are working on the natural log scale (log OR = log 1 = 0).

REFERENCES

1. Pandis N. The chi-square test. Am J Orthod Dentofacial Orthop 2016;150:898-9.
2. Kirkwood BR, Sterne JA. Essential medical statistics. 2nd ed. Oxford, United Kingdom: Blackwell; 2003. p. 163-4.
3. Pandis N. The effect size. Am J Orthod Dentofacial Orthop 2012;142:739-40.
4. Pandis N. Risk ratio vs odds ratio. Am J Orthod Dentofacial Orthop 2012;142:890-1.
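The hand calculation of the 95% CI and the regression output in Table II can both be reproduced from the 4 counts in Table I. The Python sketch below is a minimal illustration and not the software used for the article. The function names `error_factor_ci` and `fit_logistic_2x2` are hypothetical, the model is fitted with a plain Newton-Raphson iteration, and the P value is assumed to come from a two-sided Wald test.

```python
import math


def error_factor_ci(a, b, c, d, z=1.96):
    """OR and approximate 95% CI from a 2x2 table using the error factor."""
    odds_ratio = (a / c) / (b / d)
    ef = math.exp(z * math.sqrt(1 / a + 1 / b + 1 / c + 1 / d))
    return odds_ratio, odds_ratio / ef, odds_ratio * ef


def fit_logistic_2x2(a, b, c, d, max_iter=25, tol=1e-10):
    """Fit log(p/(1-p)) = alpha + beta*x by Newton-Raphson on grouped data.

    x = 1 for wire A (a successes, c failures), x = 0 for wire B (b, d).
    Returns (alpha, beta, standard error of beta).
    """
    groups = [(0, b, b + d), (1, a, a + c)]  # (x, successes, group size)
    alpha = beta = 0.0
    for _ in range(max_iter):
        g0 = g1 = 0.0          # score (gradient of the log-likelihood)
        i00 = i01 = i11 = 0.0  # information matrix entries
        for x, y, n in groups:
            p = 1.0 / (1.0 + math.exp(-(alpha + beta * x)))
            g0 += y - n * p
            g1 += x * (y - n * p)
            w = n * p * (1.0 - p)
            i00 += w
            i01 += w * x
            i11 += w * x * x
        det = i00 * i11 - i01 * i01
        step_alpha = (i11 * g0 - i01 * g1) / det
        step_beta = (i00 * g1 - i01 * g0) / det
        alpha += step_alpha
        beta += step_beta
        if abs(step_alpha) + abs(step_beta) < tol:
            break
    # Variance of beta from the inverse information of the final iteration,
    # which is evaluated within tol of the converged estimates.
    se_beta = math.sqrt(i00 / det)
    return alpha, beta, se_beta


# Counts from Table I: a, c = successes/failures with wire A; b, d with wire B.
a, b, c, d = 23, 19, 8, 11

or_hand, lo_hand, hi_hand = error_factor_ci(a, b, c, d)

alpha, beta, se_beta = fit_logistic_2x2(a, b, c, d)
or_fit = math.exp(beta)
lo_fit = math.exp(beta - 1.96 * se_beta)
hi_fit = math.exp(beta + 1.96 * se_beta)
p_value = math.erfc(abs(beta / se_beta) / math.sqrt(2))  # two-sided Wald test

print(f"error factor: OR = {or_hand:.2f}, 95% CI {lo_hand:.2f} to {hi_hand:.2f}")
print(f"logistic fit: OR = {or_fit:.2f}, 95% CI {lo_fit:.2f} to {hi_fit:.2f}, "
      f"P = {p_value:.2f}")
```

For a single binary predictor the two routes agree, because the standard error of the fitted log OR equals the square root of 1/a + 1/b + 1/c + 1/d. With unrounded intermediate values the upper bound comes out near 4.97; the 4.96 in the text most likely reflects rounding the OR and EF to 1.66 and 2.99 before multiplying.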

American Journal of Orthodontics and Dentofacial Orthopedics, April 2017, Vol 151, Issue 4
