Statistics and Research Design

Logistic regression: Part 1

Nikolaos Pandis, Associate Editor of Statistics and Research Design
Bern, Switzerland, and Corfu, Greece
Department of Orthodontics and Dentofacial Orthopedics, School of Dental Medicine/Medical Faculty, University of Bern, Bern, Switzerland; private practice, Corfu, Greece.
Am J Orthod Dentofacial Orthop 2017;151:824-5
0889-5406/$36.00
© 2017 by the American Association of Orthodontists. All rights reserved.
http://dx.doi.org/10.1016/j.ajodo.2017.01.017

In the article discussing the chi-square test,¹ I used a clinical trial scenario with the objective of assessing the clinical alignment efficiency of 2 types of wires. These wires (A and B) were used for 6 months in 2 patient groups, and the outcome recorded was binary: reaching complete alignment (success) or not reaching complete alignment (failure).

Table I shows the tabulation of alignment successes and failures for each wire after 6 months of treatment and the calculation of risks and odds of success, and risk and odds ratios of alignment success vs failure. The chi-square test showed no evidence of a difference in the success of alignment after 6 months between the 2 wire groups; the P value was 0.36.

The same result can be obtained with a special type of regression analysis called logistic regression, which is used when the outcome is binary (alignment: yes/no). Remember that linear regression is used when the outcome is continuous (eg, millimeters of crowding alleviation). In logistic regression, we can get effect estimates, P values, and confidence intervals directly from the regression output. The effect estimates are handled as log odds ratios (log OR, where log is the natural logarithm), because they have appropriate mathematical properties (they can range from $-\infty$ to $+\infty$). We can convert the log ORs to odds ratios (ORs), which are more interpretable, by exponentiating them (OR = exp[log OR]). Logistic regression has a form similar to that of the linear regression model in the sense that the components ($y = a + bx$) are linearly related on the logarithmic scale (when using log ORs). However, in logistic regression, the response or dependent variable $y$ is the log odds $\log[p/(1-p)]$, which is called the logit:

$$\log\left(\frac{p}{1-p}\right) = a + b\,x \qquad \text{(equation 1)}$$

where $a$ is the intercept (constant), $b$ is the regression coefficient of $x$, and $x$ is the categorical predictor, with 2 levels in our example (wire A or wire B).

Specifically, in the above equation that pertains to the logistic regression model, $a$ is the log odds of reaching alignment in patients in the control group, which we assume here is the group with wire B (reference), and $b$ is the log OR of reaching alignment in patients fitted with wire A vs patients fitted with wire B.

In a bit more detail, we have groups A and B, and the risk (proportion) of the event for A is $p_A$, whereas the risk of the event for B is $p_B$. The odds of the event would be $p_A/(1-p_A)$ for the wire A group and $p_B/(1-p_B)$ for the wire B group, and their natural logarithms would be $\log[p_A/(1-p_A)] = \mathrm{logit}(p_A)$ and $\log[p_B/(1-p_B)] = \mathrm{logit}(p_B)$, respectively. Then the OR of the event in group A compared with group B would be

$$\frac{p_A/(1-p_A)}{p_B/(1-p_B)} \qquad \text{(equation 2)}$$

and the logarithm of the OR would be

$$\log\left[\frac{p_A/(1-p_A)}{p_B/(1-p_B)}\right] = \log\left(\frac{p_A}{1-p_A}\right) - \log\left(\frac{p_B}{1-p_B}\right) = \mathrm{logit}(p_A) - \mathrm{logit}(p_B) \qquad \text{(equation 3)}$$

If we use the values 0 and 1 for wires B and A, respectively, and after appropriate substitutions in equation 1 and using equations 2 and 3, we arrive at the following.

For wire B ($x = 0$):

$$\log\left(\frac{p}{1-p}\right) = a + b\,x = a + b \cdot 0 = a, \quad \text{or} \quad a = \log\left(\frac{p_B}{1-p_B}\right)$$

and for wire A ($x = 1$):

$$\log\left(\frac{p}{1-p}\right) = a + b\,x = a + b \cdot 1 = a + b$$

and

$$b = \log\left(\frac{p}{1-p}\right) - a = \log\left(\frac{p_A}{1-p_A}\right) - \log\left(\frac{p_B}{1-p_B}\right) = \log\left[\frac{p_A/(1-p_A)}{p_B/(1-p_B)}\right]$$

Remember that $b$ is the log OR, our estimate, of reaching alignment in patients fitted with wire A vs patients fitted with wire B; after exponentiation, this gives the OR.
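As a minimal sketch of equations 1 to 3, the Python snippet below takes the alignment counts reported in Table I (23 of 31 patients aligned with wire A, 19 of 30 with wire B) and checks that the difference between the two logits equals the log OR. The helper name `logit` and the variable names are illustrative assumptions and do not come from the article.

```python
import math

def logit(p):
    """Natural log of the odds p / (1 - p)."""
    return math.log(p / (1 - p))

# Risks of alignment, from the counts in Table I
p_A = 23 / 31  # wire A (coded x = 1)
p_B = 19 / 30  # wire B, the reference group (coded x = 0)

# Equation 1 with a single binary predictor:
a = logit(p_B)               # intercept: log odds in the reference group
b = logit(p_A) - logit(p_B)  # coefficient: log OR of wire A vs wire B (equation 3)

odds_ratio = math.exp(b)     # exponentiate the log OR to get the OR

# Equation 2 computed directly from the two odds, for comparison
odds_ratio_direct = (p_A / (1 - p_A)) / (p_B / (1 - p_B))

print(f"a = {a:.3f}, b = {b:.3f}, OR = {odds_ratio:.2f}")
print(f"OR from equation 2 = {odds_ratio_direct:.2f}")
```

Both routes should give an OR of about 1.66. The log OR computed from unrounded counts is about 0.5095, slightly different from a value computed from odds rounded to 2 decimals.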
Table I. Tabulation of alignment success and failure after 6 months of treatment by wire type and calculation of risk and odds of success, and risk and odds ratios of alignment success vs failure

| Alignment | Wire A | Wire B | Total |
|-----------|--------|--------|-------|
| Yes       | a = 23 | b = 19 | 42    |
| No        | c = 8  | d = 11 | 19    |
| Total     | 31     | 30     | 61    |

How many aligned with A?
Risk = 23/31 = 0.74
Odds = 23/8 = 2.88
How many aligned with B?
Risk = 19/30 = 0.63
Odds = 19/11 = 1.73
Risk ratio
0.74/0.63 = 1.17
Odds ratio (OR)
2.88/1.73 = 1.66

We can easily arrive at an estimate if we substitute the values from Table I:

$$\log\left[\frac{p_A/(1-p_A)}{p_B/(1-p_B)}\right] = \log\left(\frac{2.88}{1.73}\right) = 0.5096688$$

The value 0.5096688 after exponentiation becomes 1.66, which is the OR of reaching alignment in group A vs group B.

The calculation of the 95% confidence interval (CI) for this OR is as follows.² An approximation for the 95% CI for an odds ratio can be calculated as follows:

lower CI bound = OR/EF
upper CI bound = OR × EF

EF is the error factor, calculated as follows:

$$\mathrm{EF} = \exp\left(1.96\sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}}\right)$$

where a, b, c, and d refer to the number of events and nonevents of alignment in wire groups A and B (Table I).

Using the example above, we can calculate the 95% CI as follows:

$$\mathrm{EF} = \exp\left(1.96\sqrt{\frac{1}{23} + \frac{1}{19} + \frac{1}{8} + \frac{1}{11}}\right) = 2.99$$

lower value: OR/EF = 1.66/2.99 = 0.56
upper value: OR × EF = 1.66 × 2.99 = 4.96

We will give a logistic regression example using the same data set that produced the tabulations in Table I.

The application of logistic regression relies on the same assumptions that we use to apply univariable and multivariable linear regression. Similar to multivariable linear regression, logistic regression gives us the flexibility to include more than 1 predictor (compared with the chi-square test), both categorical and continuous, as well as interaction terms. Additionally, we can obtain the estimates in the form of ORs (or log ORs) and their corresponding 95% CIs, and not just a P value.

Table II. Logistic regression output for the effect of wire type on alignment success after 6 months of treatment

| Predictor (wire type) | OR        | 95% CI     | P value |
|-----------------------|-----------|------------|---------|
| B                     | Reference |            |         |
| A                     | 1.66      | 0.56, 4.96 | 0.36    |

Table II gives the output after fitting a logistic regression model to assess the association between wire type and alignment success. The dependent variable is binary (success or failure of alignment), and the independent variable is the wire type with 2 levels/groups/categories: wire A and wire B.

We can see in Table II that we get results similar to those of the chi-square test, but here we also automatically get the effect estimate (OR), the 95% CI, and the P value. The interpretation of the OR is as follows: the odds of reaching alignment are 1.66 times as high, or 66% higher, with wire A compared with wire B. It is incorrect to say probability of success because probability is equivalent to risk, but here we are dealing with odds. Refer also to previous articles to refresh your memory on the differences between risk, odds, risk ratio, and odds ratio.³,⁴ The 95% CI for the OR ranges from 0.56 to 4.96. Here we are working with ratios; hence, when the 95% CI includes the value of 1, we infer that there is no evidence of a difference between the wires in terms of alignment success. If the odds of alignment were the same with both wires, such as 0.70 and 0.70, then the OR would be equal to 1. When we are working with differences, the value of zero indicates no difference. For logistic regression, the value of zero is the value of no difference if we are working on the natural log scale.

REFERENCES

1. Pandis N. The chi-square test. Am J Orthod Dentofacial Orthop 2016;150:898-9.
2. Kirkwood BR, Sterne JA. Essential medical statistics. 2nd ed. Oxford, United Kingdom: Blackwell; 2003. p. 163-4.
3. Pandis N. The effect size. Am J Orthod Dentofacial Orthop 2012;142:739-40.
4. Pandis N. Risk ratio vs odds ratio. Am J Orthod Dentofacial Orthop 2012;142:890-1.
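The error-factor calculation and the entries of Table II can be checked with a short Python sketch that uses only the four cell counts from Table I. For a model with a single binary predictor, the fitted log OR equals the log OR of the 2 × 2 table, so no iterative fitting is needed here. The P value is computed as a two-sided Wald test; this is an assumption, since the article does not state which test produced the P value in Table II. All variable names are illustrative.

```python
import math

# Cell counts from Table I
a, b = 23, 19  # aligned: wire A, wire B
c, d = 8, 11   # not aligned: wire A, wire B

odds_ratio = (a / c) / (b / d)
log_or = math.log(odds_ratio)

# Standard error of the log OR (the square-root term in the EF formula)
se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

# Error factor and approximate 95% CI for the OR
ef = math.exp(1.96 * se_log_or)
ci_lower = odds_ratio / ef
ci_upper = odds_ratio * ef

# Assumed Wald test of log OR = 0; two-sided P value from the standard normal
z = log_or / se_log_or
p_value = math.erfc(abs(z) / math.sqrt(2))

print(f"OR = {odds_ratio:.2f}, EF = {ef:.2f}")
print(f"95% CI = {ci_lower:.2f} to {ci_upper:.2f}, P = {p_value:.2f}")
```

With unrounded inputs the upper bound comes out near 4.97 rather than 4.96; the hand calculation multiplies the already rounded values 1.66 and 2.99, which accounts for the small gap.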
American Journal of Orthodontics and Dentofacial Orthopedics April 2017 Vol 151 Issue 4
