The Encyclopedia of Clinical Psychology - 2014 - Kraemer
The Encyclopedia of Clinical Psychology, First Edition. Edited by Robin L. Cautin and Scott O. Lilienfeld.
© 2015 John Wiley & Sons, Inc. Published 2015 by John Wiley & Sons, Inc.
DOI: 10.1002/9781118625392.wbecp048
10.1002/9781118625392.wbecp048, Downloaded from https://onlinelibrary.wiley.com/doi/10.1002/9781118625392.wbecp048 by Cochrane Philippines, Wiley Online Library on [13/08/2023]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License
2 EFFECT SIZE
two-sample t test is valid, so is Cohen’s d as an effect size.

There are variations on this theme. Instead of dividing by the estimate of the common standard deviation, one might divide by the standard deviation of one of the groups (e.g., the control group in a randomized clinical trial) or by the square root of the average variance. The magnitude of Cohen’s d can be visualized as the overlap between two standard normal distributions but it is a challenge to visualize what such variations in Cohen’s d represent.

The Mean Difference

Cohen’s d is the standardized mean difference between two groups. Why standardize, particularly if the scale used is a familiar one? The problem is that if the two group means were to differ by 10 units, and the within-group standard deviation were 1, there would be no overlap between the distributions, and no doubt of the clinical significance of such a finding; however, if the within-group standard deviation were 100, the overlap between the distributions would be almost complete, and the clinical significance very doubtful. Thus ignoring the standard deviation leaves the question of clinical/practical significance completely in doubt.

Area under the ROC Curve (AUC)

There are many situations in which the variable Y does not have a normal distribution in one of the groups, or where the variances are unequal. Cohen’s d is quite robust to minor deviations from its underlying assumptions, but using it with three-, four- or five-point scales, or with continua with outliers, or unequal variances, can be quite misleading. The AUC, representing the area under the receiver operating characteristic curve (ROC curve), is one alternative. For present purposes:

AUC = probability (G1 > G2) + 0.5 probability (G1 = G2),

where, for example, “G1 > G2” means that if one compared a randomly selected subject in the first group (G1) with a randomly selected subject in the second group (G2), the subject in the first group has a bigger (or better) response on Y and “G1 = G2” indicates equality of responses. If the normal distribution underlying Cohen’s d holds, then AUC = Φ(d/√2), where Φ() is the cumulative standard normal distribution function and d is either Cohen’s d (using the pooled variance in the denominator) or d using the average of the two groups’ variances in the denominator.

The AUC is invariant under all monotonic transformations and can be used with any Y (including binary, three-, four- and five-point scales, and continua with highly skewed or long-tailed distributions). Perhaps its only weakness is that its null value is 0.5, rather than 0, but that is easily corrected by using instead SRD (success rate difference) = 2AUC − 1. The SRD shares all the qualities of AUC, but now the null value is zero, and the two extremes where the two populations do not overlap at all are +1 and −1.

Number Needed to Treat/Take (NNT)

Clinicians, patients and policy makers often have problems interpreting probability points (as in AUC or SRD). The NNT instead reports on number of subjects, a scale sometimes easier to interpret. Suppose one defines a subject as a “success” if that subject had a response bigger (better) than a randomly chosen subject in the other group. How many subjects would one need to sample from G1 to have one more “success” than if one sampled the same number from G2? Answer: NNT, which is equal to 1/SRD. The NNT is most familiar when Y is success/failure, in which case SRD = p1 − p2, the difference in the probabilities of success in the two groups (hence “success rate difference”). However, ease of interpretation here translates to confusing mathematical properties. For example, NNT is undefined when SRD = 0 (i.e., when there is no difference between G1 and G2). Generally, therefore, for research purposes, SRD is preferred to NNT but one can always translate a SRD to NNT for clear communication purposes.
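
The relationships among Cohen’s d, AUC, SRD, and NNT described in this entry can be illustrated with a short sketch. It is not taken from the entry itself: the function names and the two small samples are hypothetical, and the pooled variance is assumed to use the usual n − 1 denominators. The pairwise function applies the definition AUC = probability (G1 > G2) + 0.5 probability (G1 = G2) directly to every cross-group pair of observed responses.

```python
import math
from statistics import NormalDist, mean, variance


def cohens_d(g1, g2):
    """Standardized mean difference using the pooled within-group variance.

    Assumption: sample variances with n - 1 denominators.
    """
    n1, n2 = len(g1), len(g2)
    pooled_var = ((n1 - 1) * variance(g1) + (n2 - 1) * variance(g2)) / (n1 + n2 - 2)
    return (mean(g1) - mean(g2)) / math.sqrt(pooled_var)


def auc_from_d(d):
    """AUC implied by d when the normal model behind Cohen's d holds: Phi(d / sqrt(2))."""
    return NormalDist().cdf(d / math.sqrt(2))


def auc_pairwise(g1, g2):
    """AUC = P(G1 > G2) + 0.5 * P(G1 = G2), estimated over all cross-group pairs."""
    wins = sum(1 for a in g1 for b in g2 if a > b)
    ties = sum(1 for a in g1 for b in g2 if a == b)
    return (wins + 0.5 * ties) / (len(g1) * len(g2))


def srd_from_auc(auc):
    """Success rate difference: rescales AUC so that the null value is 0."""
    return 2 * auc - 1


def nnt_from_srd(srd):
    """NNT = 1 / SRD; returns None when SRD is 0, where NNT is undefined."""
    if srd == 0:
        return None
    return 1 / srd


# Hypothetical responses on a five-point scale (illustrative data only).
g1 = [3, 4, 4, 5, 5]
g2 = [1, 2, 3, 3, 4]

auc = auc_pairwise(g1, g2)
srd = srd_from_auc(auc)
print(f"d = {cohens_d(g1, g2):.2f}")
print(f"AUC (pairwise) = {auc:.2f}, AUC (normal model) = {auc_from_d(cohens_d(g1, g2)):.2f}")
print(f"SRD = {srd:.2f}, NNT = {nnt_from_srd(srd):.2f}")
```

For these made-up data the pairwise AUC is 0.88, so SRD = 0.76 and NNT is about 1.3. With a binary Y the same pairwise calculation reduces to SRD = p1 − p2: for example, success rates of 0.6 and 0.4 give AUC = 0.6, SRD = 0.2, and NNT = 5.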