Statistical guide as to the steps in proving causality.

10.3.2008

How to Prove Causality

This guide outlines the steps for establishing the causality of two variables. A person may assert that the height of a person determines how fast they run. This is a perfectly acceptable assertion to make; however, it has to be affirmed by statistical analysis.

There are four criteria that have to be met in order to prove causality:

1. Association

2. Prediction

3. Excluding Alternative Hypotheses

4. Dose Dependence

• Association

When comparing two variables, the standard way of doing so is regression (http://en.wikipedia.org/wiki/Regression_analysis).

Regression is the first step in proving causality. It represents an association between two

items on a linear scale. A regression analysis is depicted visually below.

If the regression analysis shows an association between the two variables that holds within an acceptable statistical variance, then one can fairly claim that there is a relationship between the two items (a correlation).
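As a minimal sketch of this step, the Python below fits a least-squares line to paired observations and reports the correlation coefficient r. The function name `linear_regression` and the height and speed figures are assumptions invented for illustration; they are not measurements from any study.

```python
import math

def linear_regression(x, y):
    """Return (slope, intercept, r) for the least-squares line through (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)  # Pearson correlation coefficient
    return slope, intercept, r

# Hypothetical data, invented for illustration only
heights_cm = [150, 160, 165, 170, 175, 180, 185, 190]
speeds_kmh = [14.0, 15.5, 15.0, 16.5, 16.0, 17.5, 17.0, 18.5]

slope, intercept, r = linear_regression(heights_cm, speeds_kmh)
print(f"slope={slope:.3f} intercept={intercept:.3f} r={r:.3f}")
```

A positive slope with r close to 1 would indicate a strong linear association in the sample. On its own it says nothing yet about causation.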

In the behavioral sciences, the following chart is used to determine whether a sample is large enough to demonstrate a statistical relationship. It was developed by Cohen & Cohen in “Applied Multiple Regression…”, 2nd edition, 1983, pg. 530.

Desired Population r [Effect Size Expressed as r]

power    .10    .20    .30    .40    .50    .60    .70    .80    .90
.25      166     42     20     12      8      6      5      4      3
.50      384     95     42     24     15     10      7      6      4
.60      489    121     53     29     18     12      9      6      5
2/3      570    141     62     34     21     14     10      7      5
.75      692    171     74     41     25     17     11      8      6
.80      783    193     84     46     28     18     12      9      6
.85      895    221     96     52     32     21     14     10      6
.95     1308    322    139     75     46     30     19     13      8
.99     1828    449    194    104     63     40     27     18     11

An effect size of .30 (column) at a desired power of .80 (row) is an acceptable statistical measure. We look to .30 because it is a generally accepted value for medium-size effects. We look to .80 because of statistical power: it represents the likelihood that we will actually see a statistically significant result if a significant relationship does indeed exist. The minimum number of samples necessary to achieve this statistical significance is 84.
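The chart lookup can be cross-checked with a common approximation based on the Fisher z transformation of r. The sketch below assumes a two-tailed significance level of .05, which the text does not state, and the function name `sample_size_for_r` is hypothetical. It is not necessarily the method used to build the chart, so the result should come out close to the chart value rather than identical to it.

```python
import math
from statistics import NormalDist

def sample_size_for_r(r, power=0.80, alpha=0.05):
    """Approximate sample size needed to detect a population correlation r.

    Uses the Fisher z approximation: n = ((z_alpha + z_beta) / atanh(r))^2 + 3.
    alpha is a two-tailed significance level (an assumption, not from the chart).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    fisher_z = math.atanh(r)                       # Fisher transformation of r
    return math.ceil(((z_alpha + z_beta) / fisher_z) ** 2 + 3)

# Medium effect size (.30) at the desired power of .80
n = sample_size_for_r(0.30, power=0.80)
print(n)
```

Larger effect sizes need fewer samples and higher power needs more, which matches the pattern across the chart's columns and rows.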

• Prediction

The second step to proving causality is prediction. Prediction entails making a logical assumption as to how events will transpire and then testing it. For example, the assumption could be made that tall people will run faster on the basis that they have longer legs, a higher metabolic rate, stronger muscles, etc. Being able to reliably predict something is crucial to causality.
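One way to test a prediction, sketched below, is to fit a line on one sample and check how well it predicts a second sample that was not used for fitting. All names and numbers here are hypothetical and invented for illustration. The comparison against a baseline that always predicts the average speed is one possible design choice, not a prescribed procedure.

```python
def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line through (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mean_abs_error(predicted, actual):
    """Average absolute difference between predictions and observations."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

# Hypothetical data: one sample to fit the line, another to test the prediction
train_heights = [150, 160, 170, 180, 190]
train_speeds = [14.0, 15.2, 15.9, 17.1, 18.0]
test_heights = [155, 175, 185]
test_speeds = [14.4, 16.7, 17.3]

slope, intercept = fit_line(train_heights, train_speeds)
predicted = [slope * h + intercept for h in test_heights]

# Baseline: ignore height and always predict the average training speed
baseline = [sum(train_speeds) / len(train_speeds)] * len(test_speeds)

model_error = mean_abs_error(predicted, test_speeds)
baseline_error = mean_abs_error(baseline, test_speeds)
print(f"model error={model_error:.3f} baseline error={baseline_error:.3f}")
```

If the height-based prediction does not beat the baseline on new data, the assumption that height predicts speed is not supported.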

• Excluding Alternative Hypotheses

The third step to proving causality is excluding alternative hypotheses, which means ruling out confounding variables (http://en.wikipedia.org/wiki/Confounding_variable). For example, assume that a child's weight and a country's gross domestic product rise with time. A person carrying out an experiment could measure weight and GDP, and conclude that a higher GDP causes children to gain weight, or that children's weight gain boosts the GDP. However, the confounding variable, time, was not accounted for, and is the real cause of both rises (Wikipedia).

Each relevant alternative hypothesis has to be accounted for and negated in order for your proposition or assertion to explain the phenomenon being described. Your assertion needs to hold up after every other variable has been ruled out in order for it to be considered causal.
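The weight and GDP example can be sketched with a partial correlation, which measures the association between two variables after removing what a third variable explains in each. The data below are hypothetical and deliberately constructed so that time drives both series; the function names are assumptions, and partial correlation is only one of several ways to control for a confounder.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_r(x, y, z):
    """Correlation between x and y after controlling for z."""
    r_xy = pearson_r(x, y)
    r_xz = pearson_r(x, z)
    r_yz = pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical data: both series rise with time plus unrelated fluctuations
years = [0, 1, 2, 3, 4, 5, 6, 7]
weight_noise = [1, -1, -1, 1, 1, -1, -1, 1]
gdp_noise = [1, 1, -1, -1, -1, -1, 1, 1]
weight = [20 + 2 * t + e for t, e in zip(years, weight_noise)]
gdp = [100 + 10 * t + 3 * e for t, e in zip(years, gdp_noise)]

raw = pearson_r(weight, gdp)
controlled = partial_r(weight, gdp, years)
print(f"raw r={raw:.3f} r controlling for time={controlled:.3f}")
```

In this constructed case the raw correlation is high, but it collapses once time is controlled for, which is the pattern a confounding variable produces.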

• Dose Dependence

Dose dependence is the final and most critical element in proving causality. The term refers to the effects of treatment with a drug: “If the effects change when the dose of the drug is changed, the effects are said to be dose-dependent” (NCI).

Above is the dosage plan for a child when taking Tylenol. Tylenol has undergone several studies, and its effect has been shown to depend on the weight of the person using it. For a child who weighs 95 lbs to feel the same effect as a child who weighs 35 lbs, the two children have to take 3 teaspoons and 1 teaspoon, respectively.

Dose dependence indicates a direct, measurable causation of the impact of one stimulus

on another. If this is proven, along with the aforementioned criteria, you can state that a

particular variable has a causal relationship with another variable.
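A crude check for dose dependence, sketched below, is to group observations by dose and see whether the average response rises at every step up in dose. The function names and the dose and response values are hypothetical, in arbitrary units, and are not drawn from any drug study. This is a rough screen under those assumptions, not a significance test.

```python
from collections import defaultdict

def mean_response_by_dose(observations):
    """Map each dose level to its mean response, ordered by increasing dose."""
    groups = defaultdict(list)
    for dose, response in observations:
        groups[dose].append(response)
    return {dose: sum(vals) / len(vals) for dose, vals in sorted(groups.items())}

def is_dose_dependent(observations):
    """True if the mean response strictly increases with each higher dose."""
    means = list(mean_response_by_dose(observations).values())
    return all(low < high for low, high in zip(means, means[1:]))

# Hypothetical (dose, response) pairs in arbitrary units
observations = [
    (0, 1.0), (0, 1.2),
    (1, 2.1), (1, 1.9),
    (2, 3.2), (2, 2.8),
    (3, 3.9), (3, 4.1),
]

print(mean_response_by_dose(observations))
print(is_dose_dependent(observations))
```

A response that stays flat or moves erratically as the dose changes would fail this check and weaken the causal claim.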

