Causal inference lets us estimate the effects of treatments, policies or interventions.
SIMPSON'S PARADOX
Let's take a hypothetical situation.
We have data on which treatment each patient received, the condition of each patient, and
what happened after the treatment was given.
Here all the variables are binary, though this can be extended to continuous variables at a
later stage.
Our aim is to reduce the number of deaths in the country. So which treatment will lower the
number of deaths?
------
By just looking at the overall figures in this picture, we would say that treatment A is
performing better than B, because the percentage of people dying is lower with A (16%) than
with B (19%).
----
However, if we split the data by condition, treatment B shows a lower mortality rate within
each subgroup, both mild and severe. This reversal is known as Simpson's Paradox.
---
The 16% overall mortality for treatment A is weighted mostly by the mild group (1400 of its
1500 patients), whereas the 19% for treatment B is weighted mostly by the severe group (500
of its 550 patients).
Simpson's Paradox comes from this unequal weighting. People with severe conditions are more
likely to die than those with mild conditions whichever treatment they receive, so the heavy
weight of severe cases in treatment B pushes its overall mortality rate above that of A.
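The weighting can be checked with a few lines of arithmetic. The sketch below is illustrative only: the group sizes (1400 of 1500 and 500 of 550) and the overall rates (16% and 19%) come from the text, while the per-group death counts are assumed values chosen to be consistent with those figures, since the original table is not reproduced here. The function names are hypothetical.

```python
# (deaths, patients) per treatment and condition.
# Group sizes follow the text; death counts are ASSUMED for illustration.
counts = {
    "A": {"mild": (210, 1400), "severe": (30, 100)},
    "B": {"mild": (5, 50), "severe": (100, 500)},
}


def subgroup_rate(treatment, condition):
    """Mortality rate inside one treatment/condition subgroup."""
    deaths, patients = counts[treatment][condition]
    return deaths / patients


def overall_rate(treatment):
    """Overall mortality as a weighted average of the subgroup rates."""
    groups = counts[treatment]
    total_patients = sum(patients for _, patients in groups.values())
    rate = 0.0
    for condition, (_, patients) in groups.items():
        weight = patients / total_patients  # share of this treatment's patients
        rate += weight * subgroup_rate(treatment, condition)
    return rate


for treatment in ("A", "B"):
    by_condition = {c: round(subgroup_rate(treatment, c), 3) for c in counts[treatment]}
    print(treatment, "overall:", round(overall_rate(treatment), 3), "by condition:", by_condition)
```

With these assumed counts, B has the lower rate in each subgroup while A has the lower overall rate, because A's average is dominated by mild cases and B's by severe ones.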
So the question still remains, which treatment is more effective? The answer lies in the
causal structure. There are 2 scenarios:
● Condition is a cause of treatment
● Treatment is a cause of condition
---
From the diagram (1st scenario) we can see that Condition is a cause of both Treatment and
Outcome.
Moreover, Treatment is a cause of Outcome.
This happens when the doctor keeps the scarce treatment B for severe cases, so the patients
given B are sicker to begin with. Condition then confounds the overall comparison, and we
should compare the treatments within each condition. Treatment B is the better choice here.
In the 2nd scenario, Treatment is a cause of both Condition and Outcome, and Condition is
also a cause of Outcome. Since B is scarce, people with mild conditions can become severe
while waiting for treatment B. The worsening is then part of the effect of choosing B, so the
overall mortality rates are the right comparison, and A is the better choice here.
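A rough sketch of how the two scenarios lead to different calculations. For the first scenario it uses one standard approach that the text does not spell out: adjusting for the confounder by averaging the subgroup rates over the condition mix of the whole population. For the second scenario it uses the unadjusted overall rates. The death counts are assumed values consistent with the group sizes and percentages quoted earlier, and the function names are hypothetical.

```python
# (deaths, patients) per treatment and condition; death counts are ASSUMED.
counts = {
    "A": {"mild": (210, 1400), "severe": (30, 100)},
    "B": {"mild": (5, 50), "severe": (100, 500)},
}
conditions = ("mild", "severe")

# Share of each condition in the whole population (both treatments pooled).
population = sum(patients for groups in counts.values() for _, patients in groups.values())
p_condition = {
    c: sum(counts[t][c][1] for t in counts) / population for c in conditions
}


def adjusted_rate(treatment):
    """Scenario 1: weight subgroup rates by the population-wide condition mix."""
    return sum(
        p_condition[c] * counts[treatment][c][0] / counts[treatment][c][1]
        for c in conditions
    )


def unadjusted_rate(treatment):
    """Scenario 2: total deaths over total patients for the treatment."""
    deaths = sum(d for d, _ in counts[treatment].values())
    patients = sum(n for _, n in counts[treatment].values())
    return deaths / patients


# Pick the treatment with the lower mortality under each causal structure.
scenario_1_choice = min(counts, key=adjusted_rate)
scenario_2_choice = min(counts, key=unadjusted_rate)
print("Scenario 1:", scenario_1_choice, "| Scenario 2:", scenario_2_choice)
```

The same table gives opposite recommendations depending on which causal structure is assumed, which is why the data alone cannot settle the question.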
CORRELATION DOESN'T IMPLY CAUSATION
A confounding association runs between sleeping with shoes on and waking up with a headache,
where drinking is the confounder: it makes both more likely.
This is different from a causal association, in which sleeping with shoes on would itself
cause the headache after we wake up. That would be a direct relationship.
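A small simulation can make the distinction concrete. In this sketch, drinking raises the chance of both sleeping with shoes on and waking with a headache, while shoes have no effect on headaches at all. Every probability here is invented for illustration, and the function names are hypothetical.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible


def simulate_person():
    """Return (drank, shoes_on, headache) for one simulated person."""
    drank = random.random() < 0.3
    # ASSUMED probabilities: drinking drives both events; shoes never affect headache.
    shoes_on = random.random() < (0.8 if drank else 0.1)
    headache = random.random() < (0.9 if drank else 0.1)
    return drank, shoes_on, headache


people = [simulate_person() for _ in range(100_000)]


def headache_rate(rows):
    """Fraction of the given people who woke up with a headache."""
    rows = list(rows)
    return sum(headache for _, _, headache in rows) / len(rows)


# Comparison over everyone: shoe sleepers vs. the rest.
with_shoes = headache_rate(p for p in people if p[1])
without_shoes = headache_rate(p for p in people if not p[1])

# Same comparison restricted to people who did not drink.
sober_with = headache_rate(p for p in people if not p[0] and p[1])
sober_without = headache_rate(p for p in people if not p[0] and not p[1])

print("everyone:     ", round(with_shoes, 3), "vs", round(without_shoes, 3))
print("non-drinkers: ", round(sober_with, 3), "vs", round(sober_without, 3))
```

Across everyone, shoe sleepers have far more headaches, yet the gap nearly disappears once drinking is held fixed. The association is real, but it flows through the confounder.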
So how do we actually know whether one thing causes another? For example, how do we know if
taking a pill is causing a headache to go away?
A fundamental problem of causal inference is: if we take the pill, we cannot observe the
outcome of not taking the pill.
Or, if we don't take the pill, we cannot observe the outcome of taking the pill. So we cannot
compute the causal effect because we have access to only one of the terms, and not both.
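The two terms can be written in potential-outcomes notation. The symbols below are an assumption, since the text does not fix a notation: $T_i$ is 1 if person $i$ takes the pill and 0 otherwise, $Y_i(1)$ is the outcome if they take it, and $Y_i(0)$ is the outcome if they do not.

```latex
\tau_i = Y_i(1) - Y_i(0),
\qquad
Y_i^{\mathrm{obs}} = T_i \, Y_i(1) + (1 - T_i) \, Y_i(0)
```

The individual causal effect $\tau_i$ needs both potential outcomes, but the observed outcome $Y_i^{\mathrm{obs}}$ reveals only the one matching the treatment actually received.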
Randomisation is a statistical procedure in which participants are allocated to two groups by
chance, for example a treatment group and a control group.
Randomisation eliminates selection bias. (With selection bias, the sample is not
representative of the population because individuals did not all have the same chance of
being selected into each group.)
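The following sketch illustrates the point with the pill example. Each simulated person has both potential outcomes, but only one is observed depending on the group they land in. The severity variable, the recovery probabilities and the assignment rules are all invented for illustration, and the function names are hypothetical.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible


def make_person():
    """Return (severe, y0, y1); y0/y1 say whether the headache goes away without/with the pill."""
    severe = random.random() < 0.5
    base = 0.2 if severe else 0.6        # ASSUMED recovery chance without the pill
    y0 = random.random() < base
    y1 = random.random() < base + 0.3    # ASSUMED: the pill adds 0.3 to that chance
    return severe, y0, y1


people = [make_person() for _ in range(100_000)]

# Known only because this is a simulation; never observable in real data.
true_effect = sum(y1 - y0 for _, y0, y1 in people) / len(people)


def difference_in_means(assign):
    """Estimate the effect from observed outcomes under an assignment rule."""
    treated, control = [], []
    for severe, y0, y1 in people:
        if assign(severe):
            treated.append(y1)   # only the with-pill outcome is observed
        else:
            control.append(y0)   # only the without-pill outcome is observed
    return sum(treated) / len(treated) - sum(control) / len(control)


# Selection bias: people with severe headaches mostly choose the pill.
self_selected = difference_in_means(
    lambda severe: random.random() < (0.9 if severe else 0.1)
)
# Randomisation: a fair coin decides, regardless of severity.
randomised = difference_in_means(lambda severe: random.random() < 0.5)

print("true effect:  ", round(true_effect, 3))
print("self-selected:", round(self_selected, 3))
print("randomised:   ", round(randomised, 3))
```

Under self-selection the two groups differ in severity, so the comparison is badly off. Under random assignment the groups are comparable, and the simple difference in recovery rates lands close to the true average effect.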