Statistics for Non-Statisticians
THE BASIC IDEA
Statistics are used in clinical trials to make inferences about new treatments based on the evidence from the patients in the trial
E.g. a new drug for the treatment of lung cancer: does it work or not?
Ideally, design a trial that includes all patients with lung cancer
Not really practical!!
Can only test the new treatment on a representative sample of the population
Statistics allow us to draw conclusions about the likely effect on the population using data from the sample
USING STATISTICS
But what exactly do we want the statistics to do?
Assess the weight of evidence that a treatment works (or doesn't)
Give an estimate (and likely range) of the treatment effect
Test to see how likely it is that this effect would have been seen by chance
BUT
Statistics can never PROVE anything beyond any doubt, just beyond reasonable doubt!!
STATISTICAL DATA ANALYSIS METHODS
WHO TO INCLUDE?
In any clinical trial, one is likely to find:

Ineligible patients included by mistake
Protocol violators: those who don't adhere to the treatment regimen allocated
Patients who withdraw or get lost to follow-up
To avoid bias, keep these to a minimum
Follow up all patients randomised into a trial
Should we include them in the analysis?
INTENTION TO TREAT ANALYSIS
As a general rule, all patients randomised should be analysed by treatment allocated (regardless of whether they actually received this treatment): this is INTENTION TO TREAT (ITT) ANALYSIS
Reasons for ITT:
Avoids, or certainly minimises, risk of bias
Is more pragmatic: reflects real life
HYPOTHESIS TESTING
We want to compare the outcomes in different treatment arms (A and B)
Testing two hypotheses

H0: A = B (null hypothesis: no difference)
H1: A ≠ B

Calculate a test statistic based on the assumption that H0 is true (i.e. there is no real difference)
The test will give us a p-value: how likely are the collected data if H0 is true?
If this is unlikely (small p-value), we reject H0
THE LURE OF THE P-VALUE
The p-value is the probability of having observed data at least as extreme as ours when the null hypothesis is true
Typically, if the p-value is less than 0.05, people say that the trial gives statistically significant evidence that there is a difference
Tend to ignore results where the p-value is greater than 0.05
However, 0.05 is a purely arbitrary value, and not really that small: when H0 is true, one time in twenty we will reject it wrongly!
That is, state that a difference exists when one doesn't (false positive)
Don't become wedded to the p-value: there is not much difference between 0.051 and 0.049
ESTIMATE OF TREATMENT EFFECT
Better still, use the data collected in the trial to give an estimate of the treatment effect size, together with a measure of how certain we are of our estimate
CONFIDENCE INTERVALS (CI)
To assess the true treatment effect, we calculate the confidence interval for our point estimate
A CI is a range of values within which the true treatment effect is believed to be found, with a given level of confidence
A 95% CI is a range of values calculated so that, over repeated trials, it will contain the true treatment effect 95% of the time
Generally, a 95% CI is calculated as Sample Estimate ± 1.96 × Standard Error
Use the confidence interval to assess the true treatment effect, and not just p-values
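The general formula above can be sketched in a few lines of Python. The estimate and standard error used here are hypothetical numbers, chosen only to show the arithmetic; they do not come from any trial in these slides.

```python
def confidence_interval(estimate, standard_error, z=1.96):
    """Return (lower, upper) for estimate +/- z * standard error.

    z=1.96 gives an approximate 95% interval under a Normal assumption.
    """
    margin = z * standard_error
    return estimate - margin, estimate + margin


# Hypothetical point estimate and standard error (illustration only).
lower, upper = confidence_interval(estimate=10.0, standard_error=2.0)
print(f"95% CI: {lower:.2f} to {upper:.2f}")
```

A smaller standard error (for example from a larger trial) gives a narrower interval around the same estimate.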
DATA ANALYSIS

How do we do this? What type of analysis should be performed?
Depending on the sort of outcome measure, different types of analysis are appropriate
Because the actual analyses are now done mainly by computer, the skill is now in:
Choosing the appropriate test
Correctly interpreting the results
COMMON OUTCOME MEASURES
Categorical
Continuous
Survival
CATEGORICAL DATA
Outcomes like good/bad, yes/no or present/absent
In testing categorical data, we are looking to see if there is any relationship between the outcome category and the treatment given

H0: No association between variables
H1: Association between variables

For categorical data, the chi-squared test is appropriate if the categories aren't ordered
For ordered categories, use a trend test
ISIS TRIAL OF ASPIRIN TO PREVENT MORTALITY AFTER MI

             Dead    Alive    Total
Aspirin       804     7783     8587
No Aspirin   1016     7584     8600
Total        1820   15,367   17,187

- Use chi-squared test of association to determine whether to reject the null hypothesis of no association between aspirin and death
ISIS TRIAL OF ASPIRIN TO PREVENT MORTALITY AFTER MI

             Dead              Alive              Total
Aspirin       804 (E=909.3)    7783 (E=7677.7)     8587
No Aspirin   1016 (E=910.7)    7584 (E=7689.3)     8600
Total        1820            15,367              17,187

- Use chi-squared test of association to determine whether to reject the null hypothesis of no association between aspirin and death
- X² (1 df) = (804 − 909.3)² / 909.3 + … + (7584 − 7689.3)² / 7689.3 = 27.26
- X² (1 df) = 27.26 (P<0.0001)
- Strong evidence of an association between aspirin and mortality
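As a rough check of the chi-squared calculation, the sketch below recomputes the expected counts (row total × column total / overall total) and the test statistic from the ISIS table using only the Python standard library. The function name is illustrative, and the P-value line relies on the identity P(X > x) = erfc(√(x/2)), which holds only for 1 degree of freedom.

```python
import math


def chi_squared_2x2(table):
    """table = [[a, b], [c, d]]; return (statistic, P-value on 1 df)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if H0 (no association) is true.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    # Upper tail of the chi-squared distribution, valid for 1 df only.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Rows: aspirin, no aspirin. Columns: dead, alive.
isis = [[804, 7783], [1016, 7584]]
statistic, p_value = chi_squared_2x2(isis)
print(f"chi-squared = {statistic:.2f} on 1 df, P = {p_value:.2g}")
```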
MEASURES OF TREATMENT EFFECT
Tested hypothesis and found strong evidence of an association between aspirin use and mortality
Not very informative: is aspirin harmful or beneficial?
Various measures of treatment effect:
Absolute Risk Reduction
Number Needed to Treat
Relative Risk
Relative Risk Reduction
Odds Ratio
Odds Reduction
ODDS RATIO & ODDS REDUCTION
Odds ratio = (804 × 7584) / (7783 × 1016) = 0.77

<1, so odds of dying are smaller with aspirin
95% CI for the odds ratio = 0.70 to 0.85
Estimate of treatment effect
Odds reduction = 23%
With the true treatment effect, based on the CI, ranging from a 15% reduction to a 30% reduction in the odds of death with aspirin

Moderate treatment effect, narrow-ish CI and P<0.0001
Good evidence that aspirin reduces risk of death following MI
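The odds ratio and its interval can be reproduced with a short sketch. The slides do not say how the confidence interval was obtained; the code assumes the common log-odds (Woolf) approximation, where the standard error of log(OR) is √(1/a + 1/b + 1/c + 1/d). That choice is an assumption, although it does give the 0.70 to 0.85 quoted above.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """a, b = dead/alive on treatment; c, d = dead/alive on control.

    Returns (odds ratio, lower, upper) using the log-odds approximation
    (an assumed method, not stated in the slides).
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


odds_ratio, lower, upper = odds_ratio_ci(804, 7783, 1016, 7584)
print(f"OR = {odds_ratio:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
print(f"Odds reduction = {(1 - odds_ratio) * 100:.0f}%")
```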
SUMMARISING BINARY DATA IN TWO GROUP PROSPECTIVE STUDY

Risk in standard treatment (P1) and risk in new treatment (P2)

Absolute Risk Reduction (ARR)
Formula: P1 - P2
ISIS example: 0.118 - 0.094 = 0.024 (i.e. 2.4% in favour of new Rx)

Number needed to treat/harm (NNT/NNH)
Formula: 1 / |P1 - P2|
ISIS example: 1 / 0.024 = 41.7, so NNT = 42 (i.e. need to treat 42 patients with aspirin in order to prevent 1 death)

Relative Risk (RR)
Formula: P2 / P1
ISIS example: 0.094 / 0.118 = 0.80 (<1) (i.e. risk of death lower with aspirin)

Relative Risk Reduction (RRR)
Formula: (P1 - P2) / P1
ISIS example: (0.118 - 0.094) / 0.118 = 0.20 (i.e. aspirin reduces the risk of death by 20%)
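The four formulas can be put together in one small function. This sketch uses the rounded risks from the ISIS example (P1 = 0.118, P2 = 0.094); the function name and dictionary keys are illustrative, and NNT is rounded up to a whole patient as in the example.

```python
import math


def risk_measures(p1, p2):
    """p1 = risk on standard treatment, p2 = risk on new treatment."""
    arr = p1 - p2  # absolute risk reduction
    return {
        "ARR": arr,
        "NNT": math.ceil(1 / abs(arr)),  # whole patients, rounded up
        "RR": p2 / p1,
        "RRR": arr / p1,
    }


measures = risk_measures(p1=0.118, p2=0.094)
for name, value in measures.items():
    print(f"{name} = {value:.3g}")
```

Because the example rounds the risks before dividing, recomputing from the raw counts can shift the NNT by a patient or so.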
CONTINUOUS DATA
Outcomes like blood pressure, weight or scores, summarised using measures of the centre and spread of the distribution
Measures of the centre of the distribution

Mean: what we think of as an average; add up all data and divide by number of items
Median: midpoint of the data; half the data lie below the median, and the other half above
Mode: most popular observation
Measures of spread

Variance and standard deviation
Standard deviation is roughly the average distance of individual observations from the mean
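These summaries are available in Python's standard `statistics` module. The blood pressure readings below are hypothetical values made up for illustration.

```python
import statistics

# Hypothetical diastolic BP readings in mmHg (illustration only).
readings = [88, 90, 90, 92, 95, 97]

mean = statistics.mean(readings)          # sum of values / number of values
median = statistics.median(readings)      # midpoint of the sorted data
mode = statistics.mode(readings)          # most frequent value
variance = statistics.variance(readings)  # sample variance (divides by n - 1)
sd = statistics.stdev(readings)           # square root of the sample variance

print(f"mean={mean}, median={median}, mode={mode}")
print(f"variance={variance:.1f}, SD={sd:.2f}")
```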
CONTINUOUS DATA
In continuous data, we are comparing the means in the two groups and assessing whether the two groups come from the same population
H0: Mean A = Mean B
H1: Mean A ≠ Mean B
Use Student's t-test
ANOVA if comparing >2 treatment groups
NORMAL DISTRIBUTION

T-test and ANOVA assume data are Normally distributed
However, if the data are very skewed or have multiple peaks, we use a non-parametric test, which doesn't assume any particular shape for the data
Wilcoxon
Mann-Whitney
As a rule, non-parametric tests are more general, but less sensitive
STUDY COMPARING TWO ANTIHYPERTENSIVE DRUGS ON BP

Diastolic BP compared in two groups of hypertensive patients given two different drug treatments
              N    Mean      SD
Treatment A   41   91mmHg    5.5
Treatment B   43   95mmHg    5.5

- Use Student's t-test to assess whether means are from the same population (i.e. Mean with Treatment A = Mean with Treatment B)
TESTING FOR A DIFFERENCE

Treatment A: N=41, Mean=91mmHg, SD=5.5
Treatment B: N=43, Mean=95mmHg, SD=5.5

Use t-test to assess evidence for or against null hypothesis (mean A = mean B)
t = -3.33 on 82 df (df = n1 + n2 - 2)
P=0.0013
So there is evidence against H0
Evidence that the mean diastolic BP differs between the two treatment groups
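The t statistic can be recomputed from the summary figures alone. This is a minimal sketch assuming the usual pooled-variance (equal variance) form of Student's t-test; the function name is illustrative, and the P-value is left to statistical tables or software.

```python
import math


def pooled_t_statistic(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic with pooled variance; returns (t, df)."""
    df = n1 + n2 - 2
    pooled_variance = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se_difference = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se_difference, df


t, df = pooled_t_statistic(n1=41, mean1=91, sd1=5.5, n2=43, mean2=95, sd2=5.5)
print(f"t = {t:.2f} on {df} df")
```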
MEASURE OF TREATMENT EFFECT

Tested hypothesis and found evidence that mean diastolic BP in the two groups is different
Not very informative: which of treatment A or B is better?
Point estimate of the treatment effect: calculate the difference between the two means and the confidence interval

Difference = 91 - 95mmHg = -4mmHg (favours treatment A)
95% CI: -6.39 to -1.61mmHg

So the difference in mean diastolic BP between groups is statistically significant (P=0.0013), with treatment A being more effective in reducing diastolic BP
However, the observed difference of 4mmHg in favour of treatment A could be as small as 1.6mmHg or as large as 6.4mmHg
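The confidence interval for the difference in means follows the same pattern as any other CI: estimate ± critical value × standard error. The sketch below assumes a pooled standard error and takes the critical value as an input; 1.989 is the two-sided 5% point of the t distribution on 82 df as read from t tables, which is an assumption about how the interval was obtained.

```python
import math


def mean_difference_ci(n1, mean1, sd1, n2, mean2, sd2, t_critical):
    """Return (difference, lower, upper) using a pooled standard error."""
    df = n1 + n2 - 2
    pooled_variance = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se_difference = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    difference = mean1 - mean2
    margin = t_critical * se_difference
    return difference, difference - margin, difference + margin


# t_critical=1.989: assumed value for 82 df, taken from t tables.
difference, lower, upper = mean_difference_ci(
    41, 91, 5.5, 43, 95, 5.5, t_critical=1.989
)
print(f"Difference = {difference} mmHg, 95% CI {lower:.2f} to {upper:.2f} mmHg")
```

Using 1.96 instead of the t value would give a slightly narrower interval, which matters more in small trials.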
SURVIVAL DATA
Why are survival data different?

Interested in studying the time between randomisation and a subsequent event (say death)
These times are unlikely to be normally distributed
Cannot afford to wait until events have happened to all subjects, for example until all are dead
Some people may have left the study early and become lost to follow-up: the only information we have about some patients is that they were still alive at last follow-up
Use survival analysis methods to analyse time to event data, not just the number of events
Take into account that not all patients may have had an event
KAPLAN-MEIER SURVIVAL ANALYSIS
Basic idea: we split the trial up into distinct time intervals
In each time interval, a certain number of patients, N, enter that time period alive and still on follow-up, and some of these, D, have an event
Then the probability of surviving that time interval (assuming you live that long) is (1 - D/N)
Multiply all these probabilities together to give the probability of survival up to a given time point
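The multiplication of (1 - D/N) terms can be sketched directly. The follow-up times below are hypothetical; `events` marks whether each patient had the event (True) or was censored at that time (False), and the function name is illustrative.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each observed event time."""
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        # N: patients still alive and on follow-up just before time t.
        at_risk = sum(1 for u in times if u >= t)
        # D: patients who have the event at time t (censored ones excluded).
        died = sum(1 for u, e in zip(times, events) if u == t and e)
        if died:
            survival *= 1 - died / at_risk
            curve.append((t, survival))
    return curve


# Hypothetical follow-up times in weeks (illustration only).
times = [5, 8, 8, 12, 15, 20]
events = [True, True, False, True, False, True]
for t, s in kaplan_meier(times, events):
    print(f"week {t}: S = {s:.3f}")
```

Censored patients contribute to N for as long as they are followed, but never to D, which is how the method uses partial information.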
EXAMPLE SURVIVAL FUNCTION

[Figure: survival function. Cumulative survival (0.0 to 1.0) plotted against time in weeks (0 to 120), with censored observations marked]
AIM-HIGH TRIAL OF INTERFERON FOR MALIGNANT MELANOMA

          Dead   Alive   Total   Median Survival
IFN        151     187     338   ~4 years
No IFN     156     180     336   ~4 years
Total      307     367     674

Want to assess whether the time to death is the same for the two treatments
COMPARING SURVIVAL BETWEEN GROUPS
We will have two graphs: how do we say whether one group survives longer than the other?
Could do one test at, say, 1 year; compare proportions (as before)
Could keep testing at small intervals
What are the drawbacks to these methods?
Use logrank test to determine whether the survival function is the same for the two treatment groups

H0: Survival function/curve same for both groups
H1: Survival function/curve different across groups
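One way to see what the logrank test does is to compare, at every event time, the deaths observed in one group with the number expected if both groups shared the same survival curve. The sketch below is a simplified two-group version using hypothetical data, not the trial's patient records; the P-value uses the 1 df chi-squared tail, erfc(√(x/2)).

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group logrank test; returns (chi-squared on 1 df, P-value)."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    observed_a = expected_a = variance = 0.0
    for t in sorted({u for u, e in zip(all_times, all_events) if e}):
        n_a = sum(1 for u in times_a if u >= t)  # at risk in group A
        n_b = sum(1 for u in times_b if u >= t)  # at risk in group B
        d_a = sum(1 for u, e in zip(times_a, events_a) if u == t and e)
        d_b = sum(1 for u, e in zip(times_b, events_b) if u == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n  # deaths expected in A under H0
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    statistic = (observed_a - expected_a) ** 2 / variance
    return statistic, math.erfc(math.sqrt(statistic / 2))


# Hypothetical follow-up times (illustration only); False = censored.
group_a = [6, 7, 10, 15, 19, 25]
events_a = [True, False, True, True, False, True]
group_b = [1, 3, 4, 8, 11, 12]
events_b = [True, True, True, True, False, True]
statistic, p_value = logrank_test(group_a, events_a, group_b, events_b)
print(f"logrank chi-squared = {statistic:.2f}, P = {p_value:.3f}")
```

Unlike testing at a single time point, this uses the whole follow-up period and needs only one test.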
SURVIVAL IN MELANOMA: INTERFERON VS. OBSERVATION
MEASURE OF TREATMENT EFFECT

Assessed the evidence and found no evidence that time to death differs between the treatment groups
Despite the lack of difference, should still calculate point estimate and confidence interval for treatment effect
Use Cox regression to calculate hazard ratio and confidence interval
HR = 0.94 (CI = 0.75 to 1.18)
IFN non-significantly reduces the risk of death by 6%, with the true treatment effect based on the confidence interval ranging from a 25% reduction in mortality to an adverse 18% increase in mortality with IFN.
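The percentages in that interpretation come straight from the hazard ratio: a ratio below 1 is a reduction of (1 - HR) × 100%, and a ratio above 1 is an increase of (HR - 1) × 100%. A minimal sketch, with an illustrative function name:

```python
def describe_hazard_ratio(hazard_ratio):
    """Express a hazard ratio as a percentage change in the risk of death."""
    change = (hazard_ratio - 1) * 100
    direction = "reduction" if change < 0 else "increase"
    return f"{abs(change):.0f}% {direction}"


# Point estimate and confidence limits quoted for IFN.
for hr in (0.94, 0.75, 1.18):
    print(f"HR {hr}: {describe_hazard_ratio(hr)}")
```

Because the interval includes 1 (no effect), it spans both benefit and harm.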
ANALYSES GOOD PRACTICE
Report the primary/secondary outcomes as stated in the protocol
Don't give minor endpoints undue prominence in the paper
Do not explore all endpoints until you find one that is significant (data dredging)

Looking at multiple outcomes increases the chance of finding something significant
In 20 outcomes, on average 1 will be significant at P<0.05 just by chance
Is this real, or the play of chance?
Solution: Don't have too many endpoints
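The effect of testing many endpoints can be put into numbers. This sketch assumes the outcomes are independent and that no treatment effect exists for any of them, which is a simplification; real trial endpoints are usually correlated.

```python
def chance_of_false_positive(n_outcomes, alpha=0.05):
    """Probability that at least one of n independent tests has P < alpha
    when there is truly no effect on any outcome."""
    return 1 - (1 - alpha) ** n_outcomes


for n in (1, 5, 20):
    expected = n * 0.05  # average number of chance "significant" results
    print(f"{n} outcomes: expect {expected:.2f} false positives, "
          f"P(at least one) = {chance_of_false_positive(n):.2f}")
```

With 20 outcomes, the chance of at least one spurious significant result is roughly 64% under these assumptions.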
ANALYSES GOOD PRACTICE
Give confidence intervals where possible, and not just p-values
Keep subgroup analyses to a minimum
Subgroup analyses should be pre-specified
When interpreting subgroups, assess the whole picture
Do not focus upon one subgroup and individual p-values
FINAL WORDS
The idea of statistics is to look at the strength of the evidence for a given hypothesis and determine the reliability of the treatment effect observed in the trial
Calculations are based on formulas, but the application of the formulas and the interpretation of the results is an art rather than a science
Significance is not black and white
P>0.05 is not evidence of absence of effect, merely absence of evidence of an effect
A little common sense can go a long way in medical statistics
If in doubt, ask a statistician!
To call in the statistician after the experiment is done may be no more than asking him to perform a post mortem examination: he may be able to say what the experiment died of.
Sir R.A. Fisher
Indian Statistical Congress, Sankhya, c. 1938
BOOK LIST
Swinscow TDV and Campbell MJ. Statistics at Square One (10th edition). BMJ Books 2002
Campbell MJ. Statistics at Square Two. BMJ Books 2001
Altman D, Machin D, Bryant T and Gardner M. Statistics with Confidence. BMJ Books 2000
Pereira-Maxwell F. A-Z of Medical Statistics. Arnold 1998
