Statistical Fallacies and Errors in Medical Research
By Muhammad Irfan Abdul Jalal
OUTLINE OF THE TALK
• Motivation
• Statistical Questions & Approaches
• Criticisms of frequentist methodology
• Bayesian Paradigm
• Recent Bayesian methodological developments
SAMPLE SIZE ISSUES
• Consequences of small sample sizes: Large treatment effects that are not replicable in
subsequent research (Pereira, Horwitz, Ioannidis 2012)
STUDY DESIGNS
• Biased samples
FAILURE TO CHECK FOR BIASES
• Selection bias and confirmation bias
• Simpson’s paradox, Berksonian bias and the Hawthorne effect
• Lead-time bias
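Simpson’s paradox can be made concrete with the classic kidney-stone data (Charig et al. 1986, as commonly cited in discussions of the paradox): treatment A beats B within each stone-size subgroup, yet B looks better overall because A was given mainly to the harder, large-stone cases.

```python
# Simpson's paradox: (successes, total) by treatment and stone size.
# Data: Charig et al. (1986), as commonly cited for this paradox.
a_small, a_large = (81, 87), (192, 263)
b_small, b_large = (234, 270), (55, 80)

def rate(successes_total):
    s, n = successes_total
    return s / n

# Treatment A wins in BOTH subgroups...
print(f"small stones: A={rate(a_small):.3f} vs B={rate(b_small):.3f}")
print(f"large stones: A={rate(a_large):.3f} vs B={rate(b_large):.3f}")

# ...but B wins when the subgroups are pooled (confounding by severity).
a_all = (a_small[0] + a_large[0], a_small[1] + a_large[1])
b_all = (b_small[0] + b_large[0], b_small[1] + b_large[1])
print(f"overall:      A={rate(a_all):.3f} vs B={rate(b_all):.3f}")
```

The reversal arises because stone size confounds the comparison: it influences both the choice of treatment and the chance of success.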
P VALUE MISINTERPRETATIONS
• Part of the Null Hypothesis Significance Testing (NHST) paradigm.
• “Fathers” of NHST – Ronald Fisher, Egon S. Pearson (Karl Pearson’s son), Jerzy Neyman
• Followed Karl Popper’s philosophy of falsification – a single black swan falsifies the
hypothesis that all swans are white, but no number of white swans can confirm it.
• You can REJECT or FAIL TO REJECT A NULL HYPOTHESIS (H0)
P VALUE MISINTERPRETATIONS
• Twelve misconceptions of p values (Goodman 2008):
P-HACKING
• Causes (Head et al. 2015):
• Evidence: the p-curve
P-CURVE
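One cause of p-hacking listed by Head et al. is optional stopping: testing repeatedly as data accumulate and stopping as soon as p < 0.05. A minimal simulation (standard library only; the scenario of a z-test with known σ = 1, peeking every 10 observations, is illustrative and not from the talk) shows how this inflates the false-positive rate well above the nominal 5%.

```python
import random
from statistics import NormalDist

random.seed(1)
norm = NormalDist()

def two_sided_p(mean, n):
    """Two-sided z-test p-value for H0: mu = 0, known sigma = 1."""
    z = abs(mean) * n ** 0.5
    return 2 * (1 - norm.cdf(z))

def experiment(peeks=5, step=10):
    """Collect data in batches, testing after each batch; stop at p < .05."""
    data = []
    for _ in range(peeks):
        data += [random.gauss(0, 1) for _ in range(step)]  # H0 is TRUE
        if two_sided_p(sum(data) / len(data), len(data)) < 0.05:
            return True  # declared "significant" -> false positive
    return False

n_sims = 2000
fp_rate = sum(experiment() for _ in range(n_sims)) / n_sims
print(f"False-positive rate with optional stopping: {fp_rate:.3f} (nominal 0.05)")
```

Under the null, every "discovery" here is a false positive, yet the realised rate is roughly double or triple the nominal 5% level.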
MULTIPLE COMPARISONS
• Example: stepwise regression modelling
• Problems:
Bias in parameter estimation
Inconsistencies among model selection algorithms
Inappropriate focus on, or reliance on, a single “best” model
• Remedies:
Avoid stepwise regression; use other, more appropriate variable selection techniques such as
the Lasso and Elastic Net.
For model fit assessment, use a separate data set (test set) or cross-validation (leave-one-out
cross-validation, bootstrap CV)
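The multiple-comparisons problem follows from simple probability: with m independent tests each at level α, the family-wise error rate (the chance of at least one false positive) is 1 − (1 − α)^m. A short calculation, including the Bonferroni remedy:

```python
# Family-wise error rate (FWER) for m independent tests at level alpha,
# and the Bonferroni-corrected per-test level that restores control.
alpha, m = 0.05, 20

fwer = 1 - (1 - alpha) ** m           # P(at least one false positive)
bonferroni = alpha / m                # corrected per-test threshold
fwer_corrected = 1 - (1 - bonferroni) ** m

print(f"FWER with {m} uncorrected tests: {fwer:.3f}")   # ~0.642
print(f"Bonferroni per-test alpha: {bonferroni:.4f}")   # 0.0025
print(f"FWER after correction: {fwer_corrected:.3f}")   # <= 0.05
```

So with 20 uncorrected tests, a false positive is more likely than not even when every null hypothesis is true.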
OVER-INTERPRETATION OF NON-SIGNIFICANT
RESULTS
STATISTICAL MODEL ASSUMPTIONS
• Linear regression (residual checking – linearity, independence, normality, equal variances
(LINE))
• Proportional hazard model (proportional hazard check – Schoenfeld residuals)
• Influence diagnostics for influential points (plots of studentised residuals, deviance statistics,
dfbetas, etc.)
• Nonparametric tests – based on ranks of the observations (the homogeneity-of-variance
assumption is critical here)
• What are the alternatives when model assumptions are not fulfilled?
Weighted least squares regression (for heteroscedastic residuals), since Ordinary Least
Squares is no longer the “Best Linear Unbiased Estimator (BLUE)”
Time-varying covariate models (when the PH assumption is violated)
Remove or retain influential observations? (should be prespecified at the pre-analysis stage)
Use of more robust statistical methods that are less reliant on statistical model assumptions
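The weighted least squares remedy can be sketched in a few lines: each observation is weighted by (roughly) the inverse of its residual variance, and the weighted normal equations are solved in closed form. The data and the assumed variance structure below are hypothetical, purely for illustration.

```python
def wls(x, y, w):
    """Weighted least squares fit of y = a + b*x, weights w ~ 1/Var(residual)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw   # weighted means
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    a = ybar - b * xbar
    return a, b

# Illustrative data: noise variance assumed to grow with x,
# so weights shrink with x (hypothetical variance model).
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.5]
w = [1 / xi for xi in x]
a, b = wls(x, y, w)
print(f"intercept = {a:.3f}, slope = {b:.3f}")
```

With all weights equal, `wls` reduces to ordinary least squares; unequal weights simply down-weight the noisier observations.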
CORRELATION AND CAUSATION
CIRCULAR INFERENCE
• Also known as “double dipping”
CIRCULAR INFERENCE
STATISTICAL ISSUES IN MACHINE
LEARNING
STATISTICAL QUESTIONS
• What is our estimate of the prevalence of diabetes in Malaysia?
(Point estimation problem)
• What is a plausible range of values for our estimated prevalence of
diabetes in the Malaysian population to reflect our degree of
uncertainty? (Interval estimation / uncertainty quantification)
• What is the probability that a future patient with a BMI of 27, aged
56, and with a history of gestational diabetes will develop diabetes?
(Prediction problem)
• Is the prevalence of diabetes rising? (hypothesis testing problem)
STATISTICAL APPROACHES
• Frequentist and Bayesian methods address all these types of
problems: point estimation, interval estimation, prediction and
hypothesis testing. But they utilize different approaches to do so.
• Some typical frequentist approaches:
-Least squares method (point estimation)
-Maximum likelihood estimation (point estimation)
-Confidence interval (interval estimation)
-Test statistics and p values (hypothesis testing)
• Bayesian statistics does not rely on these familiar frequentist tools.
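To make the frequentist toolkit concrete for the prevalence question above, here is a sketch (with made-up survey numbers) of the maximum likelihood estimate of a proportion and its 95% Wald confidence interval, using only the standard library:

```python
from statistics import NormalDist

# Hypothetical survey: 184 diabetics among 1000 sampled adults.
cases, n = 184, 1000

p_hat = cases / n                       # maximum likelihood estimate (point)
se = (p_hat * (1 - p_hat) / n) ** 0.5   # standard error of p_hat
z = NormalDist().inv_cdf(0.975)         # ~1.96

lower, upper = p_hat - z * se, p_hat + z * se  # 95% Wald confidence interval
print(f"MLE = {p_hat:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

This single calculation covers two of the approaches listed: maximum likelihood for point estimation and a confidence interval for uncertainty quantification.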
BAYESIAN STATISTICS: A VERY BRIEF
HISTORY
• The work of Reverend Thomas Bayes (1702-1761) was published in
1764, 3 years after his death.
• Bayes’ solution to a problem of “inverse probability” was presented in
the Essay towards Solving a Problem in the Doctrine of Chances (1764)
which was published posthumously by his friend, Richard Price, in the
Philosophical Transactions of the Royal Society of London.
• This work gives a key result in Bayesian Statistics: BAYES THEOREM
• Over the course of the next 100–150 years, it received little attention.
• In fact, some key figures in statistics – e.g. R.A. Fisher – outright
rejected the idea of Bayesian statistics.
• During WW2, some of the world leading Mathematicians resurrected
Bayes’ rules in deepest secrecy to crack the coded messages of the
Germans.
THOMAS BAYES (PILFERED FROM
WIKIPEDIA)
BAYESIAN STATISTICS: A BRIEF HISTORY
• Alan M. Turing (1912 – 1954) – mathematician working at Bletchley
Park (The Imitation Game, portrayed by Benedict T.C. Cumberbatch)
• Designed the bombe – an electro-mechanical machine for testing
every possible permutation of a message produced on the Enigma
machine – could take up to 4 days to decode a message
• New system: Banburismus (named after Banbury, England, where the
work was done) – Bayesian methods [using the Bayes factor] were used
to quantify the belief in guesses of a stretch of letters in an
Enigma message
• Certain permutations that were unlikely to be the original message
were “thrown out” before they were even tested.
• Greatly reduced the time it took to crack Enigma codes
ALAN TURING (WIKIPEDIA)
CRITICISMS OF FREQUENTIST APPROACH: BMI
EXAMPLE
• Pooled standard deviation: s_pooled = 4.099
CRITICISMS OF FREQUENTIST APPROACH: BMI
EXAMPLE
• Test statistic: t = (x̄1 − x̄2) / (s_pooled √(1/n1 + 1/n2)) = 2.147
• Using R, we obtain the p value = 0.0448 (based on 19 df)
• Therefore we reject H0 at 5% level and thus conclude that there is a
significant difference between the mean BMI of these two
populations. However if we work at the 1% level, we have to retain
Ho!
• This indicates the arbitrariness of fixing the level of significance at
0.05.
• Besides, what is the real interpretation of p value?
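As a cross-check of the quoted test statistic, the two-sided p-value P(|T₁₉| > 2.147) can be computed from the regularized incomplete beta function, here approximated by simple midpoint-rule integration with only the standard library (a numerical sketch, not a production routine):

```python
from math import gamma

def reg_inc_beta(x, a, b, steps=200_000):
    """Regularized incomplete beta I_x(a, b) by midpoint-rule integration."""
    h = x / steps
    total = sum(((i + 0.5) * h) ** (a - 1) * (1 - (i + 0.5) * h) ** (b - 1)
                for i in range(steps))
    return total * h * gamma(a + b) / (gamma(a) * gamma(b))

def t_two_sided_p(t, df):
    """P(|T_df| > t) = I_x(df/2, 1/2) with x = df / (df + t^2)."""
    x = df / (df + t * t)
    return reg_inc_beta(x, df / 2, 0.5)

p = t_two_sided_p(2.147, 19)
print(f"two-sided p = {p:.4f}")  # close to 0.045: reject at 5%, retain at 1%
```

The result sits between 0.01 and 0.05, which matches the slide's conclusion: significant at the 5% level, not at the 1% level.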
R.A FISHER’S TAKE ON P < 0.05
BMI EXAMPLE: CONFIDENCE INTERVAL
INTERPRETATION
• 95% CI: 28.17 ± 2.201 × SE, giving (25.28, 31.05)
• p(θ | data) = k θ^85 (1 − θ)^5, 0 < θ < 1
• We recognise the posterior distribution as a Beta distribution (more
specifically, Beta(86, 6))
• The posterior distribution is from the same family as prior distribution
(i.e. both prior and posterior are beta distribution).
• We can use the posterior distribution to obtain the posterior estimate
for the parameter θ and for inference purposes (e.g. to obtain a 95%
Bayesian Credible Interval for θ)
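A Monte Carlo sketch of that inference step: draw from the Beta(86, 6) posterior with the standard library's `random.betavariate` and read off the posterior mean and a 95% equal-tailed credible interval (sampling rather than closed-form quantiles, to stay dependency-free).

```python
import random

random.seed(42)

a, b = 86, 6  # posterior Beta(86, 6) from the example above
draws = sorted(random.betavariate(a, b) for _ in range(100_000))

post_mean = sum(draws) / len(draws)
lower = draws[int(0.025 * len(draws))]   # 2.5th percentile
upper = draws[int(0.975 * len(draws))]   # 97.5th percentile

print(f"posterior mean ~ {post_mean:.3f} (exact a/(a+b) = {a/(a+b):.3f})")
print(f"95% credible interval ~ ({lower:.3f}, {upper:.3f})")
```

Unlike a confidence interval, this interval has the direct interpretation: given the data and prior, θ lies in it with 95% posterior probability.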
BAYES FACTOR
• A counterpart to the p value in the frequentist paradigm
• Pr(H1 | data) / Pr(H2 | data) = B12 × Pr(H1) / Pr(H2)
• Posterior odds = Bayes factor (B12) × prior odds
• If the alternative hypothesis is H1 and the null hypothesis is H2, then the
Bayes factor can be interpreted as follows (Kass and Raftery 1995):
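The identity on this slide in code, with hypothetical numbers: prior odds of 1 (H1 and H2 equally likely a priori) and a Bayes factor B12 = 30, which falls in the "strong" band of the Kass–Raftery scale.

```python
# Posterior odds = Bayes factor x prior odds (all values here hypothetical).
prior_h1, prior_h2 = 0.5, 0.5
prior_odds = prior_h1 / prior_h2       # 1.0
bf12 = 30.0                            # "strong" evidence for H1 (Kass & Raftery 1995)

posterior_odds = bf12 * prior_odds
posterior_p_h1 = posterior_odds / (1 + posterior_odds)  # odds -> probability

print(f"posterior odds = {posterior_odds:.1f}")
print(f"Pr(H1 | data) = {posterior_p_h1:.3f}")  # 30/31
```

Note how the prior odds enter explicitly: the same Bayes factor yields different posterior probabilities under different priors, which is exactly what a p value cannot express.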