What we intend to do with statistical inference
A typical setting of the NHST paradigm
• Specification of H0 and H1
• Calculate the p-value or the 95% confidence interval (CI)
• Reject H0 if p < 0.05 or the CI does not contain zero, so that H1 is accepted; otherwise
accept H0.
• By accepting H0 we assume/conclude that the difference or the treatment effect
found in sample data is due to chance; otherwise, we claim that the
difference/effect is ‘real’ or ‘true’.
• Type I error (the α value); Type II error (the β value); statistical power (i.e., 1-β);
sample size; effect size.
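The steps above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are not part of the poster: a two-sided two-sample z-test with a known common standard deviation and equal group sizes. The function names `two_sample_z_test` and `power_two_sample_z` are hypothetical, chosen for this sketch.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def two_sample_z_test(mean_diff, sd, n_per_group, alpha=0.05):
    """Two-sided z-test for a difference in group means.

    Assumes (for illustration only) a known common SD and equal group sizes.
    Returns the p-value, the (1 - alpha) confidence interval, and the
    conventional NHST decision.
    """
    se = sd * sqrt(2.0 / n_per_group)           # standard error of the difference
    z = mean_diff / se
    p_value = 2.0 * (1.0 - STD_NORMAL.cdf(abs(z)))
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    ci = (mean_diff - z_crit * se, mean_diff + z_crit * se)
    reject_h0 = p_value < alpha                 # equivalent to "CI excludes zero"
    return p_value, ci, reject_h0


def power_two_sample_z(effect_size_d, n_per_group, alpha=0.05):
    """Power (1 - beta) of the same test for a standardized effect size d."""
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    shift = effect_size_d * sqrt(n_per_group / 2.0)
    # Probability that the test statistic falls in either rejection region.
    return (1.0 - STD_NORMAL.cdf(z_crit - shift)) + STD_NORMAL.cdf(-z_crit - shift)


p, ci, reject = two_sample_z_test(mean_diff=0.5, sd=1.0, n_per_group=64)
print(f"p = {p:.4f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f}), reject H0: {reject}")
print(f"power for d = 0.5, n = 64 per group: {power_two_sample_z(0.5, 64):.3f}")
```

When the effect size is zero, the power computed this way reduces to α, which is the Type I error rate; for a nonzero effect, power grows with sample size.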
The NHST paradigm, which came into being around the 1950s, is actually a hybrid of
R.A. Fisher’s logic of significance testing and J. Neyman and E.S. Pearson’s logic of
hypothesis testing; it gradually became the sine qua non of scientific research [2][3].
A hypothetical example of NHST using G*Power 3.1
• Statistical analysis results based on a single set of sample data can hardly
provide confirmatory evidence about scientific findings [6][7].
• Given that all other things are correct (e.g., data quality, model specification and
assumption conditions, etc.), the calculation of p-values depends on at least three
elements: the raw effect measure (e.g., mean difference, correlation, odds ratio,
etc.); the variance of the effect; and the sample size [7]. In particular, increasing the sample
size always results in a decreasing p-value, other things being unchanged [5].
• In real-life cases, however, H0 (i.e., no difference, or everything is equal) is almost always false,
so that the Type I error α = 0 [5][6]. In these cases, the calculation of p-values is conceptually
inappropriate, which is the fundamental reason for the large sample size dilemma (with a
sufficiently large sample size you can make any result become statistically significant! [5])
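The dependence of the p-value on sample size can be illustrated with a short Python sketch. It assumes, purely for illustration, a two-sided two-sample z-test with a known common standard deviation, and holds a deliberately tiny mean difference fixed while only the sample size grows. The function name `p_value_for_fixed_effect` and the numbers used are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def p_value_for_fixed_effect(mean_diff, sd, n_per_group):
    """Two-sided z-test p-value for a fixed observed difference and SD.

    Assumes (for illustration only) a known common SD and equal group sizes.
    """
    se = sd * sqrt(2.0 / n_per_group)   # standard error shrinks as n grows
    z = abs(mean_diff) / se
    return 2.0 * (1.0 - NormalDist().cdf(z))


# A practically negligible difference: 0.05 standard deviations.
sample_sizes = [10, 100, 1000, 10000]
p_values = [p_value_for_fixed_effect(0.05, 1.0, n) for n in sample_sizes]

for n, p in zip(sample_sizes, p_values):
    print(f"n per group = {n:>5}: p = {p:.4g}")
```

With the effect and variance held fixed, the p-value shrinks monotonically as n grows; in this sketch the same negligible difference drops below 0.05 somewhere between 1,000 and 10,000 observations per group.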
Analysis of a hypothetical ANOVA case:
Moving to a world beyond ‘p < 0.05’: Statistical data analysis without significance test
• Context is king in statistics! “Whatever the statistics show, it is fine to suggest reasons for your results,
but discuss a range of potential explanations, not just favoured ones. Inferences should be scientific, and that goes
far beyond the merely statistical. Factors such as background evidence, study design, data quality and understanding
of underlying mechanisms are often more important than statistical measures such as P values or intervals.” [7]
We shall not attempt to find/develop a magic alternative to NHST because it does not exist [3][4].
What-if analysis (NOT confirmatory analysis) is the very nature of statistical inference for a single (non-repeated) set of sample data! [8]
References:
[1] Ronald L. Wasserstein, Allen L. Schirm & Nicole A. Lazar (2019). Moving to a World Beyond “p<0.05”. The American Statistician, Vol. 73, No. S1, 1-19: Editorial.
[2] Steven N. Goodman (1999). Toward Evidence-Based Medical Statistics. 1: The P Value Fallacy. Annals of Internal Medicine, Vol. 130, No. 12, 995-1004.
[3] Gerd Gigerenzer (1993). The Superego, the Ego, and the Id in Statistical Reasoning. Print publication date: 2002; DOI: 10.1093/acprof:oso/9780195153729.001.0001.
[4] Sander Greenland (2017). Invited Commentary: The Need for Cognitive Science in Methodology. American Journal of Epidemiology, Vol. 186, No. 6.
[5] R.E. Kirk (1996). Practical significance: A concept whose time has come. Educational and Psychological Measurement, Vol. 56, 746-759.
[6] Marks R. Nester (1996). An Applied Statistician’s Creed. Journal of the Royal Statistical Society. Series C (Applied Statistics), Vol. 45, No. 4, 401-410.
[7] Valentin Amrhein, Sander Greenland, Blake McShane (2019). Retire statistical significance. Nature, Vol. 567, 305: Comment.
[8] Valentin Amrhein, David Trafimow & Sander Greenland (2019). Inferential Statistics as Descriptive Statistics: There Is No Replication Crisis if We Don’t Expect Replication. The American Statistician, Vol. 73, No. S1, 262-270, DOI: 10.1080/00031305.2018.1543137
Contact details: Dr. John Xie (Statistics Support Officer, Quantitative Consulting Unit)
Phone: +61 2 69332229
Email: gxie@csu.edu.au
Website: https://www.csu.edu.au/qcu
www.csu.edu.au