
STATISTICS IN PLAIN ENGLISH BY TIMOTHY C. URDAN

Summary of Statistical Significance, Effect Size and Confidence Intervals (Chapter 7)
By
BAKARE, Taofeeqah Ifelade.
2nd June, 2021.
INTRODUCTION

• Statistical significance, effect size and confidence intervals are three of the common tools used by
researchers to reach important conclusions about how well sample statistics generalize to the larger
population.
• Statistical significance provides a measure to help us decide whether what we observe in our
sample is also going on in the population that the sample is supposed to represent.
• Effect size is a measure of how large an observed effect is without regard to the size of the sample.
• A confidence interval tells you how confident you can be that the results from a poll or survey
reflect what you would expect to find if it were possible to survey the entire population.
DESCRIPTIVE AND INFERENTIAL STATISTICS

• Descriptive statistics can only summarize a sample’s characteristics, whereas inferential statistics use
your sample to make reasonable guesses about the larger population.
• With inferential statistics, it’s important to use random and unbiased sampling methods. If your
sample isn’t representative of your population, then you can’t make valid statistical inferences.
• Example of descriptive and inferential statistics: you might stand in a mall and ask a sample of
100 people if they like shopping at Shoprite. You could make a bar chart of the yes and no answers
(that would be descriptive statistics), or you could use your results to reason that around 75-80%
of the population (all shoppers in all malls) like shopping at Shoprite (that would be inferential
statistics).
HYPOTHESIS

A null hypothesis is a type of conjecture used in statistics that proposes that there is
no difference between certain characteristics of a population or data-generating
process. The alternative hypothesis proposes that there is a difference.
A two-tailed alternative hypothesis does not include any speculation about whether
the sample mean will be larger or smaller than the population mean, only that the two
differ.
A one-tailed alternative hypothesis includes speculation about which value will be
larger, i.e., it is directional.
TYPE 1 ERROR

• It means rejecting the null hypothesis when it’s actually true.
• It can be reduced by setting a lower significance level.
• It is also known as a false positive.
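
The link between the significance level and the Type I error rate can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: it uses a z-test with a known population standard deviation (a simplification that is not part of the chapter summary), and the function name, sample size and number of trials are hypothetical choices.

```python
import random
from statistics import NormalDist

def false_positive_rate(alpha, trials=5000, n=25, seed=0):
    """Fraction of simulated studies that reject a null hypothesis that is true."""
    rng = random.Random(seed)
    # Two-tailed critical z value for the chosen significance level.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # The null hypothesis is true here: the population mean really is 0.
        sample = [rng.gauss(0, 1) for _ in range(n)]
        # z statistic, assuming the population standard deviation is known (= 1).
        z = (sum(sample) / n) * n ** 0.5
        if abs(z) > z_crit:
            rejections += 1  # a false positive (Type I error)
    return rejections / trials

rate_05 = false_positive_rate(alpha=0.05)
rate_01 = false_positive_rate(alpha=0.01)
print(f"Type I error rate at alpha = .05: {rate_05:.3f}")
print(f"Type I error rate at alpha = .01: {rate_01:.3f}")
```

Because both runs use the same simulated samples, the stricter level can only reject the same studies or fewer, which is the sense in which a lower significance level reduces Type I errors.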
P-VALUE

• The p-value is the probability of obtaining results at least as extreme as the observed results of a
statistical hypothesis test, assuming that the null hypothesis is correct.
• A smaller p-value means stronger evidence against the null hypothesis and in favor of the
alternative hypothesis.
• p > 0.05: the result is not statistically significant, so we fail to reject the null hypothesis. It is
not the probability that the null hypothesis is true.
• p ≤ 0.05: the result is statistically significant, so we reject the null hypothesis.
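
To make "at least as extreme as the observed results" concrete, here is a minimal sketch that computes an exact p-value for a coin-flip experiment. The data (60 heads in 100 flips) and the helper name `binomial_tail` are hypothetical and chosen only for illustration; the null hypothesis is that the coin is fair.

```python
from math import comb

def binomial_tail(k_min, k_max, n, p=0.5):
    """Probability of between k_min and k_max successes (inclusive) in n trials."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, k_max + 1))

n_flips, heads = 100, 60  # hypothetical observed data

# One-tailed p-value: 60 or more heads, assuming the coin is fair.
p_one = binomial_tail(heads, n_flips, n_flips)

# Two-tailed p-value: results at least as far from 50 in either direction
# (40 or fewer heads, or 60 or more heads).
p_two = p_one + binomial_tail(0, n_flips - heads, n_flips)

print(f"one-tailed p = {p_one:.4f}")
print(f"two-tailed p = {p_two:.4f}")
```

With these hypothetical numbers the one-tailed p-value falls below .05 while the two-tailed p-value does not, which shows why the choice between a one-tailed and a two-tailed alternative hypothesis matters.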
STANDARD ERROR AND EFFECT SIZE

• We calculate the standard error by dividing the standard deviation by the square root of the
sample size.
Standard error of the mean (SE(x)) = s / √n
where s = sample standard deviation and n = sample size.
• To calculate an effect size, we first convert the standard error back into a standard deviation by
multiplying the standard error by the square root of the sample size. Then we calculate the effect
size by dividing the difference between two group means by the standard deviation.
Effect size (d) = (X1 – X2) / s
where X1 and X2 are the two group means and s = SE(x) * √n.
• Random sampling error is the error caused by a particular sample not being representative of the
population of interest due to random variation.
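
The standard error and effect size calculations described above can be sketched in a few lines. The scores are the ten values used in the confidence interval example of this summary; the comparison mean of 60 is a hypothetical value introduced only so that there is a second mean to compare against.

```python
import math
import statistics

scores = [45, 55, 67, 45, 68, 79, 98, 87, 84, 82]
comparison_mean = 60  # hypothetical second mean, for illustration only

n = len(scores)
sample_mean = statistics.mean(scores)
sd = statistics.stdev(scores)        # sample standard deviation (n - 1 denominator)

# Standard error: standard deviation divided by the square root of the sample size.
se = sd / math.sqrt(n)

# Effect size: convert the standard error back into a standard deviation,
# then divide the difference between the means by it.
recovered_sd = se * math.sqrt(n)
d = (sample_mean - comparison_mean) / recovered_sd

print(f"standard error = {se:.3f}")
print(f"effect size d  = {d:.3f}")
```

Note that multiplying the standard error by the square root of n removes the influence of sample size, which is why effect size is described as a measure "without regard to the size of the sample".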
CONFIDENCE INTERVALS IN DEPTH

• A confidence interval has an associated confidence level that gives the probability with which an
estimated interval will contain the true value of the parameter.
• We always use the alpha level for the two-tailed test to find our t value, even if we had a one-
tailed alternative hypothesis when testing for statistical significance.
• Researchers usually want to be either 95% or 99% confident that the confidence interval contains
the population parameter. These confidence levels correspond to alpha levels of .05 and .01,
respectively. The formulas for calculating 95% and 99% confidence intervals are provided below:
CONFIDENCE INTERVALS

• CI95 = X ± (t95) * (SE(x))
• CI99 = X ± (t99) * (SE(x))
Where:
• CI95 = a 95% confidence interval
• CI99 = a 99% confidence interval
• X = sample mean
• SE(x) = standard error of the mean
• t95 = the t value for a two-tailed test, alpha level of .05 with a given degrees of freedom
• t99 = the t value for a two-tailed test, alpha level of .01 with a given degrees of freedom
EXAMPLE ON CONFIDENCE INTERVAL


Construct a 98% Confidence Interval based on the following data: 45, 55, 67, 45, 68, 79, 98, 87,
84, 82.

Step 1: Find the sample mean (X) and the sample standard deviation (s) for the data.

X: 71

s: 18.172

Set these numbers aside for a moment.

Step 2: Subtract 1 from your sample size to find the degrees of freedom (df). We have 10 numbers
listed, so our sample size is 10, so our df = 9. Set this number aside for a moment.

Step 3: Subtract the confidence level from 1, then divide by two. This gives the area in each tail
of the distribution (α/2), where α = 1 – .98 = .02 is the two-tailed alpha level.

(1 – .98) / 2 = .01

Step 4: Look up df (Step 2) and the tail area (Step 3) in the t-distribution table. For df = 9 and a
tail area of .01 (two-tailed α = .02), the table gives us 2.821.

Step 5: Divide your standard deviation (Step 1) by the square root of your sample size.

18.172 / √(10) = 5.75

Step 6: Multiply step 4 by step 5.

2.821 × 5.75 = 16.22075

Step 7: For the lower end of the range, subtract step 6 from the mean (Step 1).

71 – 16.22075 = 54.77925

Step 8: For the upper end of the range, add step 6 to the mean (Step 1). 

71 + 16.22075 = 87.22075

The 98% confidence interval is therefore approximately 54.78 to 87.22.
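
The eight steps above can be reproduced with a short script. This is a sketch, not the only way to do it: the t value of 2.821 is taken from the t table as in Step 4 rather than computed, and the variable names are illustrative. Because the script does not round the standard error to 5.75, its endpoints differ from the hand calculation in the second decimal place.

```python
import math
import statistics

data = [45, 55, 67, 45, 68, 79, 98, 87, 84, 82]
confidence_level = 0.98
t_value = 2.821  # from the t table: df = 9, tail area .01 (Step 4)

# Step 1: sample mean and sample standard deviation.
mean = statistics.mean(data)
sd = statistics.stdev(data)

# Step 2: degrees of freedom.
df = len(data) - 1

# Step 3: area in each tail.
tail_area = (1 - confidence_level) / 2

# Step 5: standard error of the mean.
se = sd / math.sqrt(len(data))

# Step 6: margin of error.
margin = t_value * se

# Steps 7 and 8: lower and upper ends of the interval.
lower = mean - margin
upper = mean + margin

print(f"mean = {mean}, sd = {sd:.3f}, df = {df}, tail area = {tail_area:.2f}")
print(f"98% CI: {lower:.2f} to {upper:.2f}")
```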
PRACTICAL EXAMPLE ON CONFIDENCE INTERVAL
• The U.S. Census Bureau routinely uses a confidence level of 90% in its surveys.
• "The number of people in poverty in the United States is 35,534,124 to 37,315,094"
means that (35,534,124 to 37,315,094) is the confidence interval.
• Suppose the Bureau repeated the survey 1,000 times and computed an interval each time. A
confidence level of 90% means that about 900 of those intervals would be expected to contain
the true number of people in poverty. The true number may be 36,000,000, a bit less or a bit
more; any value in the interval is plausible.
CONCLUSION

• Relying on information from a sample will always lead to some level of uncertainty.
• A confidence interval is a range of values that tries to quantify this uncertainty.
• For example, a 95% CI means that under repeated sampling 95% of CIs would contain the
true population parameter.
• A larger sample size gives a narrower CI, that is, a more precise estimate.
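
The repeated-sampling interpretation of a 95% CI can be illustrated with a simulation. The sketch below assumes a normally distributed population with a mean of 71 and a standard deviation of 18; these population values, the function name and the number of trials are hypothetical. The t value of 2.262 is the table value for a two-tailed alpha level of .05 with 9 degrees of freedom.

```python
import math
import random
import statistics

def coverage(trials=2000, n=10, true_mean=71.0, true_sd=18.0, t_value=2.262, seed=1):
    """Fraction of simulated 95% confidence intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Draw one random sample of size n from the hypothetical population.
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        mean = statistics.mean(sample)
        se = statistics.stdev(sample) / math.sqrt(n)
        # CI95 = X ± (t95) * (SE(x))
        if mean - t_value * se <= true_mean <= mean + t_value * se:
            hits += 1
    return hits / trials

share = coverage()
print(f"Share of intervals containing the true mean: {share:.3f}")
```

Each individual interval either contains the true mean or it does not; the 95% refers to the long-run share of intervals that do.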

THANK YOU FOR LISTENING.
