Session 2
Representation of distribution (examples)
A finite number of observations can be summarized and represented by a
frequency distribution.
Bar graph
Histogram
Source: IBM SPSS
Probability distribution
A sizeable/infinite population of observations is often summarized
and represented by a probability distribution, i.e., in terms of the
probability or relative frequency of occurrence of a variable’s
values.
Examples
Probability distribution
• The probability that a continuous variable (e.g., reaction time,
intelligence) takes exactly one particular value is zero
• The probability distribution of a continuous random variable can be
formally described by a probability density function (PDF), which
indicates the relative likelihood of the variable being close to a specified value
• The histogram of a sample of data from a population can be taken as
an approximation of the population PDF curve.
PDF curve
Probability distribution
The probability that a continuous random variable
is between two specified values is equal to the
area under the PDF curve over that interval.
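As a minimal sketch of the area rule above, the Python snippet below computes the probability that a variable falls between two values in two ways: from the cumulative distribution function, and by approximating the area under the PDF curve with the trapezoidal rule. The standard normal distribution and the interval from -1 to 1 are illustrative assumptions, not values taken from the slide.

```python
from statistics import NormalDist

# Assumption for illustration: X follows a standard normal distribution.
dist = NormalDist(mu=0.0, sigma=1.0)
a, b = -1.0, 1.0  # hypothetical interval

# Exact area via the cumulative distribution function.
area_cdf = dist.cdf(b) - dist.cdf(a)

# Approximate area via the trapezoidal rule applied to the PDF curve.
steps = 1000
width = (b - a) / steps
heights = [dist.pdf(a + i * width) for i in range(steps + 1)]
area_trapezoid = width * (sum(heights) - 0.5 * (heights[0] + heights[-1]))

print(f"P({a} < X < {b}) via CDF:       {area_cdf:.4f}")
print(f"P({a} < X < {b}) via trapezoid: {area_trapezoid:.4f}")
```

Both routes should agree closely, which illustrates that "probability over an interval" and "area under the PDF curve over that interval" are the same quantity.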
Normal distribution
• Many naturally occurring variables are close to being
normally distributed
• The probability density of a normal distribution is precisely
defined by a formula
• A bell-shaped probability density curve does not
necessarily represent a normal distribution.
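The formula mentioned above is not reproduced in the extracted slide text. As a hedged sketch, the snippet below codes the standard textbook density of a normal distribution with mean mu and standard deviation sigma, and compares it with Python's built-in `statistics.NormalDist`. The function name `normal_pdf` and the test points are illustrative choices.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and SD sigma at x."""
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coefficient * math.exp(exponent)


# Compare the hand-coded formula with the standard library at a few points.
for x in (-2.0, 0.0, 1.5):
    print(x, round(normal_pdf(x), 5), round(NormalDist().pdf(x), 5))
```

A curve can look bell-shaped without matching this formula, which is why bell shape alone does not establish normality.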
Standard normal distribution (Cumming & Calin-Jageman)
Standardized normal distribution
.6915 .8790
Example
What is the probability that a standardized random variable is greater
than 1.48 (i.e., the area B under the standard normal distribution curve)?
Area B = 0.0694
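The tail area in the example above can be checked with a short Python sketch. It uses the standard library's `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

z = 1.48

# Area to the right of z under the standard normal curve.
upper_tail = 1.0 - NormalDist().cdf(z)

print(f"P(Z > {z}) = {upper_tail:.4f}")
```

The result should agree with the 0.0694 shown on the slide to four decimal places.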
Random sampling
• Population: the entire collection of units/observations to
which a study’s conclusions are intended to apply
Sampling distribution
The distribution of all possible outcomes of a
sampling design
Example:
Expected proportion of random samples
(N = 5) from an adult population with
equal proportions of men and women
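The figure for this example is not reproduced in the extracted text. As a sketch, assuming the outcome of interest is the number of women in each random sample of 5 (an assumption, not confirmed by the slide), the expected proportions follow the binomial formula:

```python
from math import comb

n = 5          # sample size from the example
p_woman = 0.5  # equal proportions of men and women

# Expected proportion of samples containing k women, for k = 0..5.
proportions = {
    k: comb(n, k) * p_woman ** k * (1 - p_woman) ** (n - k)
    for k in range(n + 1)
}

for k, proportion in proportions.items():
    print(f"{k} women: {proportion:.5f}")
```

The six proportions sum to 1 and together form the sampling distribution of "number of women" for this sampling design.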
Sampling distribution
• If we obtain the values of a sample statistic (e.g., sample mean) from a large
number of random samples (of the same sample size) independently drawn
from a population, the distribution of the obtained values will be an
approximation of the statistic’s sampling distribution
• Conceptually, the sampling distribution is the distribution expected for that
statistic if we drew an infinite number of random samples (of the same sample
size) from the population and calculated the statistic on each sample (Howell).
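The first bullet above can be illustrated with a small simulation. This is a sketch under stated assumptions: the population is taken to be normal with a hypothetical mean of 100 and SD of 15, each sample has 25 observations, and 5000 samples stand in for "a large number".

```python
import random
import statistics

rng = random.Random(0)           # fixed seed so the run is repeatable
mu, sigma = 100.0, 15.0          # hypothetical population parameters
sample_size, n_samples = 25, 5000

# Draw many independent random samples and record each sample mean.
sample_means = [
    statistics.fmean([rng.gauss(mu, sigma) for _ in range(sample_size)])
    for _ in range(n_samples)
]

print("mean of sample means:", round(statistics.fmean(sample_means), 2))
print("SD of sample means:  ", round(statistics.stdev(sample_means), 2))
print("sigma / sqrt(n):     ", round(sigma / sample_size ** 0.5, 2))
```

The collected sample means approximate the sampling distribution of the mean: they centre on the population mean, and their spread is close to the population SD divided by the square root of the sample size.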
Normally distributed
(Rouaud)
Sampling distribution of the mean
• Central limit theorem (CLT)
(Howell)
(Warner)
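The slide's statement of the theorem is not reproduced in the extracted text. As a hedged illustration of the usual textbook statement (sample means become approximately normally distributed as sample size grows, even for a non-normal population), the sketch below draws sample means from a strongly skewed exponential population. The population, the sample sizes, and the helper `skewness` are illustrative assumptions.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable


def skewness(values):
    """Sample skewness: mean of cubed standardized deviations."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])


results = {}
for n in (2, 30):
    # Exponential population with rate 1: mean 1, SD 1, right-skewed.
    means = [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(4000)
    ]
    results[n] = (statistics.fmean(means), statistics.stdev(means), skewness(means))
    print(f"n={n}: mean={results[n][0]:.3f}, SD={results[n][1]:.3f}, "
          f"skewness={results[n][2]:.3f}")
```

With the larger sample size, the skewness of the sample means should be much closer to zero and their SD close to 1 divided by the square root of n.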
Hypothesis testing (example)
A decision-making process in which hypotheses are evaluated statistically.
Example:
• The academic self-efficacy (hereafter referred to as self-efficacy) of all university students
in a region (the population) as measured by a well-established instrument: mean
score = 38.76, SD = 6.31 (a higher score for higher self-efficacy)
• The mean self-efficacy score of a sample of 52 students who went through a
mindfulness-based intervention was 40.58. The SD of the mindfulness-trained
population’s scores is assumed to be 6.31 (same as the main population’s SD)
• Both populations’ scores are assumed to be normally distributed.
Research question: Do self-efficacy scores differ between the intervention
participants and non-participants (the main population)?
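Research hypothesis (H1): The intervention makes a difference in self-efficacy
Null hypothesis (Ho): The intervention participant population has a mean
self-efficacy score of 38.76
(Figure: score distributions of the main (non-participant) population and the
intervention participant population)
• Under the Ho and the aforementioned assumptions, the expected mean of the
sampling distribution = 38.76; standard error (SE) = 6.31/SQRT(52) = 0.875
• Observed z = (40.58 - 38.76)/0.875 = 2.08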
• Since the observed z-value of 2.08 lies beyond the range of ±1.96, the
probability of obtaining a z-score at least this extreme when the Ho is true is below 5%
• If this probability criterion (< 5%) was preset for rejecting the Ho, the Ho
can be rejected, i.e., the statement that the intervention participant
population has a mean self-efficacy score of 38.76 can be rejected
• Interpretation: The mean self-efficacy score of students after the
intervention is significantly greater than the main (non-participant)
population’s mean.
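The arithmetic of this example can be retraced with a short Python sketch. All numbers come from the example; the variable names and the use of `statistics.NormalDist` to obtain the two-tailed 5% critical value are illustrative choices.

```python
import math
from statistics import NormalDist

pop_mean, pop_sd = 38.76, 6.31   # main population (from the example)
sample_mean, n = 40.58, 52       # intervention sample (from the example)
alpha = 0.05                     # preset two-tailed criterion

# Standard error of the mean and the observed z statistic.
se = pop_sd / math.sqrt(n)
z = (sample_mean - pop_mean) / se

# Critical value that cuts off alpha/2 in each tail of the standard normal.
critical = NormalDist().inv_cdf(1 - alpha / 2)

print(f"SE = {se:.3f}, z = {z:.2f}, critical value = ±{critical:.2f}")
print("Reject Ho" if abs(z) > critical else "Do not reject Ho")
```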
Statistical significance
• When a study’s results are statistically significant, it means that if the null
hypothesis is true, it is unlikely that the sample results would have turned out
as they did
• The criterion for statistical significance or the alpha level (e.g., 5%) should be
preset
• Critical value of the test statistic (e.g., z or t): the value that separates the region
of rejection and the region of non-significance in the sampling distribution
• p value: the probability of obtaining the test statistic value observed or a more
extreme value if the null hypothesis is true.
(Figure: probability density curve with the critical value marked; source: Hatcher)
p value (referring to the example)
• An example of computer output for the mindfulness intervention case (the R
command is not in the scope of this course)
• This p value corresponds to the probability of observing the test statistic value
(2.08) or a more extreme value if the null hypothesis is true.
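The computer output referred to above is not reproduced in the extracted text. As a hedged Python sketch of the same idea, the p value for the observed z of 2.08 can be computed from the standard normal distribution. A two-tailed test is assumed here, consistent with the ±1.96 criterion used in the example.

```python
from statistics import NormalDist

z_observed = 2.08  # test statistic from the example

# Two-tailed p value: probability of a z at least this extreme in either
# direction if the null hypothesis is true.
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z_observed)))

print(f"two-tailed p = {p_two_tailed:.4f}")
```

The resulting p value falls below 0.05, in line with the decision to reject the Ho in the example.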
Statistical significance
• If the p value of the sample’s test statistic is no more than alpha (α), i.e., the test
statistic is in the region of rejection:
the null hypothesis is rejected
the results are statistically significant; the research hypothesis is supported
• If the p value of the sample’s test statistic is more than alpha (α), i.e., the test
statistic is in the region of non-significance:
the null hypothesis is not rejected (but not “accepted” either)
the research hypothesis is not supported by the results.
Reporting guidelines
Wilkinson, L. (1999). Statistical Methods in Psychology Journals: Guidelines
and Explanations. The American Psychologist, 54(8), 594–604:
• It is hard to imagine a situation in which a dichotomous accept-reject
decision is better than reporting an actual p value or, better still, a
confidence interval
• Never use the unfortunate expression “accept the null hypothesis”
• Always provide some effect-size estimate when reporting a p value.