ED710 – 301
CHAPTERS 3&4
SPRING 2023
ALICIA V. FISCHER
AGENDA
Check-In
Power through or breaks?
Chapter 3
Chapter 4
Homework
CHECK-IN: HOW ARE WE FEELING TODAY?
CHAPTER 3: CENTRAL TENDENCY
Central Tendency: Statistical measure to determine a single score that defines
the center of a distribution. The goal of central tendency is to find the single
score most typical or most representative of the entire group.
Mean
Median
Mode
MEAN (AVERAGE)
LEARNING OBJECTIVES:
Define the mean and calculate both the population and sample means.
Explain alternative definitions of the mean as the amount each individual
receives when the total is divided equally and as a balancing point.
Find n, ΣX, and M using scores in a frequency distribution table.
Describe how the mean is affected by each of the following: changing a score,
adding/removing a score, adding/subtracting a constant from each score, and
multiplying/dividing each score by a constant.
MEAN DEFINITION: sum of scores divided by the number of scores
Symbol for population mean is µ (mu)
Symbol for sample mean is x̄ (x-bar); published research uses M
o Formula for population: µ = ΣX / N
o Formula for sample: M = ΣX / n
MORE ON THE MEAN
Alternative Definitions of the Mean
Dividing the total equally
Mean as a balance point: the total distance below the mean is the same as the
total distance above the mean
Weighted Mean
Combine two sets of scores and find the overall average
To calculate you'll need: the overall sum of scores for the combined group, and
the total number of scores in the combined group
Formula for overall mean: M = ΣX (overall sum for the combined group) / n (total
number in combined group)
OR: M = (ΣX1 + ΣX2) / (n1 + n2)
EXAMPLE:
Two sections of ED710 take the same midterm. One section earns the following
scores: 62, 64, 67, 68, 75, 76. The second section earns the following scores:
80, 84, 86, 90, 93, 98.
What is the average score for section 1?
What is the average score for section 2?
What is the weighted (overall) average?
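The ED710 midterm example can be checked with a short Python sketch. It uses the scores given above; the variable and function names are illustrative choices, not part of the course materials.

```python
# Midterm scores for the two ED710 sections from the example.
section_1 = [62, 64, 67, 68, 75, 76]
section_2 = [80, 84, 86, 90, 93, 98]


def mean(scores):
    """Sum of the scores divided by the number of scores (M = ΣX / n)."""
    return sum(scores) / len(scores)


mean_1 = mean(section_1)
mean_2 = mean(section_2)

# Weighted (overall) mean: combine the sums and the counts, then divide.
combined_sum = sum(section_1) + sum(section_2)
combined_n = len(section_1) + len(section_2)
weighted_mean = combined_sum / combined_n

print(round(mean_1, 2), round(mean_2, 2), round(weighted_mean, 2))
```

Because both sections have the same number of scores (n = 6), the weighted mean here lands exactly halfway between the two section means. With unequal group sizes it would be pulled toward the larger group.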
COMPUTING THE MEAN FROM A
FREQUENCY DISTRIBUTION TABLE
Remember to use all columns
of information to calculate
your mean.
What is the average for this
sample?
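As a minimal sketch of "using all columns," the Python below computes n, ΣX, and M from a frequency distribution table. The table values are hypothetical (the table from the slide is not reproduced here), and the names are illustrative.

```python
# Hypothetical frequency distribution table (NOT the one on the slide).
# Each pair is (score X, frequency f).
freq_table = [(10, 1), (9, 2), (8, 4), (7, 0), (6, 1)]

# n is the sum of the frequency column, not the number of rows.
n = sum(f for _, f in freq_table)

# ΣX must count each score as many times as it occurs: Σ(f * X).
sum_x = sum(x * f for x, f in freq_table)

M = sum_x / n
print(n, sum_x, M)
```

A common mistake is to add the X column alone and divide by the number of rows; that ignores the frequencies.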
CHARACTERISTICS OF THE MEAN
Every score adds to the total ΣX AND the total
number of scores n
Changing a score
Changing a value will change the mean
Introducing/Removing a score
Adding a new score or removing a score will
usually change the mean
Exception is when the new score (or removed
score) is exactly equal to the mean
Remember the balance point
Adding a score has the same effect whether
the set of scores is a population or sample
CONSTANTS AND THE MEAN
Adding/Subtracting a Constant
If a constant value is added to every score in a distribution, the same
constant will be added to the mean
If you subtract a constant from every score, the same constant will be
subtracted from the mean
Multiplying/Dividing by a Constant
If every score is multiplied/divided by a constant, the mean will change in the
same way.
Common for changing the unit of measurement
o E.g., seconds to minutes, inches to feet
o Although the numerical values for individual scores have changed, the
actual measurements have not changed.
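A quick Python check of both rules, using hypothetical times measured in seconds (the data and names are made up for illustration):

```python
def mean(scores):
    """M = ΣX / n."""
    return sum(scores) / len(scores)


# Hypothetical task times in seconds.
seconds = [120, 180, 240, 300]
original_mean = mean(seconds)

# Adding a constant (30 seconds) to every score adds 30 to the mean.
shifted_mean = mean([x + 30 for x in seconds])

# Dividing every score by 60 (seconds -> minutes) divides the mean by 60.
minutes_mean = mean([x / 60 for x in seconds])

print(original_mean, shifted_mean, minutes_mean)
```

The minutes version describes the same measurements; only the unit has changed.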
LEARNING RECAP P. 78
What is the mean for the following sample: 1, 2, 5, 4?
12
6
4
3
One sample has n=8 scores and M=2. A second sample has n=4 scores and M=8. If the two samples are
combined, what is the weighted mean?
3
4
5
6
What is the mean for the population of scores shown in the frequency distribution table?
1.5
3.0
2.9
5.8
After 5 points are added to every score in a distribution, the mean is calculated and found to be µ = 30. What
was the value of the mean for the original distribution?
25
30
35
Cannot be determined from the information given
MEDIAN
LEARNING OBJECTIVES
Define and calculate the median, and find the precise median for a continuous
variable
MEDIAN: The median is the midpoint of a list of scores ordered from smallest to
largest. It is the point on the measurement scale below which 50% of the scores
in the distribution are located.
Finding the Median for Most Distributions
Scores are divided into two equal-sized groups; the median is the midpoint
between them.
NOT the midpoint between highest and lowest scores.
Simple when distribution has odd number of scores: the median is the middle
score
o 3, 5, 8, 10, 11
When the distribution has an even number of scores, you take the average of the
middle two scores:
o 1, 1, 4, 5, 7, 8
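The odd and even cases can be sketched in Python using the two lists of scores above. The function name is an illustrative choice.

```python
def median(scores):
    """Middle score for odd n; average of the two middle scores for even n."""
    ordered = sorted(scores)          # the scores must be in order first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_median = median([3, 5, 8, 10, 11])
even_median = median([1, 1, 4, 5, 7, 8])
print(odd_median, even_median)
```

Note that the even-n result (4.5) is not one of the scores and is not the midpoint between the lowest and highest scores.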
FINDING THE PRECISE MEDIAN FOR A CONTINUOUS
VARIABLE
Each score of a continuous variable covers an interval bounded by real limits
on a histogram, so a score of 4 ranges from 3.5 to 4.5; the median corresponds
to a point within that interval
To find the precise median, calculate the fraction of boxes needed within an
interval:
Fraction = number needed to reach 50% / number in the interval
Precise median = lower real limit of the interval + fraction x interval width
Finding the precise median is only
applicable to continuous variables, not
discrete variables.
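The fraction rule can be sketched in Python as follows. This is a minimal illustration that assumes whole-number scores whose real limits lie 0.5 below and above each score; the data set is hypothetical and the function name is made up.

```python
from collections import Counter


def precise_median(scores, width=1.0):
    """Interpolated median for a continuous variable measured in whole units.

    Assumes each score X covers the interval X - width/2 to X + width/2.
    """
    counts = Counter(scores)
    half = len(scores) / 2            # number of scores needed to reach 50%
    below = 0                         # scores counted so far
    for value in sorted(counts):
        in_interval = counts[value]
        if below + in_interval >= half:
            fraction = (half - below) / in_interval
            lower_real_limit = value - width / 2
            return lower_real_limit + fraction * width
        below += in_interval


# Hypothetical data: 3 scores fall below 3.5, and 4 scores sit in 3.5 to 4.5.
data = [1, 2, 3, 4, 4, 4, 4, 6]
result = precise_median(data)
print(result)
```

Here 4 of the 8 scores are needed to reach 50%. Three lie below 3.5, so 1 of the 4 scores in the interval is needed: fraction = 1/4, and the precise median is 3.5 + 0.25 = 3.75.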
LEARNING RECAP P. 82
For the sample in the frequency
distribution table, what is the median?
3
3.5
4
7.5
What is the median for the following set of
scores: 3, 4, 6, 8, 9, 10, 11?
7
7.5
8
8.5
What is the median for the following set of
scores: 8, 10, 11, 12, 14, 15
11
11.5
12
11.67
MODE
LEARNING OBJECTIVE
Define and identify the mode(s) for a distribution, including the major and
minor modes for a bimodal distribution
MODE: score or category that has the greatest frequency
What is the Mode?
No symbols or notation to denote mode in a sample or population
Can be used to determine the typical or most frequent value for any scale of
measurement
o Including nominal scales, where you cannot obtain a sum of X and you cannot
list the scores in order
Only measure of central tendency that always corresponds to an actual score in
the distribution (identified, not calculated)
Greatest frequency will be shown in the tallest part of a graph
Distribution with two modes is called “bimodal”
More than two is called “multimodal”
Mode can be used more casually, can refer to scores with relatively high
frequencies
When two modes have unequal frequencies, researchers can differentiate by
calling the taller peak the major mode and the shorter peak the minor mode
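A short Python sketch of these ideas, using the standard library's `statistics.multimode` and `collections.Counter`. All three data sets are hypothetical.

```python
from collections import Counter
from statistics import multimode

# Nominal data: no sum or ordering is possible, but a mode still exists.
favorite_color = ["blue", "red", "blue", "green", "blue", "red"]
color_mode = multimode(favorite_color)

# Bimodal distribution: two scores tie for the greatest frequency.
bimodal = [2, 2, 2, 5, 8, 8, 8]
both_modes = multimode(bimodal)

# Unequal peaks: the taller peak is the major mode, the shorter the minor mode.
two_peaks = [2, 2, 2, 2, 5, 6, 8, 8, 8]
(major_mode, major_f), (minor_mode, minor_f) = Counter(two_peaks).most_common(2)

print(color_mode, both_modes, major_mode, minor_mode)
```

Strictly speaking, only the score 2 has the greatest frequency in the last data set; calling 8 a minor mode is the more casual usage described above.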
LEARNING RECAP
During the month of October, an instructor recorded the number of absences for
each student in a class of n=20 and obtained the following distribution. What are
the values for the mean, median, and mode for this distribution?
Mean = 2.35, median = 2.5, mode = 3
Mean = 2.5, median = 3, mode = 3
Mean = 2.35, median = 3, mode = 7
Mean = 2.5, median = 2.5, mode = 7
What is the mode for the following set of n=8 scores? 0, 1, 1, 2, 2, 2, 2, 3
2
2.5
1.625
13
For the sample shown in the frequency distribution table, what is the mode?
3
3.5
4
5
SELECTING A MEASURE OF CENTRAL
TENDENCY
LEARNING OBJECTIVE:
Explain when each of the three measures of central tendency should be
used, identify the advantages and disadvantages of each, and describe
how each is presented in a report of research results
WHEN TO USE THE
MEDIAN
Extreme scores or skewed distribution
Undetermined Values
It is impossible to compute an average
when we have an undetermined score
Open-ended distributions
When there is no upper or lower limit
for one of the categories
E.g., 5 or more, 3 or less
Ordinal Scale
NEVER compute the mean for an
ordinal scale
You can determine direction but not
distance when using ordinal data
WHEN TO USE THE MODE
Nominal scales
Nominal scales do not measure quantity, therefore it is impossible to
compute mean or median for nominal data
E.g., favorite color, restaurant, college attended
Discrete variables
Whole, indivisible categories
It is possible to calculate means, but may not be accurate or situated in
reality
E.g., number of children in a family cannot be 2.5
Describing shape
Often included as a supplementary measure because it gives us a sense of
the distribution without a visual
PRESENTING MEANS AND
MEDIANS IN GRAPHS
Modes are rarely shown in a graph
When considering a graph type,
keep the following in mind:
Height of graph should be 2/3 to 3/4 of
its length
Start numbering both x and y axes
with 0 where the two axes intersect
When 0 is part of the data, it is
common to move the 0 point away from
the intersection
LEARNING RECAP P. 91
A teacher gave a reading test to a class of 5th graders and computed the mean, median, and mode for the test scores. Which of
the following statements cannot be an accurate description of the scores?
The majority of students had scores above the mean
The majority of students had scores above the median
The majority of students had scores above the mode
All of the other options are false statements
One item on a questionnaire asks, “How many siblings did you have when you were a child?” A researcher computes the mean,
median, and mode for a set of n=50 responses. Which of the following statements accurately describes the measures of central
tendency?
Because the scores are whole numbers, the mean will be a whole number
Because the scores are whole numbers, the median will be a whole number
Because the scores are whole numbers, the mode will be a whole number
All of the options are correct
The value of one score in a distribution is changed from x=20 to x=30. Which measure(s) of central tendency is/are certain to be
changed?
Mean
Median
Mean and median
Mode
CENTRAL TENDENCY AND THE SHAPE OF
THE DISTRIBUTION
LEARNING OBJECTIVE
Explain how the three measures of central tendency are related to each other for
symmetrical and skewed distributions
Symmetrical Distributions
Right-hand side is mirror image of left
Median is exactly at the center
Mean is at the center
If a distribution is roughly symmetrical but not
perfect, the mean and median will be close
together in the center of a distribution
If there is only one mode, it will also be at the
center
Bimodal or multimodal distributions will have
mean and median at the center, and mode will be
at each peak
Rectangular distribution will have no mode
because all x values occur with same frequency;
mean and median will remain in center
Skewed Distributions
Mean, median, and mode will likely be in different
areas
Negatively skewed (b)
Positively skewed (a)
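To see how skew separates the three measures, here is a small Python sketch with hypothetical data. The positively skewed set has a few large scores in the right tail; the negatively skewed set is its mirror image.

```python
from statistics import mean, median, mode

# Hypothetical positively skewed scores (tail stretches toward high values).
positive_skew = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]

# Mirror image of the same scores (tail stretches toward low values).
negative_skew = [16 - x for x in positive_skew]

pos = (mode(positive_skew), median(positive_skew), mean(positive_skew))
neg = (mode(negative_skew), median(negative_skew), mean(negative_skew))

print("positive skew: mode, median, mean =", pos)
print("negative skew: mode, median, mean =", neg)
```

In the positively skewed set the order is mode < median < mean, because the extreme high scores pull the mean toward the tail. The order reverses for the negatively skewed set. This is the typical pattern, not a guarantee for every skewed distribution.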
LEARNING RECAP P.94
Which of the following is true for a symmetrical distribution?
Mean, median, and mode are all equal
Mean = median
Mean = mode
Median = mode
For a negatively skewed distribution with a mode of x=25 and a median of 20, the mean is probably
Greater than 25
Less than 20
Between 20 and 25
Cannot be determined from the information given
A distribution is positively skewed. Which is the most probable order for the three measures of central
tendency?
Mean = 40, median = 50, mode = 60
Mean = 60, median =50, mode=40
Mean =40, median = 60, mode=50
Mean=50, median=50, mode=50
CHAPTER 4
VARIABILITY
INTRODUCTION TO VARIABILITY
LEARNING OBJECTIVES
Define variability and explain its use and importance as a statistical measure
Define and calculate the range as a simple measure of variability and explain
its limitations
VARIABILITY: Provides a quantitative measure of the differences between scores
in a distribution and describes the degree to which the scores are spread out
or clustered together
Variability measures how well an individual score represents the entire
distribution. Variability provides information on how much error to expect if
you are using a sample to represent the population
Three different measures of variability
o Range
o Standard deviation
o Variance
RANGE
RANGE: Distance covered by the scores in a distribution, from the smallest
score to the largest score
Formula: range = Xmax - Xmin
This definition works well for variables with defined upper and lower boundaries
When scores are a measurement of a continuous variable, the range can be defined as
the difference between the upper real limit (URL) for the largest score and the
lower real limit (LRL) for the smallest score
When scores are whole numbers, the range can also be defined as the number of
measurement categories (think discrete variables)
Xmax – Xmin + 1
Limitations of range
Determined by using two most extreme scores in a distribution, does not
give us a semblance of shape of the distribution
Does not consider all scores in a distribution, therefore does not give an
accurate description of the variability for the entire population
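The three definitions of the range can be compared in a few lines of Python. The scores are a small illustrative set of whole numbers, and the real limits are assumed to lie 0.5 beyond each score.

```python
scores = [1, 9, 5, 8, 7]  # illustrative whole-number scores

x_max, x_min = max(scores), min(scores)

# Definition 1: range = Xmax - Xmin
simple_range = x_max - x_min

# Definition 2 (continuous variable): URL for Xmax minus LRL for Xmin,
# assuming real limits 0.5 above and below each whole-number score.
real_limit_range = (x_max + 0.5) - (x_min - 0.5)

# Definition 3 (whole numbers): number of measurement categories.
category_range = x_max - x_min + 1

print(simple_range, real_limit_range, category_range)
```

All three values depend only on the two extreme scores: changing 5, 8, or 7 to any other value between 1 and 9 leaves the range untouched, which is the limitation described above.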
LEARNING RECAP P.103
1. Which of the following sets of scores has the greatest variability?
a. 2, 3, 7, 12
b. 13, 15,16,17
c. 24, 25, 26, 27
d. 42, 44, 45, 46
2. What is the range for the following set of scores: 3, 7, 9, 10, 12?
a. 3 points
b. 4 or 5 points
c. 9 or 10 points
d. 12 points
3. How many scores in the distribution are used to compute the range?
a. Only 1
b. 2
c. 50% of them
d. All of the scores
DEFINING STANDARD DEVIATION AND
VARIANCE
Learning Objectives
Define variance and standard deviation and describe what is measured by
each
Calculate variance and standard deviation for a simple set of scores
Estimate the standard deviation for a set of scores based on a visual
examination of a frequency distribution graph of the distribution.
CALCULATING THE STANDARD DEVIATION
DEVIATION DEFINITION: Distance from the mean
Formula: deviation score = X - µ
If µ = 50, and your X =53, your deviation score is 3, or 3 points from the mean
You can have a positive or negative deviation score. The sign (+ or -)
tells us whether the score is above or below the mean; the number tells us how
many points away from the mean the score is
The next step in determining the standard deviation is to figure out the mean
of all the distances from the mean in a set of scores
EXAMPLE
Because the sum of the deviations is always zero,
the mean of the deviations is also zero
The average of the deviation scores will not work
as a measure of variability because it is always zero.
The solution is to get rid of the positive and
negative signs by squaring each distance from the mean
Using the squared values, you can compute the
mean squared deviation
This results in a squared distance, which is not easy
to interpret as a descriptive measure
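The reasoning above can be traced step by step in Python. The four scores are hypothetical and the variable names are illustrative.

```python
scores = [8, 1, 3, 0]            # hypothetical population of N = 4 scores
N = len(scores)
mu = sum(scores) / N             # population mean

# Step 1: deviation score = X - mu (the sign shows above/below the mean).
deviations = [x - mu for x in scores]

# The deviations always sum to zero, so their mean is useless as a measure.
sum_of_deviations = sum(deviations)

# Step 2: squaring removes the signs.
squared_deviations = [d ** 2 for d in deviations]

# Step 3: mean of the squared deviations.
mean_squared_deviation = sum(squared_deviations) / N

print(deviations, sum_of_deviations, mean_squared_deviation)
```

The result (9.5) is in squared units, which is why one more step is needed to get back to the original scale.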
VARIANCE
VARIANCE: Mean of the
squared deviations.
Variance is the average
squared distance from the
mean
Final Step in determining
the standard deviation: take
the square root of your
variance
Standard deviation = √variance
EXAMPLE
EXAMPLE: Calculate the
variance and standard
deviation for the following
population of N=5 scores
1, 9, 5, 8, 7
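One way to work this example is sketched below in Python. It follows the steps from the slides by hand and then cross-checks against the standard library's `statistics.pvariance` and `statistics.pstdev`, which divide by N.

```python
import math
from statistics import pstdev, pvariance

scores = [1, 9, 5, 8, 7]         # population of N = 5 scores from the example
N = len(scores)
mu = sum(scores) / N

# Sum of the squared deviations from the mean.
SS = sum((x - mu) ** 2 for x in scores)

variance = SS / N                # mean of the squared deviations
std_dev = math.sqrt(variance)    # square root of the variance

print(mu, SS, variance, round(std_dev, 2))

# Cross-check with the standard library (population versions).
assert math.isclose(variance, pvariance(scores))
assert math.isclose(std_dev, pstdev(scores))
```

The standard deviation of about 2.83 is back in the original units, so it can be read as the typical distance of a score from the mean of 6.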
LEARNING RECAP P. 107
1. Standard deviation is probably the most commonly used value to describe and measure
variability. Which of the following accurately describes the concept of standard
deviation?
a. The average distance between one score and another
b. The average distance between a score and the mean
c. The total distance from the smallest score to the largest score
d. One half of the total distance from the smallest score to the largest score.
2. What is the variance for the following set of scores: 2, 2, 2, 2, 2
a. 0
b. 2
c. 4
d. 5
3. Which of the following values is the most reasonable estimate of the standard deviation
for the set of scores in the following distribution?
a. 0
b. 1
c. 3
d. 5
MEASURING THE VARIANCE AND
STANDARD DEVIATION FOR A POPULATION
Learning Objectives
Calculate the Sum of Squares (SS) for a population using either
the definitional or computational formula and describe the
circumstances in which each formula is appropriate
Calculate the variance and the standard deviation for a
population
SUM OF SQUARED (SS) DEVIATIONS
Variance is defined as the mean of the squared deviations
Variance = mean of squared deviations = sum of squared deviations / number of scores
The value in the numerator (sum of squared deviations) is the basic component of variability
SS: sum of the squared deviations
Definitional formula: SS = Σ(X - µ)²
Find each deviation score (X - µ)
Square each deviation score
Add the squared deviations
The definitional formula can be awkward to use when the mean is a decimal or
fraction, because every deviation then involves decimals or fractions
In that case, use the computational formula:
SS = ΣX² - (ΣX)² / N
Square each score and add the squared values (ΣX²)
Find the sum of scores, square this total, and divide by N
Finally, subtract the second part from the first part
Both formulas will produce the same value for SS
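A minimal Python sketch comparing the two SS formulas on a hypothetical population of N = 4 scores (function names are illustrative):

```python
def ss_definitional(scores):
    """SS = Σ(X - µ)²: find each deviation, square it, add them up."""
    mu = sum(scores) / len(scores)
    return sum((x - mu) ** 2 for x in scores)


def ss_computational(scores):
    """SS = ΣX² - (ΣX)² / N: works from the scores, no deviations needed."""
    sum_x = sum(scores)
    sum_x_squared = sum(x ** 2 for x in scores)
    return sum_x_squared - sum_x ** 2 / len(scores)


population = [1, 0, 6, 1]        # hypothetical scores: ΣX = 8, µ = 2
ss_def = ss_definitional(population)
ss_comp = ss_computational(population)
print(ss_def, ss_comp)
```

Both routes give SS = 22. When the mean is not a whole number, the computational version avoids squaring a column of decimal deviations by hand.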
FINAL FORMULAS AND NOTATION
Variance = SS / N
Standard deviation = √variance = √(SS / N)
Population variance = σ² = SS / N
Population standard deviation = σ (sigma) = √σ²
POPULATION VARIANCE: represented by sigma squared (σ²) and equals the mean
squared distance from the mean. Population variance is obtained by dividing
the sum of squares by N
POPULATION STANDARD DEVIATION: represented by sigma (σ) and equals the
square root of the population variance
LEARNING RECAP P. 111
1. What is the value of SS for the following population of N=4 scores: 1, 4, 6, 1
a. 0
b. 18
c. 54
d. 12² = 144
2. Each of the following is the sum of scores for a population of N=4 scores. For which population would
the definitional formula be a better choice than the computational formula for calculating SS?
a. 
b. 
c. 
d. 
3. What is the standard deviation for the following population of scores: 1, 3, 7, 4, 5
a. 20
b. 5
c. 4
d. 2
MEASURING THE STANDARD DEVIATION
AND VARIANCE FOR A SAMPLE
Learning Objectives
Explain why it is necessary to make a correction to the formulas for variance
and standard deviation when computing these statistics for a sample
Calculate SS for a sample using either the definitional or computational
formula and describe the circumstances in which each formula is appropriate
Calculate the variance and standard deviation for a sample
The Problem with Sample Variability
Samples tend to be less variable than the populations they come from
Sample variability therefore tends to give a biased estimate of population
variability
We have to make adjustments to the formula to get a more accurate
representation for the population
FORMULAS FOR SAMPLE VARIANCE AND
STANDARD DEVIATION
Calculations follow the same steps for the population
Find SS
Calculate variance
Find square root of the variance
Change in notation: use M instead of µ and n instead of N
Definitional formula: SS = Σ(X - M)²
Computational formula: SS = ΣX² - (ΣX)² / n
To correct bias in the sample, it is necessary to make adjustments to the formula and notation
Sample variance = s² = SS / (n - 1)
Sample standard deviation = s = √s² = √(SS / (n - 1))
SAMPLE VARIANCE: represented by s² and equals the mean squared distance from the mean. Sample
variance is obtained by dividing the sum of squares by n - 1.
SAMPLE STANDARD DEVIATION: represented by s and is equal to the square root of the sample variance.
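The sample formulas can be sketched in Python as follows. The n = 8 scores are hypothetical; the hand calculation is cross-checked against `statistics.variance` and `statistics.stdev`, which divide by n - 1.

```python
import math
from statistics import stdev, variance

sample = [4, 6, 5, 11, 7, 9, 7, 3]   # hypothetical sample of n = 8 scores
n = len(sample)
M = sum(sample) / n                  # sample mean

# Computational formula: SS = ΣX² - (ΣX)² / n
SS = sum(x ** 2 for x in sample) - sum(sample) ** 2 / n

s_squared = SS / (n - 1)             # sample variance: divide by n - 1, not n
s = math.sqrt(s_squared)             # sample standard deviation

print(M, SS, round(s_squared, 2), round(s, 2))

# Cross-check with the standard library (sample versions).
assert math.isclose(s_squared, variance(sample))
assert math.isclose(s, stdev(sample))
```

Dividing the same SS by n instead would give 48 / 8 = 6, smaller than 48 / 7 ≈ 6.86. The n - 1 correction always makes the sample variance a little larger.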
SAMPLE VARIABILITY AND DEGREES OF
FREEDOM
With a population, you find the deviation for each score by measuring the distance from the
population mean; with a sample, the value of the population mean is unknown and you must measure
distances from the sample mean
Because the sample mean must be computed before the deviations, it places a restriction on the
variability of the sample: once M is known, n - 1 of the scores are free to vary but the final
score is determined
DEGREES OF FREEDOM (df): defined as n - 1. The degrees of freedom determine the number
of scores in the sample that are free to vary.
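The restriction can be illustrated with a tiny Python sketch. The sample size, mean, and free scores below are hypothetical.

```python
n = 3          # hypothetical sample size
M = 5          # hypothetical sample mean, already known

# With M fixed, the scores must add up to n * M.
required_total = n * M

# The first n - 1 scores can be anything...
free_scores = [2, 9]

# ...but the last score has no freedom: it must make the total come out right.
last_score = required_total - sum(free_scores)

sample = free_scores + [last_score]
df = n - 1

print(sample, sum(sample) / n, df)
```

Whatever two values are chosen for the free scores, the third is forced, so only n - 1 = 2 scores are free to vary.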
LEARNING RECAP P. 116
1. If sample variance is computed by dividing by n, instead of n-1, how will the obtained values be related
to the corresponding population variance?
a. They will consistently underestimate the population variance
b. They will consistently overestimate the population variance
c. The average value will be exactly equal to the population variance
d. The average value will be close to, but not exactly equal to, the population variance
2. What is the value of SS for the following sample: 1, 4, 0, 1?
a. 36
b. 18
c. 9
d. 3
3. What is the variance for the following sample of n=4 scores: 2, 5, 1, 2?
a. 34/3 = 11.33
b. 9/4 = 2.25
c. 9/3 = 3
d. 
SAMPLE VARIANCE AS AN UNBIASED
STATISTIC
LEARNING OBJECTIVES
Define biased and unbiased statistics
Explain why the sample mean and sample variance (n-1) are unbiased statistics
BIASED AND UNBIASED STATISTICS
UNBIASED SAMPLE STATISTIC: average value of the statistic is equal to the
population parameter
BIASED SAMPLE STATISTIC: average value of the statistic either underestimates
or overestimates the corresponding population parameter
If the sample variance is computed by dividing by n, the resulting values will
not produce an accurate estimate of the population variance
o On average, sample variances underestimate the population variance
Sample mean and sample variance (using n-1) are unbiased statistics
o Helpful in inferential statistics
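One way to see the bias directly is to list every possible sample from a tiny population and average the statistics. The Python sketch below does this for a hypothetical population of three equally likely scores, taking all ordered samples of n = 2 with replacement; the population values and names are illustrative.

```python
from itertools import product

population = [0, 3, 9]               # hypothetical population
N = len(population)
mu = sum(population) / N
sigma_squared = sum((x - mu) ** 2 for x in population) / N


def sample_stats(sample):
    """Return (M, variance dividing by n, variance dividing by n - 1)."""
    n = len(sample)
    M = sum(sample) / n
    SS = sum((x - M) ** 2 for x in sample)
    return M, SS / n, SS / (n - 1)


# Every possible ordered sample of n = 2 scores, drawn with replacement.
all_samples = list(product(population, repeat=2))
stats = [sample_stats(s) for s in all_samples]

avg_M = sum(m for m, _, _ in stats) / len(stats)
avg_biased = sum(b for _, b, _ in stats) / len(stats)
avg_unbiased = sum(u for _, _, u in stats) / len(stats)

print("population:", mu, sigma_squared)
print("averages over all samples:", avg_M, avg_biased, avg_unbiased)
```

Across all nine samples, the average sample mean equals µ and the average of the n - 1 variances equals σ², while the divide-by-n version averages only half of σ² in this case.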
MORE ON VARIANCE AND STANDARD
DEVIATION
LEARNING OBJECTIVES
Describe how the mean and SD are represented in a frequency
distribution graph of a population or sample distribution.
Explain how the mean and SD are affected when a constant is added to
every score or when every score is multiplied by a constant.
Describe how the mean and SD are reported in research journals.
Describe the appearance of a distribution based on the values for the
mean and SD.
Explain how patterns in sample data are affected by sample variance.
PRESENTING THE MEAN AND SD IN A FREQUENCY
DISTRIBUTION GRAPH
Vertical line labeled with µ or M
denotes the mean
Standard deviation is represented
by a line or arrow drawn from the
mean and labeled with s or σ
SD should extend approximately
halfway from the mean to the
extreme score
TRANSFORMATIONS OF SCALE
Adding a constant to a score does not change the SD
The distance between the scores is not changing
The distribution may shift up or down a number line, but the relationship
between the scores has remained the same
Multiplying each score by a constant causes the SD to be multiplied by the
same constant
The distance between scores has changed – multiplied by whatever
number the constant is
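Both transformation rules can be checked with a short Python sketch. The scores are illustrative, and `statistics.pstdev` is used for the population standard deviation.

```python
import math
from statistics import mean, pstdev

scores = [1, 9, 5, 8, 7]                    # illustrative population
original_mean, original_sd = mean(scores), pstdev(scores)

# Add a constant: every score shifts, distances between scores stay the same.
plus_3 = [x + 3 for x in scores]
shifted_mean, shifted_sd = mean(plus_3), pstdev(plus_3)

# Multiply by a constant: distances between scores are multiplied too.
times_2 = [x * 2 for x in scores]
scaled_mean, scaled_sd = mean(times_2), pstdev(times_2)

print(original_mean, round(original_sd, 2))
print(shifted_mean, round(shifted_sd, 2))
print(scaled_mean, round(scaled_sd, 2))

assert math.isclose(shifted_sd, original_sd)        # SD unchanged
assert math.isclose(scaled_sd, 2 * original_sd)     # SD doubled
```

The mean moves with both transformations (6 becomes 9, then 12), but the SD responds only to multiplication.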
REPORTING THE SD
Researchers will provide descriptive
information for central tendency and
variability
Mean and SD are typically reported
together
Sometimes sample size n is reported for
each group
The purposes of tables are to present data in
an organized, concise, and accurate manner
STANDARD DEVIATION AND DESCRIPTIVE
STATISTICS
The SD is a descriptive measure – it describes a variable; how spread out the scores are in a
distribution
It is equally important to report the average as well as the spread/variability of the data
SD measures distance from the mean
Describing an Entire Population
Research reports will summarize data by listing the M and SD
Roughly 70% of scores are located within one SD from the mean
95% of scores are located within 2 SDs from the mean
Describing an Individual Score
If you know the mean and SD and are told how many SDs a score is from the mean, you should be
able to visualize where on the distribution the score is
Same if you are given the score itself
VARIANCE AND INFERENTIAL STATISTICS
The goal of inferential statistics is to detect
meaningful and significant patterns in
research
Variability of data plays an important role
because it influences how easy it is to see
patterns
Low variability means existing patterns can
be seen clearly, while high variability can
obscure patterns that exist
LEARNING RECAP P. 124
1. How is SD represented in a frequency distribution graph?
a. By a vertical line located one σ above the mean
b. By two vertical lines one σ above and below the mean
c. By a horizontal line or arrow extending one σ above the mean to one σ below the mean
d. By a horizontal line or arrow extending from the mean for a distance equal to one σ
2. A population has a mean score of 35 and σ of 5. After 3 points are added to every score in the population, what are the new values for the µ and σ?
a. µ = 35, σ = 5
b. µ = 35, σ = 8
c. µ = 38, σ = 5
d. µ = 38, σ = 8
3. What symbols are used for the mean and SD for a sample in a research report?
a. Mean is identified by the letter M and SD is represented by lowercase s.
b. Mean is identified by letter M and SD is represented by SD
c. Mean is identified by a lowercase letter m and standard deviation is represented by lowercase s.
d. Mean is identified by a lowercase letter m and the SD is represented by SD.
4. Under what circumstances would a score that is above the mean by 5 points appear to be very close to the mean?
a. When the mean is much greater than 5
b. When the mean is much less than 5
c. When the SD is much greater than 5
d. When the SD is much less than 5
5. For which of the following pairs of distributions would the mean difference be easiest to see?
a. M=45 with s=5 compared to M=50 with s=5
b. M=45 with s=5 compared to M=55 with s=5
c. M=45 with s=10 compared to M=50 with s=10
d. M=45 with s=10 compared to M=55 with s=10
STATA AND R
Basics of R: https://www.youtube.com/watch?v=FY8BISK5DpM (~15 minutes)
Basics of Stata: https://www.youtube.com/watch?v=YMt5K68ZvjQ (~30 minutes)
HOMEWORK
Ch 3 Qs: 2, 16, 22, 24, (pp. 96-97)
Ch 4 Qs: 2, 6, 18, 24 (pp. 128-130)