Chapter 9 (Smith And Davis)

Using Statistics to Answer Questions


(Includes 20 (I know! Tons!) Pre-Class Quiz Questions)
Overview of this Chapter
 The Good News and the Bad News
First up, the Bad News. It is time to look at statistics. Yes, that
means scales of measurement, measures of center, measures
of spread, and some basic inferential statistics
– The other Bad News is that this chapter gets pretty long, so
make sure to set aside a lot of time to read it

The Good News? You already did a lot of this in Methods One,
so most of this chapter will simply be a review!
Overview of this Chapter
 In this chapter, we will focus on the underlying mechanics of study
data, including …

Part One: Descriptive Statistics

Part Two: Measures of Variability

Part Three: Correlations (Brief)

Part Four: Inferential Statistics

Part Five: An Eye Toward The Future


Part One

Descriptive Statistics
Descriptive Statistics
 Descriptive Statistics
This chapter will give a brief overview of statistics:
– Statistics is a branch of mathematics that involves the
collection, analysis, and interpretation of data
– Descriptive Statistics involve those procedures used to
summarize or “describe” data
– Inferential Statistics are procedures used to analyze data
after an experiment is completed in order to determine
whether the independent variable has a significant effect
Descriptive Statistics
 Descriptive Statistics – In this section, we will look at some new
items and review some old items.
The new items include …
– 1). Scales of Measurement
 Nominal Scale
 Ordinal Scale
 Interval Scale
 Ratio Scale
Descriptive Statistics
 Descriptive Statistics – In this section, we will look at some new
items and review some old items.
The old items include …
– 2). Measures of Central Tendency
 Mean
 Median
 Mode

 We’ll go through these items quickly, so review Salkind: Means to
an End (Chapter 2) from Methods One for a more in-depth look!
Descriptive Statistics
 1). Scales of Measurement
Measurement essentially means the assignment of symbols to
events according to a set of rules (scale of measurement).
You may recall these four scales from your statistics course …
– A. Nominal: Values differ in quality rather than quantity
– B. Ordinal: Ordered along a dimension; unknown distances
– C. Interval: Scale spacing is known; arbitrary zero point
– D. Ratio: Interval characteristics + an absolute zero point

– To remember them, I like the acronym NOIR


The Science of “NOIR”
 1). Scales of Measurement
A. Nominal: Values differ in quality rather than quantity
B. Ordinal: Ordered along a dimension; unknown distances
C. Interval: Spacing on a scale is known; arbitrary zero point
D. Ratio: Interval characteristics plus an absolute zero point
Descriptive Statistics – Nominal
 1). Scales of Measurement – NOIR
A. Nominal scales involve variables whose values differ in their
quality. In other words, you can list or categorize the levels
– There is no implied value in each level of the variable (one level is
not “better” or “worse” than another). Rather, they are based more
on qualitative differences. For example …
 Gender: Male / Female (neither is “better” or “worse”)
 Guilt: Guilty / Not Guilty
 Drink list: Soda, Coffee, Tea, Beer, Water, Juice
 Car list: Toyota, Chevrolet, Dodge, Ford, Saturn
Descriptive Statistics
 1). Scales of Measurement – NOIR
A. Nominal scales
– If you divide people into those expecting severe shocks vs.
mild shocks, will their group impact their tendency to 1) seek
out the company of others or 2) be alone or 3) not care?
– Stanley Schachter found that those expecting severe shocks
preferred to wait with other people in a waiting room; those
expecting a mild shock showed no preference
 Here, you can simply count the number (the frequency)
of participants in each condition! How many sat by
themselves compared to how many sat together?
Schachter Study – Nominal Variable

Number of People Expressing These Preferences (N = 62 total)

                                 Wait Alone (1)   Wait with Others (2)   Don’t Care (3)
Those expecting a severe shock         3                  20                   9
Those expecting a mild shock           2                  10                  18
Descriptive Statistics
 1). Scales of Measurement – NOIR
A. Nominal scales
– In the Schachter study, we could arbitrarily assign values to
the levels of the dependent variable
 1 = Wait alone, 2 = Wait with others, 3 = Don’t care
– But keep in mind that these numbers are arbitrary in a
nominal scale. We could just as easily have
 1 = Don’t care, 2 = Wait alone, 3 = Wait with others OR
 1 = Wait with others, 2 = Wait alone, 3 = Don’t care
– Levels differ in quality (type), not value
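To make the point about arbitrary codes concrete, here is a minimal Python sketch. It assumes we have the raw responses of the severe-shock group as labels (the list `severe` is rebuilt from the counts in the table, and the names `coding_a`, `coding_b`, and `frequencies` are made up for illustration). Whichever numbers we attach to the levels, the frequencies come out the same.

```python
from collections import Counter

# Hypothetical raw responses for the severe-shock group (3 alone, 20 with others, 9 don't care)
severe = ["alone"] * 3 + ["others"] * 20 + ["dont_care"] * 9

# Two arbitrary numeric codings of the same nominal variable
coding_a = {"alone": 1, "others": 2, "dont_care": 3}
coding_b = {"dont_care": 1, "alone": 2, "others": 3}

def frequencies(responses, coding):
    """Code each response as a number, count the codes, then report counts by label."""
    coded_counts = Counter(coding[r] for r in responses)
    label_for = {code: label for label, code in coding.items()}
    return {label_for[code]: n for code, n in coded_counts.items()}

freq_a = frequencies(severe, coding_a)
freq_b = frequencies(severe, coding_b)
print(freq_a)
print(freq_a == freq_b)  # the counts do not depend on which numbers we picked
```

The numbers only act as names here, which is why counting is the only sensible operation on them.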
Descriptive Statistics
 1). Scales of Measurement – NOIR
B. Ordinal scales involve variable levels that differ in terms of
rank (value!). For example, think about Marathon Racers
– In a marathon, you have first, second, and third place. You
know the finishing order, but ordinal scales do not give you
any info beyond rank order.
– The first and second place runners
may have finished seconds apart
while second place finished a full
five minutes ahead of third place
 All you know is rank order. Distance between is unknown
Descriptive Statistics
 1). Scales of Measurement – NOIR
B. Ordinal scales – You might ask a question like:
– If a child ranks 5 toys and is given the one ranked # 3, will he still
play with it in a week?
 The original ranking doesn’t tell you how good the toy is
– What about small appliances? Brehm was curious about
appliance rankings, so he conducted a study …
 A woman ranked eight small household appliances and then
received an appliance of her choice as a gift (albeit a
mid-ranked appliance, one she ranked 3rd, 4th, or 5th)
Descriptive Statistics – Ordinal Scale
 1). Scales of Measurement – Brehm & Appliance Attractiveness

 When asked to re-rank them, she ranked the appliance
she was given as the most desirable (even more so than
appliances that she originally ranked higher!)
Descriptive Statistics
 1). Scales of Measurement – NOIR
C. Interval scales provide additional information. Like ordinal
scales, they rank the various levels, but here you know more about
the degree of difference or distance between the levels
– Interval scales MAY include a zero point, but this zero point
does not imply a complete absence of the quantity
Descriptive Statistics
 1). Scales of Measurement – NOIR
C. Interval scales
– Imagine a jury study looking at jurors’ confidence in guilt
 0 = not at all confident to 9 = very confident
 Giving a 0 here does not necessarily mean a “complete”
lack of confidence, though it might
– More importantly, you now have a scale that provides more
information about the difference between levels
 A difference between 2 and 3 is the same as the
difference between 7 and 8. Both differ by 1 point
Descriptive Statistics
 1). Scales of Measurement – NOIR
C. Interval scales
– The Likert-type scales you are probably familiar with often
use an interval scale
 On a scale of 1 (low) to 10, how often do you exercise?
 On a scale of 0 (low) to 5, how do you like college?
 On a 1 to 100 scale, how happy are you with Obama?
– As you will see shortly, we can do a lot more statistical tests
with interval scales than with nominal or ordinal scales
Descriptive Statistics
 1). Scales of Measurement – NOIR
D. Ratio scales are very similar to interval scales, in that they
have ranks and known distances, but a zero point here DOES imply
an absence of the quantity
– The number of correct answers on an exam (A score of zero
implies the total lack of correctness)
– Temperature in Kelvin (there is an absolute zero; Celsius and
Fahrenheit have arbitrary zero points, so they are interval scales)
– Some scales cannot have a zero, like an
IQ test. A zero IQ doesn’t work (unless
you happen to be Homer Simpson)!
Descriptive Statistics
 1). Scales of Measurement – NOIR
Think about NOIR in the Kentucky Derby
– Horses have an assignment number
 Nominal Scale
– Finishing order includes a rank
 Ordinal Scale
– You can time a race from start (4:00 pm) to finish (4:04 pm)
 Interval Scale
– Total time for the winner to run the race (it may be physically
impossible to run it in zero seconds, but zero time exists!)
 Ratio Scale
Descriptive Statistics – A Review
 1). Scales of Measurement – Just to review …
Nominal Scales (also called “categorical scales” sometimes)
– Gender (Male or Female); Transportation (Bus, Car, Van)
Ordinal Scales (remember, this is based on “order”)
– Race finishing place (1, 2, 3, etc.); ranking favorite foods
Interval Scales (has known “intervals” between responses)
– Rankings + Knowledge of distance between items
– Attitudes; likelihood scales; confidence scales
Ratio Scales (a ratio can include zero – for example, 4 : 0)
– Rankings + Knowledge + Absolute Zero Point
– Time to finish something; correct responses (zero vs. more)
Pop Quiz: Pre-Class Quiz Question #1
 A researcher is interested in eyewitness memory. She has
participants watch a video reenactment of a robbery and then
gives them ten True / False questions about the event. She counts
the number of correct responses. The responses are BEST
assessed using which scale of measurement?
A). Nominal
B). Ordinal
C). Interval
D). Ratio
Pop Quiz: Pre-Class Quiz Question #2
 A researcher wants to look at popular food items in the FIU
cafeteria. He asks students what their favorite food is. Which scale
of measurement would he use to describe the most popular foods
mentioned by the participants? (Note: this is a bit tricky, so really
think about it)
A). Nominal
B). Ordinal
C). Interval
D). Ratio
Pop Quiz: Pre-Class Quiz Question #3
 You want to determine whether the race of the defendant has an impact on
jury verdicts. You assign participants to watch a trial with either a Hispanic
or Caucasian defendant, and you measure on a 1 (not at all) to 7 (very)
scale how guilty the defendant is. What is the best DV scale of
measurement?
A). Nominal
B). Ordinal
C). Interval
D). Ratio
E). All of the above are okay to use
Pop Quiz: Pre-Class Quiz Question #4
 Let’s say you want your DV to be more relevant to the real legal
world, thus you just look at guilty vs. not guilty options. Which
scale of measurement should you use now?
A). Nominal
B). Ordinal
C). Interval
D). Ratio
E). All of the above are okay to use
Pop Quiz: Pre-Class Quiz Question #5
 Karen categorizes people in her sample as either male or female,
which represents which scale of measurement?
A). Nominal
B). Ordinal
C). Interval
D). Ratio
E). All of the above are okay to use
Pop Quiz: Pre-Class Quiz Question #6
 A researcher wants to gauge student preferences for a new
textbook. He gives students three books, and has them rank them
in order of preference. Which scale of measurement is the
researcher MOST LIKELY using?
A). Nominal
B). Ordinal
C). Interval
D). Ratio
Pop Quiz: Pre-Class Quiz Question #7
 Number of errors on a math test represents a(n) __________
scale of measurement.
A). Nominal
B). Ordinal
C). Interval
D). Ratio
E). All of the above are okay to use
Descriptive Statistics
 Pause-Problem #1 (NOIR)
For this first Pause-Problem, I want you to come up with four
dependent variables, one related to each NOIR level. You can
use your lab study if you like (but you cannot use any of the
examples that I used in this lecture presentation)!
– What is your nominal variable?
– What is your ordinal variable?
– What is your interval variable?
– What is your ratio variable?
Descriptive Statistics
 1). Scales of Measurement – Choosing which scale to use
When determining which scale of measurement to use, there
are several dimensions that researchers use, including …
– A. Information Yielded
– B. Statistics Tests
– C. Ecological Validity
Descriptive Statistics
 1). Scales of Measurement – Choosing which scale to use
A. Information Yielded
– Some variables provide more information than others
 Ratio scale provides the most information
 Interval is second
 Ordinal third
 Nominal scales provide the least information
– On your own, think about why you get more information with
the ratio scale than the rest of them
Descriptive Statistics
 1). Scales of Measurement – Choosing which scale to use
B. Statistical Tests
– Statistical tests for nominal and ordinal variables are limited,
based mostly on looking at frequencies of occurrence only
 Chi-square tests can use nominal and ordinal data
– Ratio and interval scales, on the other hand, provide more
useful data for advanced statistical tests
 ANOVAs and t-Tests can use interval and ratio data
Descriptive Statistics
 1). Scales of Measurement – Choosing which scale to use
C. Ecological validity refers to how closely the dependent
measure reflects what people must do in real-life situations
– Verdicts in a real-life jury case are either guilty or not guilty,
but since I cannot run ANOVAs on nominal scales, I have
an easy way of asking it both ways (nominal and interval)
– Bracketed Variables!
 1, 2, or 3 is “Guilty” while 4, 5, or 6 is “Not Guilty”
   1     2     3   |   4     5     6
       Guilty      |    Not Guilty
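A bracketed variable like this can be scored both ways from one response. The sketch below is only an illustration under the cut points stated on this slide (1 to 3 = Guilty, 4 to 6 = Not Guilty); the function name `bracket_verdict` and the sample `ratings` are hypothetical.

```python
def bracket_verdict(rating: int) -> str:
    """Collapse a 1-6 rating into a verdict: 1-3 is Guilty, 4-6 is Not Guilty."""
    if not 1 <= rating <= 6:
        raise ValueError("rating must be between 1 and 6")
    return "Guilty" if rating <= 3 else "Not Guilty"

# Hypothetical juror responses on the 1-6 scale
ratings = [1, 4, 3, 6, 2, 5]

# The same responses give an interval-style score AND a nominal verdict
verdicts = [bracket_verdict(r) for r in ratings]
print(list(zip(ratings, verdicts)))
```

The raw 1 to 6 ratings stay available for tests that need interval data, while the verdicts can simply be counted.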
Descriptive Statistics
 2). Measures of Central Tendency (or Measures of Center)
Our second main descriptive statistic topic involves measures
of central tendency, or a “single score” used to represent the
magnitude of scores in a distribution
– That is, it represents a set of scores by identifying a number
near the center or middle of the distribution

– That’s right! It is time to review Measures Of Center (which
you should know all about from Methods One!)
Descriptive Statistics
 2). Measures of Central Tendency (or Measures of Center)
There are three main measures of center …
– A. Mode
– B. Median
– C. Mean
Descriptive Statistics
 2). Measures of Central Tendency – The Mode
A. The mode is the most frequent (commonly occurring) score
in a distribution. As such, it is the simplest measure of center
– Scores other than the most frequent are not considered
 Neglects the magnitude of scores in the distribution

– Given the limited information it provides, the mode is limited
in its application and value
– It is best suited to nominal (categorical) scales from NOIR
Descriptive Statistics
 2). Measures of Central Tendency – The Mode Examples
A. The mode
– Most purchased food on a menu
 Pizza, hamburgers, salad, fish, meat loaf, etc.
– Number of men and women in our class
 Males, females
– Vehicles on the road
 Trucks, SUV’s, compacts, sedans, minivans, etc.
– Genres of movies
 Horror, Comedy, Action, Romance
Descriptive Statistics
 2). Measures of Central Tendency – The Mode Examples
A. The mode
– Measures of center are easy, right? Let’s just briefly review
what you learned in Research Methods and Design One
– Imagine we buy a bag of balloons. When we count them up,
we find there are 18 red, 12 blue, 25 orange, 24 purple, and
21 green balloons.
 What is our mode? Let’s do this as a quick pop quiz …
Pop Quiz: Pre-Class Quiz Question #8
 Imagine we buy a bag of balloons. When we count them up, we
find there are 18 red, 12 blue, 25 orange, 24 purple, and 21 green
balloons. What is the mode?
A). 18
B). 12
C). 24
D). 25
E). None of the above
Descriptive Statistics
 2). Measures of Central Tendency – The Mode Examples
A. The mode
– Let’s say we missed one, though. We pull out another purple
balloon from our bag, so we now have 18 red, 12 blue, 25
orange, 25 purple, and 21 green balloons
 What is our mode now?
Orange AND Purple
We now have a bimodal (two modes) distribution.
– Again, the mode works well with nominal data (data based
on categories rather than values)
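If you want to check a mode by machine, a small sketch like the one below works for category counts. The helper name `modes` is made up for this example; it returns every category tied for the top count, so a bimodal bag shows up as two answers.

```python
def modes(counts):
    """Return every category tied for the highest frequency."""
    top = max(counts.values())
    return [category for category, n in counts.items() if n == top]

# Balloon counts after finding the extra purple balloon
balloons = {"red": 18, "blue": 12, "orange": 25, "purple": 25, "green": 21}

print(modes(balloons))  # two categories tie, so the distribution is bimodal
```

Note that the result is a list of categories (colors), not a list of counts.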
Descriptive Statistics
 2). Measures of Central Tendency – The Median
B. The median is the central score (or the middle score, with
half above and half below) in an ordered distribution
– More information is taken into account than in the mode, and
the median is relatively insensitive to outliers

– The median is best used when …
 data are measured along an ordinal scale OR
 the data are interval but do not meet the statistical
requirements needed to use the mean
Descriptive Statistics
 2). Measures of Central Tendency – The Median
B. The median is the central score (or the middle score, with
half above and half below) in an ordered distribution
– Consider our original balloon data (we have 18 red, 12 blue,
25 orange, 24 purple, and 21 green balloons)
 What is the median?
Some weird version of orangish/purple? Doesn’t
really work, does it? We’d have to go to the store and
buy a new weirdly colored balloon, but that doesn’t
help us understand our current bag of balloons!
Descriptive Statistics
 2). Measures of Central Tendency – The Median
B. The median is the central score (or the middle score, with
half above and half below) in an ordered distribution
– So let’s think of a different data set. Let’s say we teach kids
the ABCs with a new visualization approach (“The Anteater
tickles the Bear, the Bear tickles the Cheetah, the Cheetah
tickles the Duck”, etc.).
Descriptive Statistics
 2). Measures of Central Tendency – The Median
B. The median is the central score (or the middle score, with
half above and half below) in an ordered distribution

– On thirty trials, we ask them to name the letter that follows
one that we choose at random, and we grade them for
“correctness”.
– We have 10 students, with “correctness” ranging from 0
correct to 30 correct. We get the following data …
 The Median
Scores on the ABCs Quiz
What is the Median for the # Correct?

Student   # Correct
S1        23
S2        24
S3        24
S4        16
S5        18
S6        27
S7        22
S8        25
S9        26
S10       29

Yeah, let’s put them in order to make this easier …

Student   # Correct
S4        16
S5        18
S7        22
S1        23
S3        24
S2        24
S8        25
S9        26
S6        27
S10       29
 The Median
Scores on the ABCs Quiz (in order): 16, 18, 22, 23, 24, 24, 25, 26, 27, 29
The median is 24 (the two middle scores, S3 and S2, are both 24)
– Half the scores fall above; half below
Now, let’s change one of those middle numbers …
 The Median
Scores on the ABCs Quiz
Now what is the Median for the # Correct?

Student   # Correct
S4        16
S5        18
S7        22
S1        23
S3        23
S2        24
S8        25
S9        26
S6        27
S10       29

– Add the two middle numbers (S3 and S2)
 23 + 24 = 47
– Divide by two
 47 / 2 = 23.5
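The median steps we just walked through (put the scores in order, find the middle, average the two middle scores when the count is even) can be sketched in a few lines of Python. The function below is a hand-rolled illustration rather than anything from the textbook; the list names `original` and `changed` are mine.

```python
def median(scores):
    """Middle score of the ordered list; average the two middle scores if n is even."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

original = [23, 24, 24, 16, 18, 27, 22, 25, 26, 29]  # S1 through S10
changed = [23, 24, 23, 16, 18, 27, 22, 25, 26, 29]   # S3 lowered from 24 to 23

print(median(original))
print(median(changed))
```

Python’s built-in `statistics.median` follows the same rule, so you could swap it in for the hand-written function.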
 The Median
Scores on the ABCs Quiz
Okay, back to our original data set (median of 24). Let’s add another student this time …
Now, given an 11th student, what is the Median for the # Correct?

Student   # Correct
S11        8
S4        16
S5        18
S7        22
S1        23
S3        24
S2        24
S8        25
S9        26
S6        27
S10       29
 The Median
Scores on the ABCs Quiz
Now, given an 11th student, the Median for the # Correct = 24
(the sixth of the eleven ordered scores)
Let’s focus on that new kid (S11), who just didn’t seem to get it.
Does his score of 8 impact our distribution when it comes to the median?
 The Median
Scores on the ABCs Quiz
Not really! The new median, like the original, is still 24

Student   New   Original
S11         8     –
S4         16    16
S5         18    18
S7         22    22
S1         23    23
S3         24    24
S2         24    24
S8         25    25
S9         26    26
S6         27    27
S10        29    29

Consider bigger differences between two data sets …
 The Median
Scores on the ABCs Quiz
These two data sets are very different, but they still have the same median (24)

Student   Varied   Unvaried
S11          2       19
S4           3       20
S5           3       21
S7           5       22
S1           7       23
S3          24       24
S2          29       25
S8          30       26
S9          30       27
S6          30       28
S10         30       29

Consider both data sets from the perspective of the mean (You remember the mean, right?)
Descriptive Statistics
 2). Measures of Central Tendency – The Mean
C. The mean is the average of all scores in a distribution
– This value is dependent on each score in a distribution
– The mean is the most widely used and informative measure
of center, and is often expressed as:
M = ∑X / n

– Familiar formula, right?


Descriptive Statistics
 2). Measures of Central Tendency – The Mean
C. The mean is the average of all scores in a distribution
– This value is dependent on each score in a distribution
– The mean is the most widely used and informative measure
of center, and is often expressed as:
M = ∑X / n
 M = Mean
 ∑ = “Sum of”
 X = scores
 n = number of scores
Descriptive Statistics
 2). Measures of Central Tendency – The Mean
C. The mean is the average of all scores in a distribution
– The mean is best used when …
 data are measured along an interval or ratio scale
 scores are normally distributed (a normal curve)
 you need the most sensitive measure of center

– However, the mean is affected by outliers (really high or low
scores that don’t represent the rest of the distribution)!
 The Mean
Scores on the ABCs Quiz
What is the mean for # correct (new)?

Student   New   Original
S11         2     19
S4          3     20
S5          3     21
S7          5     22
S1          7     23
S3         24     24
S2         29     25
S8         30     26
S9         30     27
S6         30     28
S10        30     29

2 + 3 + 3 + 5 + 7 + 24 + 29 + 30 + 30 + 30 + 30 = 193
M = 193 / 11
M = 17.55
 The Mean
Scores on the ABCs Quiz
What is the mean for # correct (original)?

19 + 20 + 21 + 22 + 23 + 24 + 25 + 26 + 27 + 28 + 29 = 264
M = 264 / 11
M = 24.00
Descriptive Statistics
 2). Measures of Central Tendency – The Mean

– The median for BOTH distributions was 24
However …
– The mean for the original data set was 24 while the mean for
the new data set is 17.55
– Quite a difference here. As you see, the mean is impacted
by EVERY number in the distribution while the median is
less sensitive to individual scores (and potential outliers!)
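If you would like to verify the comparison yourself, here is a short sketch using Python’s standard `statistics` module. The list names `new` and `original` simply mirror the two column labels from the slides; nothing else here comes from the textbook.

```python
from statistics import mean, median

new = [2, 3, 3, 5, 7, 24, 29, 30, 30, 30, 30]
original = [19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29]

for label, scores in (("new", new), ("original", original)):
    # The mean uses every score; the median only looks at the middle of the ordered list
    print(label,
          "sum =", sum(scores),
          "mean =", round(mean(scores), 2),
          "median =", median(scores))
```

Both lists have 11 scores, so the median is simply the sixth score once each list is in order.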
Descriptive Statistics
 2). Measures of Central Tendency – When should you use each?
Using the scales is a matter of choice – Two considerations
– 1). It may depend on the scale of measurement
 If you are using a nominal scale, you must use the mode
 If you are using an ordinal scale, mode or median
 For interval or ratio scales, the mean might be best
Descriptive Statistics
 2). Measures of Central Tendency – When should you use each?
Using the scales is a matter of choice – Two considerations
– 2) The shape of the distribution also matters
 It might be easiest to plot the data out and see the curve
 If you have a normal distribution, use the mean (it uses
the most info from the scores, so it is most preferable)

 For example, let’s say we look at the amount of time (in
years) that inmates serve, and we look at the number of
prisoners …
Histogram – 210 Responses
This is a nice, normal curve – The scores cluster in the center and dip
on the ends. You can run t-Tests & ANOVAs on this data
Prison Term Length
Descriptive Statistics
 2). Measures of Central Tendency – When should you use each?
Using the scales is a matter of choice – Two considerations
– 2) The shape of the distribution also matters

 But when we have non-normal or bimodal data, the
mean is not as helpful, as our distributions are skewed
Histogram Showing a Positive Skew
Non-normal data (x-axis: None, 1 year, 2 years, 3 years, 4 years, 5 years; y-axis: 0 to 140)
Histogram Showing a Negative Skew
Non-normal data (x-axis: None, 1 year, 2 years, 3 years, 4 years, 5 years; y-axis: 0 to 140)
Histogram Showing a Bimodal Pattern

Bimodal data
Descriptive Statistics
 2). Measures of Central Tendency – When should you use each?
Recall this example from last semester. Imagine a real estate
agent has the following sales (30 = 30,000)
– Set A: 30 40 50 60 70. Mean = 50 and median = 50
– Set B: 30 40 50 60 700. Mean = 176 and median = 50

– The value 700 is an outlier for Set B because it is a long way
from the next nearest data value, or 60.
– Here, the mean is not as helpful as the median.
 As a home buyer, you may be leery of seeking the help
of an agent with a mean of $176,000, but not $50,000
Descriptive Statistics

Type of Variable                 Best measure of central tendency to use
Nominal                          Mode
Ordinal                          Median
Interval / Ratio (not skewed)    Mean
Interval / Ratio (skewed)        Median
Pop Quiz: Pre-Class Quiz Question #9
 Last year, a fast food outlet in a beachside city paid 3 kitchen
hands $16,000 per year each, 2 supervisors $22,000 each, and
the owner $85,000. What is the mean?
A). $29,500
B). $19,000
C). $16,000
D). $14,500
E). $12,000
Pop Quiz: Pre-Class Quiz Question #10
 Last year, a fast food outlet in a beachside city paid 3 kitchen
hands $16,000 per year each, 2 supervisors $22,000 each, and
the owner $85,000. What is the mode?
A). $29,500
B). $19,000
C). $16,000
D). $14,500
E). $12,000
Pop Quiz: Pre-Class Quiz Question #11
 Last year, a fast food outlet in a beachside city paid 3 kitchen
hands $16,000 per year each, 2 supervisors $22,000 each, and
the owner $85,000. What is the median?
A). $29,500
B). $19,000
C). $16,000
D). $14,500
E). $12,000
Descriptive Statistics
 Pause-Problem #2 (Central Tendency)
For this Pause Problem, first tell me how to calculate the mean,
median, and mode. Second, tell me under what circumstances
you would want to use each

Graphing Your Results?
 Why did I skip “Graphing Your Results”, even though it is covered
in Smith and Davis (Chapter 9)?

 We will return to graphing later this semester (our last week, in
fact), so although I recommend reading it now, we’ll talk about it
formally in week fifteen or so
Part Two

Measures of Variability
Measures of Variability
 Measures of Variability (or Measures of Spread)
In research, it is rare and unlikely that all participants produce
identical data. Even if two conditions give you similar means,
the distributions within those samples may vary a great deal

Who is the better real estate agent? Think about houses sold
– Ted’s sales: $100,000, $550,000, $75,000, $300,000, $110,000
– Bill’s sales: $200,000, $250,000, $225,000, $240,000, $220,000

– Each earned $1,135,000 total (mean of $227,000 for each)


– You see that Ted’s sales have huge outliers (a very high and a
very low amount). They have more variability than Bill’s sales
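Here is a quick sketch of the Ted and Bill comparison in Python. It reports the total, the mean, and the distance between the highest and lowest sale as a first rough look at spread; the helper name `summarize` is made up for this example.

```python
from statistics import mean

ted = [100_000, 550_000, 75_000, 300_000, 110_000]
bill = [200_000, 250_000, 225_000, 240_000, 220_000]

def summarize(sales):
    """Return (total, mean, highest minus lowest) for a list of sale prices."""
    return sum(sales), mean(sales), max(sales) - min(sales)

for name, sales in (("Ted", ted), ("Bill", bill)):
    total, average, spread = summarize(sales)
    print(f"{name}: total = {total:,}  mean = {average:,}  high minus low = {spread:,}")
```

The totals and means match, so only the last number tells the two agents apart.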
Measures of Variability
 Calculating and Computing Statistics
While measures of center are easy, measures of variability (as
you saw last semester) are a bit more complicated to calculate

It’s time to remind you about the range, variance, and standard
deviation!
Measures of Variability
 Measures of Variability (or Measures of Spread)
Variability refers to the extent to which scores in a distribution
spread out around the mean
– When variability is small, the scores are clustered together
– When variability is large, there are big differences between
individuals, and scores are spread across a range of values
In this section, we will discuss four elements
– 1). The Range
– 2). Variance
– 3). Standard Deviation
– 4). The Semi-Interquartile Range
Measures of Variability
 1). The Range
To find the range, simply subtract the lowest score from the
highest score in your distribution of scores
– The simplest and least informative measure of spread
– Very sensitive to extreme scores (both high and low values)

– Think about kids and their ABC visualization study. What is
the range? …
 The range
Scores on the ABCs Quiz
What is the Range?

Student   # Correct
S4        16
S5        18
S7        22
S1        23
S3        24
S2        24
S8        25
S9        26
S6        27
S10       29

29 – 16 = 13
Pop Quiz: Pre-Class Quiz Question #12
 What is the range in this table?

Student   Score
S1        23
S2        67
S3        98
S4        15
S5        48
S6        26
S7        19
S8        22
S9        58

A). 93
B). 83
C). 72
D). 55
E). 33
Measures of Variability
 1). The Range
As you saw last semester, there are lots of problems with using
just the range, as it doesn’t take into consideration the numbers
falling between the two most extreme scores.

Do you recall this graph …?

Problems with using just the range
(Graph: three curves spanning the same range, -5 to +5. In one, most scores are close to the mean; in another, scores are more spread out; in the third, scores are most spread out)
Measures of Variability
 1). The Range
So, about that graph on the prior slide …
– All three curves (black, blue, and red) have the same range
(-5 to +5), but their distributions look different. This should
give us some pause about relying solely on the range …

– Instead, researchers focus on the variance and standard
deviation to describe the measure of spread
Measures of Variability
 2). Variance
Variance is a single number that represents the total amount of
variation. The higher the number, the greater the spread
– Recall from Methods One that variance is essentially the
average squared deviation (or distance) of scores from the mean
 It is not expressed in the same units as original numbers
Measures of Variability
 2). Variance
Variance is a single number that represents the total amount of
variation. The higher the number, the greater the spread

– This measure of spread uses the formula below:

s² = ∑(X – M)² / (n – 1)
Measures of Variability
 2). Variance
Variance is a single number that represents the total amount of
variation. The higher the number, the greater the spread

– However, you might also see it expressed THIS way

s² = ∑(X – X̄)² / (n – 1)
(X̄ is the sample mean, the same value as M)
Measures of Variability
 2). Variance
Variance is a single number that represents the total amount of
variation. The higher the number, the greater the spread

– And sometimes we use N, sometimes n – 1
 Do you recall how and why we use N versus n – 1?

s² = ∑(X – X̄)² / (n – 1)
Pop Quiz: Pre-Class Quiz Question #13
 When do we use N and when do we use n – 1 for the standard
deviation calculation?
A). Use N for the sample and n – 1 for the population
B). Use N for the population and n – 1 for the sample
C). You can use either, as they result in the same outcome
D). You should actually use n + 1 over either of these options
E). Use N when your data is randomly assigned and n – 1 when
the data is not randomly assigned
Measures of Variability
 2). Variance
Variance is a single number that represents the total amount of
variation. The higher the number, the greater the spread

– And sometimes we use N, sometimes n – 1
 Do you recall how and why we use N versus n – 1?
 We use N for the population; n – 1 for the sample

s² = ∑(X – X̄)² / (n – 1)
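To see the N versus n – 1 choice in action, here is a small sketch. The data set `scores` is hypothetical (it is not from the lecture), and the function name `variance_by_hand` is mine. Python’s standard `statistics` module offers both versions: `pvariance` divides by N and `variance` divides by n – 1.

```python
from statistics import pvariance, variance

scores = [2, 4, 4, 6]  # hypothetical scores, chosen so the arithmetic is easy

def variance_by_hand(data, sample=True):
    """Sum of squared deviations divided by n - 1 (sample) or N (population)."""
    m = sum(data) / len(data)
    sum_sq = sum((x - m) ** 2 for x in data)
    divisor = len(data) - 1 if sample else len(data)
    return sum_sq / divisor

print("population (divide by N):  ", variance_by_hand(scores, sample=False), pvariance(scores))
print("sample (divide by n - 1):  ", variance_by_hand(scores, sample=True), variance(scores))
```

Dividing by the smaller number (n – 1) always gives the larger variance, and the gap shrinks as the sample grows.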
Measures of Variability
 2). Variance
Variance is a single number that represents the total amount of
variation. The higher the number, the greater the spread

Okay, let’s go through the calculation quickly (I know you did
this in Methods One, so it should be a quick review!)

Consider our ABC study. We’ll use the sample of 11 students
that we used before (the “unvaried” sample, just for
the sake of simplicity!) …
 Variance
Scores on the ABCs Quiz
What is the variance?

Student   # Correct
S11       19
S4        20
S5        21
S7        22
S1        23
S3        24
S2        25
S8        26
S9        27
S6        28
S10       29
Our original scores

Subject #    X      (X – M)           (X – M)²
S11          19     19 – 24 = -5      25
S4           20     20 – 24 = -4      16
S5           21     21 – 24 = -3      9
S7           22     22 – 24 = -2      4
S1           23     23 – 24 = -1      1
S3           24     24 – 24 = 0       0
S2           25     25 – 24 = 1       1
S8           26     26 – 24 = 2       4
S9           27     27 – 24 = 3       9
S6           28     28 – 24 = 4       16
S10          29     29 – 24 = 5       25
∑            264    zero              110

M = 264 / 11 = 24
$s^2 = \frac{\sum (X - M)^2}{n - 1} = \frac{110}{10} = 11$ (our variance)
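The columns of the table can be reproduced step by step. This is just a sketch of the hand calculation in Python; the variable names are my own.

```python
scores = [19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29]

n = len(scores)
mean = sum(scores) / n                        # 264 / 11
deviations = [x - mean for x in scores]       # the (X - M) column
squared = [d ** 2 for d in deviations]        # the (X - M) squared column
sum_of_squares = sum(squared)
variance = sum_of_squares / (n - 1)           # sample variance, n - 1 on the bottom

print(mean, sum(deviations), sum_of_squares, variance)
```

Notice that the deviations always sum to zero, which is exactly why we square them before adding them up.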
Measures of Variability
 3). Standard Deviation
The standard deviation is the square root of the variance
– Unlike variance, the standard deviation is expressed in the
same unit as the original numbers (and is thus more useful)
 This is the most widely used measure of spread

$SD = \sqrt{s^2} = \sqrt{11} \approx 3.32$
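As a quick check, here is a small Python sketch (again assuming the 11 ABCs scores) showing that the standard library's stdev is simply the square root of the sample variance.

```python
import math
import statistics

scores = [19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29]

# Square root of the sample variance (n - 1 in the denominator).
sd_from_variance = math.sqrt(statistics.variance(scores))

# The same value, computed directly.
sd_direct = statistics.stdev(scores)

print(round(sd_direct, 2))
```

Because the square root undoes the squaring, the result is back in the original unit (number of items correct) rather than "items squared."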
– Use whenever the mean is the measure of central tendency


– The standard deviation is particularly important for the
normal curve (or normal distribution, or “bell curve”)
Measures of Variability
 3). Standard Deviation
A normal distribution is a symmetrical bell-shaped distribution
with half the scores above the mean and half below the mean
– A specific percent of scores are found within each standard
deviation of the mean. That is …
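If you want to see where those percentages come from, here is a minimal Python sketch. It assumes a perfectly normal distribution and uses the standard error function math.erf; the function name share_within is my own.

```python
import math

def share_within(k):
    """Proportion of a normal distribution lying within k standard
    deviations of the mean (on either side)."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    # Prints roughly 68.27, 95.45, and 99.73 percent.
    print(k, round(100 * share_within(k), 2))
```

Printed tables that round each half of the curve separately usually list these as 68.26%, 95.44%, and 99.74%, so don't worry about the small differences in the last digit.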
Measures of Variability
 3). Standard Deviation
Let’s see what you recall about the standard deviation from last
semester.

Time for some Pop Quizzes!


Pop Quiz: Pre-Class Quiz Question #14
 Imagine you have 100 people in your study, and the mean score on a
dependent variable is 50 / 100. How many people fall within + or – one
standard deviation of the mean?

A). 34.13 people


B). 68.26 people
C). 81.81 people
D). 95.44 people
E). 99.74 people
Pop Quiz: Pre-Class Quiz Question #15
 Imagine you have 100 people in your study, and the mean score
on a dependent variable is 50 / 100. We find a standard deviation
of 10. What two numbers represent the range for those scoring +
or – one standard deviation of the mean?

A). 30 to 40
B). 40 to 50
C). 40 to 60
D). 50 to 60
E). 50 to 70
Pop Quiz: Pre-Class Quiz Question #16
 Do you want the standard deviation to be a big number or a small
number?
A). Big
B). Small
C). It doesn’t really matter
D). None of the above
Measures of Variability
 Pause-Problem #3 (Outliers)
We saw earlier in this presentation (as well as in Methods One)
that high or low outliers can screw up the mean score. I want
you to come up with two possible ways that you can reduce the
impact of outliers
– This one is tough, so feel free to get creative!

Measures of Variability
 4). The Semi-Interquartile Range
As you probably know, both the range and standard deviation are
sensitive to extreme scores
– In such cases, researchers may choose to look at the
semi-interquartile range instead. The S-IR is …
 less sensitive than the range to extreme scores
 used when you want a simple, rough estimate of spread
 (Yes, you can use this one as an answer to the prior
Pause-Problem)
Measures of Variability
 4). The Semi-Interquartile Range
For the semi-interquartile range, researchers group the data
into four equal parts (quartiles)
– Look at the score separating the lowest quarter of scores from
the rest (the first quartile, Q1) and the score separating the
highest quarter from the rest (the third quartile, Q3), and
then find the difference between them
 That difference (Q3 – Q1) is the interquartile range; the
semi-interquartile range is half of it

– To get four equal groups, let’s add a twelfth student


 4). Semi-Interquartile Range
ABCs study
What is the S-IR for the # Number correct?

Subject #    # Correct    Quartile
S11          19           1
S4           20           1
S5           21           1
S7           22           2
S1           23           2
S3           24           2
S2           25           3
S8           26           3
S9           27           3
S6           28           4
S10          29           4
S12          30           4

First, let’s just group the scores in terms of quartiles 1, 2, 3, and 4

Focus on S5 (the score at the edge of the lowest quartile) and
S6 (the score at the edge of the highest quartile)

S5 = 21
S6 = 28
28 – 21 = 7
• 7 is the distance between the quartiles (the interquartile range)
• Half of that, 7 / 2 = 3.5, is our semi-interquartile range!
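Here is the same calculation as a short Python sketch. It follows the simple hand rule used on this slide (take the scores at the edges of the lowest and highest quarters), which assumes the number of scores divides evenly by four. The semi-interquartile range is conventionally half the distance between the quartiles, so the sketch reports both the 7-point distance and half of it.

```python
scores = sorted([19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30])

quarter = len(scores) // 4      # 12 scores, so 3 scores per quartile
q1 = scores[quarter - 1]        # last score in the lowest quarter (S5)
q3 = scores[-quarter]           # first score in the highest quarter (S6)

interquartile_range = q3 - q1
semi_interquartile_range = interquartile_range / 2

print(q1, q3, interquartile_range, semi_interquartile_range)
```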
Measures of Variability
 4). The Semi-Interquartile Range
Because it looks only at the middle half of the scores, the
interquartile range largely ignores really high or low outliers,
which reduces the influence of skew in the distribution. It also
provides us with a quick overview of the distribution, often
expressed using a Five Number Summary that includes …
– Maximum score
– The third quartile
– The median (middle value)
– The first quartile
– Minimum score
Example of a Five Number Summary
Maximum 30

Third Quartile (Q3) 28

Median (Middle) 24.5

First Quartile (Q1) 21

Minimum 19
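A five number summary is easy to build in code. This Python sketch uses the same simple quartile rule as the hand calculation (so it assumes the number of scores is a multiple of four), and the function name five_number_summary is my own. The second call swaps in a made-up outlier of 90 to show which of the five numbers actually move.

```python
import statistics

def five_number_summary(scores):
    """Return (minimum, Q1, median, Q3, maximum). Q1 and Q3 are the scores
    at the edges of the lowest and highest quarters of the ordered data."""
    ordered = sorted(scores)
    quarter = len(ordered) // 4
    return (ordered[0], ordered[quarter - 1], statistics.median(ordered),
            ordered[-quarter], ordered[-1])

abcs = [19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30]
summary = five_number_summary(abcs)

# Hypothetical outlier: replace the top score of 30 with 90.
with_outlier = abcs[:-1] + [90]
outlier_summary = five_number_summary(with_outlier)

print(summary)
print(outlier_summary)
```

Only the maximum changes when the outlier is added; the quartiles and the median stay put. Statistical software usually interpolates between scores when it finds quartiles, so its Q1 and Q3 may differ slightly from the hand rule.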
Measures of Variability
 4). The Semi-Interquartile Range and The Boxplot
Some of you may have come across a Boxplot before. This is a
graphic representation of the five number summary
– First and third quartile numbers define the ends of the box
– A line inside the box represents the median
– Vertical “whiskers” extend above and below the box and
represent the maximum and minimum scores (respectively)

– For example …
Example of a Boxplot
[Figure: boxplot with callouts for the maximum score, the median,
the first and third quartiles, and the minimum score]
Measures of Variability
 4). The Semi-Interquartile Range and The Boxplot
– Here is data from multiple treatments, represented by
side-by-side boxplots
Example of a Boxplot (left) and
Side-By-Side Boxplot (right)
A Prior Quiz - Five Number Summary
Maximum 20

Third Quartile (Q3) 17

Median (Middle) 15

First Quartile (Q1) 14

Minimum 9
Box Plot – A Prior Methods Two Quiz!
Pop Quiz: Pre-Class Quiz Question #17
 In this boxplot, what is the median?

A). 150
B). 350
C). 420
D). 490
E). 920
Pop Quiz: Pre-Class Quiz Question #18
 In this boxplot, what is the total range?

A). 140
B). 300
C). 650
D). 770
E). 1000
Pop Quiz: Pre-Class Quiz Question #19
 In this boxplot, what is the semi-interquartile range?

A). 140
B). 300
C). 650
D). 770
E). 1000
Part Three
Correlations (Brief)
Correlation
 I know Smith and Davis talk about correlations in Chapter 9, but
you already know a lot about correlations (reread Chapter 5 in
Salkind, “Ice-Cream and Crime,” if you need a quick reminder!)
No test questions on correlations for this chapter, but we will
return to correlations later this semester when we look at
Salkind Chapter 16
Part Four
Inferential Statistics
Inferential Statistics
 Inferential Statistics
Thus far, we have talked about ways to describe data (through
measures of spread and center) using descriptive statistics

Inferential statistics, in contrast, go beyond mere description
and focus on whether two groups differ significantly
– Inferential statistics let you infer characteristics of a larger
population based on the characteristics of a sample
– Here, we use reliability estimates rather than frequencies
Inferential Statistics
 In this section, we will focus on
1). Significance testing
2). The t-Test for independent groups (a brief reminder!)
3). One-tailed versus two-tailed t-Tests
4). The logic of significance testing
5). When statistics go awry – Type I and Type II Errors
6). Effect size
Inferential Statistics
 1). Significance Testing
In essence, significance refers to a statistical outcome that is
not likely to have occurred by chance
– That is, if the outcome could easily have occurred by chance,
we cannot conclude that the IV affected the DV
 We retain the “null hypothesis” – that the differences
between groups are due to chance
– If chance is an unlikely explanation, we accept the alternative
hypothesis, that something other than chance brought about
the results
 Given our control in our study, the IV is the best culprit!
Inferential Statistics
 1). Significance Testing
As you should recall from Methods One, psychologists treat a
result as significant when its p value is less than .05
– That is, any event that would occur by chance alone 5 or fewer
times in 100 is “rare”. In essence, we “allow” up to a 5% risk
of error in our study and still conclude that the result is
significant.
 Psychologists thus deem this “rare” 5% or less
occurrence as acceptable for significance
Inferential Statistics
 1). Significance Testing
As an experimenter, you decide what cutoff point to use:
– p levels of .05 and .01 are common in psychology, but …
 some researchers say there is a significant trend when
p = .10
 Important update! Statistical software now lets us get
even more specific and report the exact p value, such as
p = .034 or p = .002
Inferential Statistics
 1). Significance Testing – Two Possibilities Exist:
A. Null hypothesis (H0): This implies that the mean scores from
the treatment and control groups were drawn from the same
population, or (µt = µc)
– If we retain the null hypothesis, differences between the
different study groups are due to chance (and not to the IV)

B. Alternative hypothesis (H1): The mean scores were drawn
from different populations (Treatment ≠ Control), or (µt ≠ µc)
Inferential Statistics
 1). Significance Testing
Inferential statistics help determine if the difference between
the groups is significant
– If you reject the null hypothesis, you are saying that your
treatment had an effect
– If you retain the null hypothesis, you are saying that you did
not find evidence that your treatment had an effect

Time to consider significance testing in light of the t-Test


Inferential Statistics
 2). The t-Test
We use a t-Test in studies that have one independent variable
with two levels and one dependent variable. We compare the
two group means to see if they differ significantly
– As you saw in Salkind (Chapter 11: Tea for Two) we can
look at two groups and compare them using an independent
samples t-Test design
 Smith and Davis similarly refer to the independent
samples t-Test (which we will cover in detail in Chapter
10 in a few weeks), but let’s take a brief look at it now
Inferential Statistics
 2). The t-Test
To make this lecture somewhat easier, I am going to follow a
study your textbook talks about, so make sure you read over
chapter nine on inferential statistics, particularly the study
on sloppy versus dressy clothes
[Image: Sloppy vs. Dressy]
Inferential Statistics
 2). The t-Test
Imagine you design a study to see if salesclerks are quicker to
help customers in dressy clothes compared to shabby clothes
– You enlist a confederate to dress shabbily in some cases and
dressy in others (your IV is the manner of dress)
– Using random assignment to determine state of dress, she
approaches 8 clerks while dressed sloppily and 8 while dressed
in dressy clothes
 8 participants per condition, thus N = 16
– Your dependent variable is how long (in seconds) it takes for a
salesperson to approach and help the customer
Your Data for this Study – In Seconds
Group A – Dressy Clothes      Group B – Sloppy Clothes
37                            50
38                            46
44                            62
47                            52
49                            74
49                            69
54                            77
69                            76
∑X = 387                      ∑Y = 506
Mean = 48.38 (seconds)        Mean = 63.25 (seconds)
Inferential Statistics
 2). The t-Test
Remember this formula?

X̄1 is the mean for Group 1
X̄2 is the mean for Group 2
n1 is the number of participants in Group 1
n2 is the number of participants in Group 2
s1² is the variance for Group 1
s2² is the variance for Group 2
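The formula image itself did not survive on this slide. A standard independent-samples t formula that uses exactly the symbols listed above (and pools the two group variances) is sketched here. Treat it as an assumed reconstruction, not necessarily the exact layout of the original slide.

```latex
t = \frac{\bar{X}_1 - \bar{X}_2}
         {\sqrt{\left[\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}\right]
                \left[\dfrac{n_1 + n_2}{n_1 n_2}\right]}}
```

The top of the fraction is the difference between the two group means; the bottom is an estimate of how much that difference would bounce around by chance alone.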
Inferential Statistics
 2). The t-Test
We are not going to go through these calculations by hand (we
already did that in Methods One!), but we find: t = -2.61

Don’t worry about our negative t-Test value. The decision to go
with “sloppy minus dressy” or “dressy minus sloppy” is arbitrary

We need one more piece of information to see if groups differ,
the degrees of freedom (or n – 1 for each of our two samples)
Inferential Statistics
 2). The t-Test
df = (N1 – 1) + (N2 – 1)
df = (8 – 1) + (8 – 1)
df = 7 + 7
df = 14
t = -2.61
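If you would rather let the computer do the arithmetic, here is a minimal Python sketch of a pooled-variance independent-samples t-Test. The function name independent_t is my own, and the sloppy-clothes scores are entered so that they match the reported total (506) and mean (63.25).

```python
import math
import statistics

# Seconds until a salesclerk approached the customer.
dressy = [37, 38, 44, 47, 49, 49, 54, 69]
sloppy = [50, 46, 62, 52, 74, 69, 77, 76]

def independent_t(group1, group2):
    """Pooled-variance t for two independent groups; returns (t, df)."""
    n1, n2 = len(group1), len(group2)
    mean1, mean2 = statistics.mean(group1), statistics.mean(group2)
    var1, var2 = statistics.variance(group1), statistics.variance(group2)
    df = (n1 - 1) + (n2 - 1)
    pooled_variance = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df

t_value, df = independent_t(dressy, sloppy)
print(round(t_value, 2), df)
```

Swapping the order of the two groups flips the sign of t but not its size, which is why the negative value is nothing to worry about.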

Consider Appendix A, Table A (in your Smith and Davis book). Is it
significant? Let’s find out. For now, just consider alpha (or α) levels
for a two-tailed t-Test …
TABLE A-1 The t
Distribution*

Recall our t value = -2.61, df = 14


Inferential Statistics
 2). The t-Test
df = 14 t = -2.61
 Is it significant?
Look for the df = 14 row for the two-tailed t-Test, and scan
across looking at the t-values. Ignoring the sign, our t of 2.61 is
significant at the .05 level (it exceeds 2.145) but not quite
significant at the .02 level (it falls short of 2.624). Thus we
will stick with the .05 level
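The table lookup boils down to one comparison. This Python sketch hard-codes the two critical values quoted on this slide (df = 14, two-tailed); the dictionary and function names are my own.

```python
# Two-tailed critical values of t for df = 14, as quoted on this slide.
CRITICAL_T_DF14_TWO_TAILED = {0.05: 2.145, 0.02: 2.624}

def is_significant(t_value, alpha):
    """Compare the size of t (ignoring its sign) with the tabled value."""
    return abs(t_value) > CRITICAL_T_DF14_TWO_TAILED[alpha]

print(is_significant(-2.61, 0.05))
print(is_significant(-2.61, 0.02))
```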
Inferential Statistics
 2). The t-Test
df = 14 t = -2.61
 Is it significant?
So we conclude that salespeople take longer to help people
dressed in shabby clothes (M = 63.25 sec) than people dressed
in dressy clothes (M = 48.38 sec), t(14) = 2.61, p < .05

So why did we use a two-tailed t-Test rather than a one-tailed
t-Test? Recall the non-directional vs. directional test differences
Inferential Statistics
 3). One vs. Two Tail Tests of Significance
When looking at significance, remember we must determine if
we are looking at a directional hypothesis vs. a non-directional
hypothesis
– Directional hypotheses focus on one specific outcome
 Group A will perform better than Group B OR
 Group A will perform worse than Group B
– Non-directional hypotheses allow for any outcome
 Group A may perform better OR worse than Group B
Inferential Statistics
 3). One vs. Two Tail Tests of Significance
In research, one-tailed tests assess directional hypotheses
– Remember, we “allow” up to 5% error in our study (and still
call it significant, p < .05).
– In one-tailed tests, all error occurs at one end of the curve
Inferential Statistics
 3). One vs. Two Tail Tests of Significance
It is often easier to find significance with one-tailed tests than
two-tailed tests since you merely want to see if one “treatment”
does better than another (with no care whether it does worse)
– If your treatment did worse, you wouldn’t use that treatment
anyway, so why test to see if it does worse?
– “If students take a new GRE course, they will score higher
than the national average on the GRE” is a unidirectional
hypothesis. To call the result significant, the probability of
getting such a high score by chance alone must be less than
5 in 100 (p < .05)
Inferential Statistics
 3). One vs. Two Tail Tests of Significance
Sometimes you want to conduct a two-tailed test to determine if
a treatment did better OR did worse than an alternative
– Two-tailed tests also allow up to 5% error, but the error is
split between both ends of the curve (2.5% and 2.5%)
Inferential Statistics
 3). One vs. Two Tail Tests of Significance
It is harder to find significance using a non-directional, two-tailed
test, as the “critical region” for finding significance is 2.5% at the
top end of the curve and 2.5% at the lower end of the curve
– “If students take a GRE course, they may score better OR
worse than the national average” is an example of this kind
of bidirectional hypothesis
 If students scored worse with the new course, then you
would probably recommend not using the new course!
 If they scored better, you would recommend the course
TABLE A-1 The t
Distribution*

A t-value of 1.80 is significant for a one-tailed but not a two-tailed test
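Here is that comparison as a small Python sketch. The caption does not state the degrees of freedom, so the sketch assumes df = 14 (as in the clothing study), where the .05 critical values are 1.761 for a one-tailed test and 2.145 for a two-tailed test. The function names are my own.

```python
# Critical values of t at alpha = .05 for df = 14 (an assumed df; the
# caption above does not state one).
ONE_TAILED_CRITICAL = 1.761   # the whole 5% critical region sits in one tail
TWO_TAILED_CRITICAL = 2.145   # 2.5% of the critical region in each tail

def significant_one_tailed(t_value, predicted_sign=1):
    """Significant only when t falls in the predicted direction."""
    return t_value * predicted_sign > ONE_TAILED_CRITICAL

def significant_two_tailed(t_value):
    """Significant when t is extreme in either direction."""
    return abs(t_value) > TWO_TAILED_CRITICAL

print(significant_one_tailed(1.80), significant_two_tailed(1.80))
```

Note that a one-tailed test can never flag a result in the direction you did not predict, no matter how large it is.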
Inferential Statistics
 3). One vs. Two Tail Tests of Significance
So, if you have a better shot at finding significance with a one-
tailed test, why bother with a two-tailed test?
– One-tailed tests give up a lot of valuable information about
the untested “other” end of the curve.
– If your hypothesis is that the new GRE program does better
than an alternative program, you cannot later conclude that
it did worse than the alternative program (all you can say is
that it did no better than the alternative)
 Thus you reject the alternative hypothesis and retain the
null hypothesis
– Often, psychologists do not know what the outcome will be
(which is why they did the study!), so a two-tailed test lets
them fully test all potential study outcomes
 A clinician testing a new ADHD therapy, for example,
may want to know if the new therapy helps the patient
while also knowing if it harms the patient. She would thus
need to run a two-tailed test
Inferential Statistics
 3). One vs. Two Tail Tests of Significance
Choosing a test
– It is highly recommended that researchers pick which test
they will be using BEFORE data collection begins
 “Fraud” issues

– Use the more conservative two-tailed test unless you have a
really good reason to use a one-tailed test
 If a new GRE course is very expensive, you may only be
interested if it beats the other class. A one-tailed test is fine
Inferential Statistics
 4). The Logic of Significance Testing
Imagine you take the GREs twice. Will the scores be identical?
– You are the same person, so the “population” is the same. Yet your
scores might fluctuate a little bit, just by chance.
 At Time 1, you could get 1140; at Time 2, you could get 1160
 The differences are most likely due to chance (error)

– But what if Time 1 you got 1140 and Time 2 you got 1340?
 Something other than chance is probably going on here, right?
Like what?
Inferential Statistics
 4). The Logic of Significance Testing
The logic behind significance testing is that we want to see if
differences between our control group and our experimental group are
due to chance or due to the addition of our IV

– We first must determine if the samples we drew from the population
are similar to one another (with only “natural” fluctuations)

– That is …
Inferential Statistics
 4). The Logic of Significance Testing
You can tell if something is statistically significant by looking at
the probability that two different samples drawn from the same
population would differ by the observed amounts (means)
– If the chance that they would differ is very small (that is, it is
unlikely that two samples this different could be drawn from
the same population), then you have statistical significance
– To put it another way, two groups drawn from the same
population will act a lot alike, but there will be some natural
variation. However, if you give a treatment to only one of the
groups, variation will emerge beyond “natural” fluctuations
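You can watch this logic play out with a small simulation. The Python sketch below is only an illustration under made-up assumptions: a normal population with a mean of 50 and a standard deviation of 10, two groups of 8 (so df = 14 and a two-tailed .05 critical value of 2.145), and a hypothetical treatment that adds 15 points.

```python
import math
import random
import statistics

def pooled_t(group1, group2):
    """Pooled-variance t for two independent groups."""
    n1, n2 = len(group1), len(group2)
    pooled = ((n1 - 1) * statistics.variance(group1)
              + (n2 - 1) * statistics.variance(group2)) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled * (1 / n1 + 1 / n2))
    return (statistics.mean(group1) - statistics.mean(group2)) / standard_error

def rejection_rate(treatment_effect, trials=2000, n=8, seed=1):
    """Share of simulated studies in which |t| exceeds 2.145."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        control = [rng.gauss(50, 10) for _ in range(n)]
        treated = [rng.gauss(50 + treatment_effect, 10) for _ in range(n)]
        if abs(pooled_t(control, treated)) > 2.145:
            rejections += 1
    return rejections / trials

same_population = rejection_rate(treatment_effect=0)      # natural variation only
shifted_population = rejection_rate(treatment_effect=15)  # treatment adds 15 points
print(same_population, shifted_population)
```

When both groups come from the same population, only about 5 studies in 100 cross the cutoff by chance. When one group gets the treatment, most studies do.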
Inferential Statistics
 4). The Logic of Significance Testing
Suppose we had a “population” of Homer Simpsons
Inferential Statistics
 4). The Logic of Significance Testing
If we drew two Homer Simpsons at random, they should look a
lot alike (although probably not identical – remember, there will
be some natural variation among population members, so the
sample members will also have some natural variation). Or …
Inferential Statistics
 4). The Logic of Significance Testing
The Population

The Sample
– Sample “features” fall within the
range of population “features”
– Any differences in the sample
members (1 & 2) are by chance
Inferential Statistics
 4). The Logic of Significance Testing

Now …

The Sample
– Expose them to the IV (control
vs. treatment groups)
– The IV should create different
outcomes based on IV features
Inferential Statistics
 4). The Logic of Significance Testing
The Population
– Population should reflect the
same differences IF exposed
to the IV as well
The Sample
– Expose them to the IV (control
vs. treatment groups)
– The IV should create different
outcomes based on IV features
Inferential Statistics
 4). The Logic of Significance Testing
The point here is that we try to generalize from the sample back
to the population (increasing our external validity)

Unfortunately, though, there are some errors that may crop up in
significance testing …
Inferential Statistics
 5). When Statistics go Astray: Type I and Type II Errors
Conclusions with Statistics
– A. Sometimes the null hypothesis is true (your samples do
not differ significantly, thus your experiment failed) so you
retain the null hypothesis
 This can be very frustrating, especially if you want to
show support for your alternative hypothesis
Inferential Statistics
 5). When Statistics go Astray: Type I and Type II Errors
Conclusions with Statistics
– B. Sometimes the null hypothesis is false (the samples do
differ, and you succeeded!), and you can reject the null
hypothesis
 This is what we usually want in research,
letting us do our happy dance!
Inferential Statistics
 5). When Statistics go Astray: Type I and Type II Errors
Conclusions with Statistics
– C. Our third type of conclusion is problematic. Sometimes
you make a mistake in your result interpretation, and
conclude either 1) that an effect occurred when it really did
not OR 2) that an effect did not occur when it really did
Inferential Statistics
 5). When Statistics go Astray: Type I and Type II Errors
This gives us two correct conclusions
– We reject the null hypothesis when we should reject
– We retain the null hypothesis when we should retain

It also gives us two incorrect conclusions


– We reject the null hypothesis when we should retain
 Type I Error (a “false-alarm”)
– We retain the null hypothesis when we should reject
 Type II Error (a “miss”)
Inferential Statistics
 5). When Statistics go Astray: Type I and Type II Errors

Ideally, you should try to limit both errors. But realistically,
decreasing one may actually increase the other
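The four possible outcomes (two correct conclusions, two errors) fit in a tiny lookup. This Python sketch is just a study aid; the function name classify_decision is my own, and in real research you never actually know whether the null hypothesis is true.

```python
def classify_decision(null_is_true, rejected_null):
    """Label the outcome of a significance test, given the (normally
    unknowable) truth about the null hypothesis."""
    if rejected_null and null_is_true:
        return "Type I error (false alarm)"
    if not rejected_null and not null_is_true:
        return "Type II error (miss)"
    return "correct decision"

for null_is_true in (True, False):
    for rejected_null in (True, False):
        print(null_is_true, rejected_null,
              classify_decision(null_is_true, rejected_null))
```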
Inferential Statistics
 5). When Statistics go Astray: Type I and Type II Errors
We use our alpha value set at .05 because it places these
errors within acceptable levels
– We don’t demand a pure outcome by saying that we must
be 100% certain that our alternative hypothesis is correct
 This helps us avoid a Type II Error: saying the null
hypothesis is true (i.e. that our research failed) when it
is not true
– We do not say there is significance when there really isn’t
 Allowing only 5% error decreases our chances of saying
that non-significant results are significant when they are
not actually significant (which would be a Type I Error)
Pop Quiz: Pre-Class Quiz Question #20
 I think pop quizzes improve student comprehension, so I include
them in powerpoints. I find they aren’t working out very well during
lectures, where I expected significant improvement, so I
discontinue them. However, pop quizzes actually help students,
just not in class lectures, and not to the point that I expected. What
kind of error have I made in dropping the pop quizzes?
A). No Error
B). Type I Error
C). Type II Error
Inferential Statistics
 6). Effect Size
Statistical testing only tells us that there was an effect. It does
not really tell us much about the magnitude of that effect.
– Many journals are thus now focusing on the effect size – the
magnitude or size of the experimental treatment

– The effect size is becoming just as important (if not more so)
as the significance testing
Inferential Statistics
 6). Effect Size
Statistical testing only tells us that there was an effect. It does
not really tell us much about the magnitude of that effect.
– Effect sizes can be calculated using Cohen’s d (for
two-sample t-tests) or Pearson’s r

– Of course, you learned all about this in Methods One, so
reread Salkind (Chapter 11: Tea For Two) for a reminder!
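As a reminder of what Cohen's d looks like in practice, here is a minimal Python sketch for the clothing study. It assumes the pooled-standard-deviation version of d, and the sloppy-clothes scores are entered so that they match the reported total (506) and mean (63.25). The function name cohens_d is my own.

```python
import math
import statistics

# Seconds until a salesclerk approached the customer.
dressy = [37, 38, 44, 47, 49, 49, 54, 69]
sloppy = [50, 46, 62, 52, 74, 69, 77, 76]

def cohens_d(group1, group2):
    """Difference between the means divided by the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    pooled_variance = ((n1 - 1) * statistics.variance(group1)
                       + (n2 - 1) * statistics.variance(group2)) / (n1 + n2 - 2)
    return ((statistics.mean(group1) - statistics.mean(group2))
            / math.sqrt(pooled_variance))

d = cohens_d(sloppy, dressy)
print(round(d, 2))
```

By Cohen's commonly cited benchmarks, a d above 0.8 counts as a large effect, so the clothing difference is not just significant but sizable.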
Part Five

An Eye Toward The Future


An Eye Toward The Future
 Here is your last Pause-Problem, #4 (Pop Quiz)
Yup, this slide again!

For your last Pause-Problem, I want YOU to write a multiple
choice pop-quiz question based on the content of this chapter. I
might use your question on a future pop quiz or actual course
exam (though not this semester), so make it good! Make sure
to include your correct answer and up to five possible answers!
An Eye Toward The Future
 An Eye Toward The Future
This chapter gave you a quick review of material you (should
have) learned about last semester in Research Methods and
Design One, but pay attention to the new information presented
here, like the box plots and NOIR

Next up, we are going to head into Smith and Davis Chapter
10, where we will go into a bit more detail about the t-Test (both
dependent and independent).
