
Course Name: Math I

Unit # 3

Unit Title: Descriptive Statistics

Enduring understanding (Big Idea): Students will understand that data can be organized, modeled, and analyzed
with various representations, chosen to best fit real-world situations based on the characteristics of the data.
Essential Questions:
1. How can collecting and analyzing data help you make decisions or predictions?
2. How can you make and interpret different representations of data?
3. How can you use graphical representations to predict outcomes and make judgments?
4. How are residuals used to analyze the goodness of fit?
BY THE END OF THIS UNIT:

Students will be able to


S.ID.1:
I can represent data with dot plots.
I can represent data with histograms.
I can represent data with box plots.
S.ID.2:
I can use statistics appropriate to the shape of the data distribution to
describe a data set.
I can compare center (median, mean) and spread (interquartile range,
standard deviation) of two or more different data sets.
S.ID.3:
I can discuss the effects of extreme outliers on the shape, center and
spread of a data distribution.
S.ID.5:
I can organize categorical data for two categories in two-way
frequency tables.
I can interpret joint frequencies in the context of the data.
I can interpret marginal frequencies in the context of the data.
I can interpret conditional frequencies in the context of the data.
I can recognize possible associations and trends in data in a two-way
frequency table.
S.ID.6a:
I can represent data on a scatter plot.
I can describe the relationship between two variables in a scatter plot.
I can create a function (linear, quadratic, or exponential) that models
data on a scatter plot.
I can use a function to solve problems in the context of the data.
S.ID.6b:
I can calculate and plot residuals.
I can assess and analyze the fit of a function by using residuals.
S.ID.6c:
I can recognize when a data set suggests a linear association.
I can create a linear function that models data on a scatterplot.
S.ID.7:
I can interpret the slope (rate of change) of a linear model in the
context of the data.
I can interpret the intercept (constant term) of a linear model in the
context of the data.
S.ID.8:
I can compute (using technology) the correlation coefficient of a linear
fit.
I can interpret the correlation coefficient of a linear fit.
S.ID.9:
I can distinguish between correlation and causation.

Students will know

Vocabulary:
Quartile
Scatter plot
Correlation Coefficient
Rate of Change
Box Plots
Joint Frequencies
Conditional Relative Frequencies
Correlation
Intercepts
Measures of Central Tendency
Measures of Dispersion

Outlier
Histogram
Causation
Line of Best Fit
Goodness of Fit
Marginal Frequencies
Constant

Unit Resources
Chapter 12 project: Music To My Ears
Chapter 12 project (Algebra II textbook): Munching Microbes
Mathematical Practices in Focus:
1. Make sense of problems and persevere in solving them
2. Reason abstractly and quantitatively
3. Construct viable arguments and critique the reasoning of others
4. Model with mathematics
Released Test Questions: 29, 30, 31, 47, 48, 49, 50
CCSS-M Included:
S.ID.1, S.ID.2, S.ID.3, S.ID.5, S.ID.6, S.ID.7, S.ID.8, S.ID.9
Suggested Pacing:
10 days
Algebra I Project Binder: p. 377-421

Standards are listed in alphabetical/numerical order, not suggested teaching order.
PLCs must order the standards to form a reasonable unit for instructional purposes.


CORE CONTENT
Cluster Title: Summarize, represent, and interpret data on a single count or measurement variable
Standard S.ID.1: Represent data with plots on the real number line (dot plots, histograms, and box plots)
Concepts and Skills to Master:
Graph numerical data on a real number line using dot plots, histograms, and box plots
Describe and give a simple interpretation of a graphical representation of data
Determine which type of data plot would be most appropriate for a specific situation
SUPPORTS FOR TEACHERS
Critical Background Knowledge
Know how to compute a median
Find the lower extreme (minimum), upper extreme (maximum), and quartiles
Academic Vocabulary
Dot plot, histogram, box plot, quartiles, lower extreme (minimum), upper extreme (maximum), median, outlier
Suggested Instructional Strategies:
Gather or provide data and have students plot each type of graph.
Analyze the strengths and weaknesses inherent in each type of plot by comparing different plots of the same data.
Have students collect their own data and choose a graph to represent it.
Resources:
Textbook Correlation: 12-2, 12-4
MARS Concept Development Lesson (S.ID.1 through S.ID.4): Representing Data Using Frequency Graphs
MARS Concept Development Lesson (S.ID.1 through S.ID.4): Representing Data Using Box Plots

NCDPI Unpacking:
S.ID.1 Construct dot plots, histograms and box plots for
data on a real number line.
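For teachers who want to check a box plot built in class, the following is a minimal sketch of a five-number summary for the song-download data used in the sample assessment tasks of this section. It assumes the "median of each half" rule for quartiles (other textbooks and calculators may use slightly different rules) and the common 1.5 x IQR convention for flagging outliers; neither rule is prescribed by this unit plan, and the function names are illustrative only.

```python
from statistics import median

# Weekly song downloads from the skill-based task in this section.
songs = [10, 20, 12, 14, 12, 27, 88, 2, 7, 30,
         16, 16, 32, 15, 25, 15, 4, 0, 15, 6]

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]    # values below the median position
    upper = data[-half:]   # values above the median position
    return (data[0], median(lower), median(data), median(upper), data[-1])

def outliers(values):
    """Flag values more than 1.5 * IQR beyond the quartiles (assumed convention)."""
    _, q1, _, q3, _ = five_number_summary(values)
    fence = 1.5 * (q3 - q1)
    return [v for v in values if v < q1 - fence or v > q3 + fence]

print(five_number_summary(songs))  # (0, 8.5, 15.0, 22.5, 88)
print(outliers(songs))             # [88]
```

The five numbers give the ends of the whiskers, the box edges, and the median line of a box plot; the flagged value of 88 is the kind of point students should be asked to discuss.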
Sample Assessment Tasks
Skill-based task
1. The following data set shows the number of songs downloaded in one week by each student in Mrs. Jones's class: 10, 20, 12, 14,
12, 27, 88, 2, 7, 30, 16, 16, 32, 15, 25, 15, 4, 0, 15, 6.
2. Create a frequency distribution table and histogram for the following set of data:
Age (in months) of First Steps
13, 9, 12, 11, 10, 8.5, 14, 9, 12.5, 10, 13.5, 9.5, 6, 7.5, 15, 9, 8, 11.5, 10, 12, 10.5, 11, 13, 12.5

Problem Task
On the midterm math exam, students had the following scores:
95, 45, 37, 82, 90, 100, 91, 78, 67, 84, 85, 85, 82, 91, 92, 93, 92,
73, 84, 100, 59, 92, 77, 68, 88.
Choose and create a plot to represent the data.
What are the strengths and weaknesses of presenting this data in a certain type of plot for:
Students in a class?
Parents?
The school board?


Data sets: http://www.freestatistics.info

Algebra I Project Binder: 377-394


CORE CONTENT
Cluster Title: Summarize, represent, and interpret data on a single count or measurement variable
Standard S.ID.2: Use statistics appropriate to the shape of the data distribution to compare center (median, mean) and spread
(interquartile range, standard deviation) of two or more different data sets.
Concepts and Skills to Master:
Given two sets of data or two graphs, identify similarities and differences in shape, center and spread.
Compare data sets and be able to summarize the similarities and differences between the shape, and measures of centers and
spreads of the data sets.

SUPPORTS FOR TEACHERS


Critical Background Knowledge
Know how to compute the mean, median, interquartile range, and standard deviation by hand in simple cases and using technology with
larger data sets.
Create a graphical representation of a data set
Academic Vocabulary
Mean, median, interquartile range, standard deviation, center, spread, shape
Suggested Instructional Strategies:
Use technology to manipulate plots of data sets to explore how changing data affects the measures of center and spread.
Discuss what it means when related data sets have differing centers or spreads in relation to the context.
Resources:
Textbook Correlation: 12-3, CC-19, CB 12-3, 12-4
NCDPI Unpacking:
S-ID.2 Understand which measure of center and which measure of spread is
most appropriate to describe a given data set. The mean and standard deviation
are most commonly used to describe sets of data. However, if the distribution is
extremely skewed and/or has outliers, it is best to use the median and the
interquartile range to describe the distribution since these measures are not
sensitive to outliers. The work in this standard builds upon the following middle
school concepts: 7.SP.3,4.
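The point about outliers pulling the mean away from the median can be illustrated with Jenni's five nightly tip amounts from the sample assessment tasks in this section. This is only a sketch using Python's standard `statistics` module; the variable names are illustrative and not part of the unit plan.

```python
from statistics import mean, median

# Jenni's nightly tip amounts (in dollars) from the skill-based task.
amounts = [50, 45, 48, 125, 85]

mean_tip = mean(amounts)      # pulled upward by the $125 night
median_tip = median(amounts)  # middle value, not affected by how large $125 is

print(mean_tip)    # 70.6
print(median_tip)  # 50
```

The two results line up with the "$70" and "$50" averages quoted in the task, which gives students a concrete reason to ask which measure of center describes a typical night.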

Data sets: http://www.freestatistics.info
Algebra I Project Binder: 395-408

Sample Assessment Tasks
Skill-based task
1. The boxplots show the distribution of scores on a district writing test in two
fifth grade classes at a school. Compare the range and medians of
the scores from the two classes.
2. You are planning to take on a part-time job as a waiter at a local restaurant.
During your interview, the boss told you that their best waitress, Jenni, made
an average of $70 a night in tips last week. However, when you asked Jenni
about this, she said that she made an average of only $50 per night last
week. She provides you with a copy of her nightly tip amounts from last
week (see below).
a. Calculate the mean and the median tip amount.
b. Which value is Jenni's boss using to describe the average tip? Why do
you think he chose this value?
c. Which value is Jenni using? Why do you think she chose this value?
d. Which value best describes the typical amount of tips per night? Explain
why.
Day          Tip Amount
Sunday       $50
Monday       $45
Wednesday    $48
Friday       $125
Saturday     $85

Problem Task
1. Plot data based on populations of European countries. Plot
data based on populations of Asian countries. Compare and
discuss differences in center and spread.


2. Delia wanted to find the best type of fertilizer for her tomato
plants. She purchased three types of fertilizer and used each
on a set of seedlings. After 10 days, she measured the
heights (in cm) of each set of seedlings. The data she
collected is shown below. Construct box plots to analyze the
data. Write a brief description comparing the three types of
fertilizer. Which fertilizer do you recommend that Delia use?
Fertilizer A: 7.1, 5.0, 3.2, 5.5, 6.2, 6.3, 1.0, 4.5, 5.2, 4.6, 2.4, 3.8, 1.5, 6.9, 2.6
Fertilizer B: 11.0, 9.2, 5.6, 8.4, 7.2, 12.1, 10.5, 14.0, 15.3, 6.3, 8.7, 11.3, 17.0, 13.5, 14.2
Fertilizer C: 10.5, 11.8, 15.5, 14.7, 11.0, 10.8, 13.9, 12.7, 9.9, 10.3, 10.1, 15.8, 9.5, 13.2, 9.7


CORE CONTENT
Cluster Title: Summarize, represent, and interpret data on a single count or measurement variable
Standard S.ID.3: Interpret differences in shape, center and spread in the context of the data sets, accounting for possible effects of
extreme data points (outliers).
Concepts and Skills to Master:
Given two sets of data or two graphs, identify similarities and differences in shape, center and spread.
Interpret similarities and differences between the shape and measure of centers and spread of data sets.
State the effects of any existing outliers.

SUPPORTS FOR TEACHERS

Critical Background Knowledge


Know how to compute the mean, median, interquartile range, and standard deviation by hand in simple cases and using
technology with larger data sets.
Create a graphical representation of a data set.
Academic Vocabulary
Extreme data point (outliers), skewed, center, spread

Suggested Instructional Strategies:
Use data from multiple sources to interpret differences in shape, center and spread.
Use data that includes outliers and explore what happens when outliers are removed.
Discuss the effect of outliers on measures of center and spread and the effect on the shape.
Resources:
Textbook Correlation: 12-3
Data sets: http://www.freestatistics.info
Algebra I Project Binder: 409-421

NCDPI Unpacking:

S-ID.3 Understand and be able to use the context of the data to explain why its
distribution takes on a particular shape (e.g. are there real-life limits to the values of the
data that force skewness? are there outliers?)
S-ID.3 Understand that the higher the value of a measure of variability, the more spread
out the data set is.
S-ID.3 Explain the effect of any outliers on the shape, center, and spread of the data sets.
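As a rough illustration of the last point, the sketch below compares the Carmike Cinemas attendance figures from the sample assessment tasks in this section with and without the Day 4 value of 10. It assumes the population standard deviation (`pstdev`); a class that uses the sample standard deviation would get different numbers but the same pattern. The helper name `describe` is illustrative only.

```python
from statistics import mean, median, pstdev

# Daily attendance at Carmike Cinemas, Day 1 to Day 5, from the skill-based task.
carmike = [100, 87, 90, 10, 91]

def describe(values):
    """Return (mean, median, population standard deviation)."""
    return mean(values), median(values), pstdev(values)

with_outlier = describe(carmike)
without_outlier = describe([v for v in carmike if v != 10])

print("with outlier:   ", with_outlier[:2])     # (75.6, 90)
print("without outlier:", without_outlier[:2])  # (92, 90.5)
print("spread:", round(with_outlier[2], 1), "vs", round(without_outlier[2], 1))
```

Removing the single low day moves the mean by more than 16 while the median barely changes, and the standard deviation shrinks sharply, which is the behavior students are asked to explain.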

Sample Assessment Tasks
Skill-based task
1. The boxplots show the distribution of scores
on a district writing test in two fifth grade
classes at a school. Which class performed
better and why?
2. The table below shows the daily attendance at two movie theaters for 5 days.
Calculate the mean and median for each and answer the questions below.
          Carmike Cinemas   IMAX Theater 2
Day 1     100               72
Day 2     87                97
Day 3     90                70
Day 4     10                71
Day 5     91                100
Mean
Median
Which statistic, the mean or the median, would you use to describe the typical daily
attendance for the 5 days at Carmike Cinemas? Justify your answer.
3. On last week's math test, Mrs. Smith's class had an average of 83 points with a
standard deviation of 8 points. Mr. Tucker's class had an average of 78 points with a
standard deviation of 4 points. Which class was more consistent with their test
scores? How do you know?

Problem Task
1. Find two similar data sets A and B (use
textbook or internet resources). What changes
would need to be made to data set A to make it
look like the graph of set B?


2. Create a data set based on test scores that
illustrates the following:
a. A skewed left distribution
b. A skewed right distribution
c. A symmetrical distribution


CORE CONTENT
Cluster Title: Summarize, represent, and interpret data on two categorical and quantitative variables.
Standard S.ID.5: Summarize categorical data for two categories in two-way frequency tables. Interpret relative frequencies in the
context of the data (including joint, marginal, and conditional relative frequencies). Recognize possible associations and trends in the
data.
Concepts and Skills to Master:
Create a two-way frequency table showing the relationship between two categorical variables.
Find and interpret joint, marginal and conditional relative frequencies.
Analyze possible associations and trends in the data.

SUPPORTS FOR TEACHERS

Critical Background Knowledge


Present data in a frequency table
Academic Vocabulary
Categorical data, two-way frequency table, relative frequency, joint frequency, marginal frequency, conditional relative frequencies,
trends.

Suggested Instructional Strategies:
Use contextual situations to have students create a two-way frequency table showing the
relationship between two categorical variables such as height and weight or blood
pressure and incidence of heart disease.
Use technology to create two-way tables.
Compare various tables and discuss frequencies that are evident.

NCDPI Unpacking:
S-ID.5 Create a two-way frequency table from a set of data on two categorical variables.
S-ID.5 Calculate joint, marginal, and conditional relative frequencies and interpret in
context. Joint relative frequencies are compound probabilities of using AND to combine
one possible outcome of each categorical variable (P(A and B)). Marginal relative
frequencies are the probabilities for the outcomes of one of the two categorical variables in
a two-way table, without considering the other variable. Conditional relative frequencies
are the probabilities of one particular outcome of a categorical variable occurring, given
that one particular outcome of the other categorical variable has already occurred.
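The three kinds of relative frequency described above can be sketched with the Youth Soccer League counts from the sample assessment tasks in this section. The dictionary layout and function names are illustrative assumptions, not part of the unit plan; the counts themselves are taken from the task.

```python
# Youth Soccer League counts from the problem task: counts[gender][age group].
counts = {
    "M": {"3-5": 4, "6-8": 3, "9-11": 3, "12-14": 5, "15-17": 5},
    "F": {"3-5": 1, "6-8": 4, "9-11": 3, "12-14": 4, "15-17": 3},
}

# Grand total of all players in the table (35).
grand_total = sum(sum(row.values()) for row in counts.values())

def joint(gender, age):
    """P(gender and age): one cell divided by the grand total."""
    return counts[gender][age] / grand_total

def marginal_gender(gender):
    """P(gender): a row total divided by the grand total."""
    return sum(counts[gender].values()) / grand_total

def conditional_age_given_gender(age, gender):
    """P(age | gender): one cell divided by its row total."""
    return counts[gender][age] / sum(counts[gender].values())

print(round(joint("M", "9-11"), 3))                          # 0.086
print(round(marginal_gender("M"), 3))                        # 0.571
print(round(conditional_age_given_gender("15-17", "F"), 3))  # 0.2
```

The only difference between the three calculations is the denominator: the grand total for joint and marginal frequencies, and a single row total for the conditional frequency.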
Resources:
Textbook Correlation: CC-20
Data sets: http://www.freestatistics.info

Sample Assessment Tasks
Skill-based task
1. What is the joint frequency of students who have chores and a
curfew? Which marginal frequency is the largest?
               Curfew: YES   Curfew: NO   Total
Chores: YES    13            5            18
Chores: NO     12            3            15
Total          25            8
2. Laura collected information about the type of sports that students
in her school were involved in. She recorded her information in
the frequency table below.
          Soccer   Basketball   Tennis   No Sports
Male      30       25           15       45
Female    32       23           20       40
a. What percentage of females play tennis?
b. What percentage of males are not involved in any sport?
c. What percentage of students play soccer?
d. What percentage of students are male and play soccer?

Problem Task
1. Collect data that compares populations of countries with square
miles. What trends emerge when we compare living in geographically
large countries with those that are highly populated?
2. Use the frequency table below to answer the following questions.
Youth Soccer League
                    Age Group
Gender   3-5 yrs old   6-8 yrs old   9-11 yrs old   12-14 yrs old   15-17 yrs old   Total
M        4             3             3              5               5               20
F        1             4             3              4               3               15
Total    5             7             6              9               8               35
a.) What is the relative frequency of players who are male and 9-11
years old? (joint relative frequency)
b.) What is the percentage of female players that are 15-17 years old?
(conditional relative frequency)
c.) What percentage of league members are male? (marginal relative
frequency)

CORE CONTENT
Cluster Title: Summarize, represent, and interpret data on two categorical and quantitative variables.
Standard S.ID.6: Represent data on two quantitative variables on a scatter plot, and describe how the variables are
related.
a. Fit a function to the data; use functions fitted to data to solve problems in the context of the data. Use given
functions or choose a function suggested by the context. Emphasize linear and exponential models.
b. Informally assess the fit of a function by plotting and analyzing residuals.
c. Fit a linear function for scatter plots that suggest a linear association.
Concepts and Skills to Master:
Create a scatter plot of bivariate data and estimate a linear or exponential function that fits the data and use this
function to solve problems in the context of the data
Find residuals using technology and analyze their meaning.
Fit a linear function (trend line) to a scatter plot with and without technology
SUPPORTS FOR TEACHERS
Critical Background Knowledge
Plot data on a coordinate grid and graph a linear function
Recognize characteristics of linear and exponential functions
Write an equation of a line given two points.
Academic Vocabulary
Function, linear model, exponential model, bivariate, residuals, scatter plot, correlation
Suggested Instructional Strategies:
Create a scatter plot for the data, find a trend line and evaluate the fit by
analyzing residuals.
Resources:
Textbook Correlation: 5-7, 9-7, CB 9-7, CC-5
NCDPI Unpacking:
S.ID.6 Create a scatter plot from two quantitative variables.
S.ID.6 Describe the form, strength and direction of the relationship.
S-ID.6a Determine which type of function best models a set of data. Fit this type of function to the
data and interpret constants and coefficients in the context of the data (e.g. slope and y-intercept
of linear models, base/growth or decay rate and y-intercept of exponential models). Use the fitted
function to make predictions and solve problems in the context of the data.
S-ID.6b Calculate the residuals for the data points fitted to a function. A residual is the difference
between the actual y-value and the predicted y-value ($y - \hat{y}$), which is a measure of the error in
prediction. (Note: $\hat{y}$ is the symbol for the predicted y-value for a given x-value.) A residual is
represented on the graph of the data by the vertical distance between a data point and the graph
of the function.
S-ID.6b Create and analyze a residual plot. A residual plot is a graph of the x-values vs. their
corresponding residuals. (Note that some computer software programs plot $\hat{y}$ vs. residual instead
of x vs. residual. However, the interpretation of the residual plot remains the same.) If the residual
plot shows a balance between positive and negative residuals and a lack of a pattern, this
indicates that the model is a good fit. For more accurate predictions, the size of the residuals
should be small relative to the data.
S-ID.6c For data sets that appear to be linear, use algebraic methods and technology to fit a
linear function to the data. To develop the concept of LSRL, begin by finding the centroid $(\bar{x}, \bar{y})$
and selecting another point to fit a line through the center of the data. (Note: When describing a
set of one-variable data, the mean is the most common predictor of a value in that data set.
Therefore, the centroid is a logical choice for a point on the line of best fit because it uses the
average of the x-values and the average of the y-values.) Find the sum of the squared errors of
this line and compare to lines fitted to the same set of data (but a different second point) by
others. The Least Squares Regression Line is a line that goes through the centroid and
minimizes the sum of these squared errors.
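The centroid approach described above can be tried out with the age and daily energy data from the sample assessment tasks for this standard. The sketch below is one possible illustration, not a prescribed method: it builds a "first guess" line through the centroid and one data point, then compares its sum of squared errors with that of the least-squares line. The helper names are hypothetical, and the least-squares slope uses the standard formula (sum of products of deviations divided by sum of squared x-deviations).

```python
from statistics import mean

# (age in years, average daily energy requirement) from the skill-based task.
data = [(1, 1110), (2, 1300), (5, 1800), (11, 2500), (14, 2800), (17, 3000)]
xs = [x for x, _ in data]
ys = [y for _, y in data]
x_bar, y_bar = mean(xs), mean(ys)   # the centroid

def line_through_centroid(slope):
    """Return a prediction function for the line through the centroid."""
    return lambda x: y_bar + slope * (x - x_bar)

def residuals(predict):
    """Actual y minus predicted y for every data point."""
    return [y - predict(x) for x, y in data]

def sse(predict):
    """Sum of the squared residuals (squared errors)."""
    return sum(r ** 2 for r in residuals(predict))

# First guess: the line through the centroid and the data point (1, 1110).
guess_slope = (1110 - y_bar) / (1 - x_bar)
# Least-squares slope from the standard deviation-product formula.
ls_slope = (sum((x - x_bar) * (y - y_bar) for x, y in data)
            / sum((x - x_bar) ** 2 for x in xs))

guess = line_through_centroid(guess_slope)
lsrl = line_through_centroid(ls_slope)
print("SSE of first guess:", round(sse(guess)))
print("SSE of least-squares line:", round(sse(lsrl)))
```

Students can swap in a different second point for the first guess and confirm that no line through the centroid produces a smaller sum than the least-squares line.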


MARS Problem Solving Lesson: Devising a Measure for Correlation (S.ID.5 & 6)


Sample Assessment Tasks
Skill-based task
The following data shows the age and average daily
energy requirements for male children and teens: (1,
1110), (2, 1300), (5, 1800), (11, 2500), (14, 2800), (17,
3000). Create a graph and find a linear function to fit the
data. Using your function, what is the daily energy
requirement for a male 15 years old? Would your model
apply to an adult male? Explain.
Ex. Connie works for a telephone company. She calls
existing customers to sell them additional services for
their account. The table below shows how much Connie
earns for selling selected numbers of additional services.
Create a scatter plot of the number of services sold and
the daily pay she received.
Services Sold   10   20   30    40    50
Daily Pay       60   80   100   120   140
Describe, in context, the form, strength, and direction of
the scatterplot from the problem above.
Calculate the residuals from the plot above. What do they
represent? Are the points with negative residuals located
above or below the regression line?
What is the sum of the squared residuals of the linear
model that represents the situation described
above? Can you find a different line that gives a smaller
sum? Explain.

Problem Task
Collect data on forearm length and height in a class. Plot the
data and estimate a linear function for the data. Compare and
discuss different student representations of the data and
equations they discover. Could the equation(s) be used to
estimate the height for any person with a known forearm
length? Why or why not?
Below is the data for the 1919 season and World Series
batting averages for nine White Sox players.
Season Batting Average   World Series Batting Average
.319                     .226
.279                     .250
.275                     .192
.290                     .233
.351                     .375
.350                     .056
.256                     .080
.282                     .304
.296                     .324
a. Create a scatter plot for the data provided. Is there a linear
association? Explain.
b. What is the Least Squares Regression Line that models this
data?
c. How do you know this equation is the line of best fit to
model the data?


CORE CONTENT
Cluster Title: Interpret linear models
Standard S.ID.7: Interpret the slope (rate of change) and the intercept (constant term) of a linear model in the context of
the data.
Concepts and Skills to Master:
Explain what the slope means in the context of the situation.
Explain what y-intercept means in context of the data
SUPPORTS FOR TEACHERS
Critical Background Knowledge
Graph data in a scatter plot and determine a trend line.
Determine the slope of a line from any representation
Identify the y-intercept from any representation
Academic Vocabulary
Slope (rate of change), intercept, linear model
Suggested Instructional Strategies:
Find and graph data sets from the internet and
discuss the meaning of their slopes and intercepts
in context.
Resources:
Textbook Correlation: 5-7
Data sets: http://www.freestatistics.info

NCDPI Unpacking:
S-ID.7 Understand that the key feature of a linear function is a
constant rate of change. Interpret in the context of the data, i.e.
as x increases (or decreases) by one unit, y increases (or
decreases) by a fixed amount.
S-ID.7 Interpret the y-intercept in the context of the data, i.e. an
initial value or a one-time fixed amount.
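A small sketch of this interpretation, using Connie's pay table from the S.ID.6 sample tasks in this unit (services sold 10 to 50, daily pay 60 to 140). The variable names are illustrative; the slope and intercept are computed from the first and last table entries and then checked against every entry.

```python
# (services sold, daily pay in dollars) from Connie's table in this unit.
points = [(10, 60), (20, 80), (30, 100), (40, 120), (50, 140)]

(x1, y1), (x2, y2) = points[0], points[-1]
slope = (y2 - y1) / (x2 - x1)   # change in pay for each additional service sold
intercept = y1 - slope * x1     # pay on a day when no services are sold

# True only if every table entry lies on the same line (constant rate of change).
is_linear = all(y == intercept + slope * x for x, y in points)

print(f"pay = {intercept:.0f} + {slope:.0f} * services")  # pay = 40 + 2 * services
print("constant rate of change:", is_linear)              # constant rate of change: True
```

In context, the slope of 2 reads as "2 more dollars for each additional service sold" and the intercept of 40 reads as a fixed daily amount earned before any sales.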

Sample Assessment Tasks
Skill-based task
The equation $y = 40 + 2x$ represents a pay plan offered to
employees who collect credit card applications. What do
the numbers in the rule tell you about the relationship
between daily pay and the number of credit card
applications collected?

Problem Task
Create a poster of bivariate data with a linear relationship.
Describe for the class the meaning of the data, including the
meaning of the slope and intercept in the context of the
data.
Collect power bills and graph the cost of electricity
compared to the number of kilowatt hours used. Find a
function that models the data and tell what the intercept
and slope mean in the context of the problem.


CORE CONTENT
Cluster Title: Interpret linear models
Standard S.ID.8: Compute (using technology) and interpret the correlation coefficient of a linear fit.
Concepts and Skills to Master:
Compute the correlation coefficient of a set of linearly related data using technology.
Determine whether the correlation coefficient shows a weak positive, strong positive, weak negative, strong
negative, or no correlation.
SUPPORTS FOR TEACHERS
Critical Background Knowledge
Be able to use graphing technology
Academic Vocabulary
Correlation coefficient, linear fit, positive correlation, negative correlation, no correlation
Suggested Instructional Strategies:
Have students enter data into graphing technology,
calculate the regression equation, and interpret what the
correlation coefficient is telling about the data.
Resources:
Textbook Correlation: 5-7
Data sets: http://www.freestatistics.info
NCDPI Unpacking:
Understand that the correlation coefficient, r, is a measure of the strength
and direction of a linear relationship between two quantities in a set of data.
The magnitude (absolute value) of r indicates how closely the data points
fit a linear pattern. If $|r| = 1$, the points all fall on a line. The closer $|r|$ is to 1,
the stronger the correlation. The closer $|r|$ is to zero, the weaker the
correlation. The sign of r indicates the direction of the relationship:
positive or negative
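
Graphing calculators report r directly, but a short sketch can show what the technology is computing. The function below uses the standard definition of the Pearson correlation coefficient and applies it to the 1919 White Sox batting averages from the S.ID.6 problem task in this unit, assuming the two columns are paired player by player in the order listed. The function name is illustrative.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Correlation coefficient r for paired lists xs and ys."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# 1919 White Sox data (assumed paired in listed order).
season = [.319, .279, .275, .290, .351, .350, .256, .282, .296]
world_series = [.226, .250, .192, .233, .375, .056, .080, .304, .324]

r = pearson_r(season, world_series)
print(f"r = {r:.2f}")

# A perfectly linear table, such as Connie's pay data, gives r = 1.
print(pearson_r([10, 20, 30, 40, 50], [60, 80, 100, 120, 140]))
```

Comparing the printed r for the batting data with the value of 1 for a perfectly linear table gives students two reference points for judging strength.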

Sample Assessment Tasks
Skill-based task
The correlation coefficient of a given data set is 0.97.
List three specific things this tells you about the data.

Problem Task
1. Hypothesize the correlation between two sets of seemingly related
data. Gather data to support or refute your hypothesis.
2. A couple of friends decided to measure their compatibility by ranking
their favorite activities.
Activity                    Mary   Maria
Watching tv                 4      7
Listening to music          5      2
Reading                     2      4
Talking on the phone        7      3
Hanging out with friends    1      6
Shopping                    3      1
Exercise
a. Using technology, make a scatterplot for the two rankings.
b. Predict what the value of r is. Use the scatterplot to help explain
your answer.
c. Find the Least Squares Regression Line that models this set of
data.
d. Using technology, identify what the correlation coefficient is and
interpret what it means in the context of the data.


CORE CONTENT
Cluster Title: Interpret linear models
Standard S.ID.9: Distinguish between correlation and causation.
Concepts and Skills to Master:
Understand the difference between correlation and causation.
Understand and explain that a strong correlation does not mean causation.
SUPPORTS FOR TEACHERS
Critical Background Knowledge
Understand the meaning of correlation
Academic Vocabulary
correlation, causation
Suggested Instructional Strategies:
Discuss data that has correlation but no causation
(height vs. foot length).
Discuss data that has correlation and causation
(number of M&Ms in a cup vs. weight of the cup).
Resources:
Textbook Correlation: 5-7
MARS Problem Solving Lesson:
Interpreting Statistics: A Case of Muddying the Waters
(S.ID.5 through S.ID.9)
NCDPI Unpacking:

S-ID.9 Understand that just because two quantities have a strong
correlation, we cannot assume that the explanatory
(independent) variable causes a change in the response
(dependent) variable. The best method for establishing
causation is to conduct an experiment that carefully controls for
the effects of lurking variables. If this is not feasible or ethical,
causation can be established by a body of evidence collected
over time (e.g. smoking causes cancer).

Sample Assessment Tasks
Skill-based task
Give an example of a data set that has strong correlation
but no causation and describe why this is so. Give an
example of a data set that has both strong correlation and
causation and write a description of why this is so.
When you have an association between two variables, how
can you determine if the association is a result of a
cause-and-effect relationship?
There is a strong positive association between the number
of firefighters at a fire and the amount of damage.
John said, "This means that firefighters must be the cause
of the damage at a fire." Is John correct in his reasoning?
Explain why or why not.

Problem Task
Find media artifacts that make claims of causation and
evaluate them.
