
Project 1 - How Much Crime?

SAMPLE SOLUTIONS
Name: __________________
Section: _________________
Part 1 - Where Should I Move?

State 1 (home): Indiana
State 2: Ohio
State 3: Wisconsin

Crime 1: Burglary
Crime 2: Motor Vehicle Theft

Why did you choose the range you did?
I chose this range because it covers the most recent 15 years of data, and I wanted to see the
current trends for these states, since I care about what is happening now.
What considerations did you take into account when choosing this date range?
I felt 15 years gave a good range of information, and it was the most current data available on this
website.

Which of the three tables will give you the information you need?
The first selection, State by State and National Estimates, gives you one Excel file that covers as
many states as you want, as well as many different kinds of crimes.
The second selection, Data with One Variable, allows only one kind of crime but multiple
states. You would have to run it twice to get the data for two different crimes and then combine
the tables into one Excel file.
The third selection, One Year of Data, allows you to choose multiple states and crimes but only
for one year, so you would have to run it 15+ times to get all of the information and then combine
all the different tables.
Why did you choose this table?

I chose table 1, State by State and National Estimates, because it gives the information in one Excel file,
and I can rearrange it so that it is all in the same rows to see the data side by side (if I prefer it like
that).

Either selection 1 or 2 would be a sufficient choice because both have the information and
require only 1 or 2 queries. Selection 3 would not be a good choice because it requires a lot of
aggregation.

Crime 1 - Burglary

Statistical Measure    State 1 - Indiana    State 2 - Ohio    State 3 - Wisconsin
Mean                   45477.19             98991.63          26919.25
Median                 46011.5              99106.5           26982.5
Mode                   None                 None              None
Minimum                41108                87023             23854
Maximum                50571                112901            29740
Range                  9463                 25878             5886
Lower Quartile         42569.5              95055.25          26317.75
Upper Quartile         48279                103568            27899
Standard Deviation     3110.393             6934.258          1600.208
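
The measures in this table can be reproduced outside Excel. The following is a minimal sketch in Python, not the method the project requires. The yearly Wisconsin burglary counts are not listed in this document, so the sketch assumes they can be recovered from the Wisconsin standard scores given in the Bonus section by reversing the standard score (x = mean + z * sd) and rounding. The names `summarize` and `counts` are made up for illustration, and the "inclusive" quartile method is assumed to match the quartile function used in the spreadsheet.

```python
import statistics
from collections import Counter

# Wisconsin burglary mean, standard deviation, and standard scores for
# 1997-2012, copied from this worksheet.
MEAN, SD = 26919.25, 1600.208
Z_SCORES = [1.614634, 1.76274, -0.8038, -1.08502, 0.004843, 0.629137,
            -0.23325, -1.91553, -1.54433, 0.074209, 0.60664, 0.3373,
            -0.0664, -0.17701, 0.154824, 0.64101]

# Assumption: reverse the standard score and round to a whole number to
# approximate the yearly counts, which the worksheet does not list.
counts = [round(MEAN + z * SD) for z in Z_SCORES]


def summarize(values):
    """Return the worksheet's summary measures for a list of numbers."""
    most_common, freq = Counter(values).most_common(1)[0]
    # "inclusive" interpolates between data points, endpoints included.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # Report no mode when every value occurs only once.
        "mode": most_common if freq > 1 else None,
        "minimum": min(values),
        "maximum": max(values),
        "range": max(values) - min(values),
        "lower_quartile": q1,
        "upper_quartile": q3,
        # Sample standard deviation (divides by n - 1).
        "standard_deviation": statistics.stdev(values),
    }


summary = summarize(counts)
for name, value in summary.items():
    print(f"{name:>20}: {value}")
```

Under these assumptions the output should agree with the Wisconsin column above, which is a quick way to check a spreadsheet for typing mistakes.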

Crime 2 - Motor Vehicle Theft

Statistical Measure    State 1 - Indiana    State 2 - Ohio    State 3 - Wisconsin
Mean                   19210.25             34962.81          12212.44
Median                 20530.5              39109             13017.5
Mode                   #N/A                 #N/A              13458
Minimum                13500                19512             8152
Maximum                25099                45419             15640
Range                  11599                25907             7488
Lower Quartile         16828.75             27003.25          10763.5
Upper Quartile         21265                41660.5           14073.5
Standard Deviation     3486.082             9182.128          2542.16

What does this data tell us? Choose 3 variables that we need to consider when comparing states
that this data does not tell us and explain why we need to consider them.
Sample Responses:
This data simply tells you the number of reported crimes for each state. Variables to consider when comparing
states:
1) Size of state: states with more people will likely have more crimes. We cannot just compare the
reported number of crimes for a state without looking at its population.
2) Location of crimes (geographic factors): the data does not tell you where in the state the crimes occurred.
The state could have one large city where the majority of the crimes occurred and many smaller cities
without much crime.
3) Accuracy of data: if some of the information is missing for part of a year, a
mathematical estimate is used.
4) Accuracy of reporting: this is reported data, not necessarily all the crimes that may have
occurred. If an area has more sophisticated reporting or simply more people, it may have
more crimes reported, but this doesn't mean it actually has more crimes than another area with
less personnel or technology.

Part 2 - Tell Me More


What factors did you consider when making this decision?
I chose the state of Wisconsin because, relative to its population, it had the lowest rates of
burglary and motor vehicle theft.

Responses should include some mathematical reasoning for why they chose the state. They need
to take into consideration variables such as population size when comparing the data,
not simply the raw numbers.
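
One way to bring population into the comparison is to convert counts into a rate per 100,000 residents. The sketch below is only an illustration: the function name `rate_per_100k` is made up, and it pairs Wisconsin's 16-year mean counts with the 2012 population from the table at the end of this document, which is a rough pairing. Populations for Indiana and Ohio are not given here, so they would have to be looked up before the three states could be compared.

```python
def rate_per_100k(crime_count, population):
    """Return the number of crimes per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return crime_count / population * 100_000


# Values copied from this worksheet (Wisconsin only).
WI_POPULATION_2012 = 5_726_398
WI_MEAN_BURGLARIES = 26919.25
WI_MEAN_MV_THEFTS = 12212.44

# Rough pairing: a 1997-2012 mean count divided by a single year's population.
burglary_rate = rate_per_100k(WI_MEAN_BURGLARIES, WI_POPULATION_2012)
mv_theft_rate = rate_per_100k(WI_MEAN_MV_THEFTS, WI_POPULATION_2012)

print(f"Burglary: about {burglary_rate:.0f} per 100,000 residents")
print(f"Motor vehicle theft: about {mv_theft_rate:.0f} per 100,000 residents")
```

Computing the same rate for each state, year by year, would give numbers that can be compared fairly between a large state and a small one.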

Box Plot

[Figure: Box Plot - Crimes in WI, 1997-2012. Categories: Motor Vehicle Theft, Burglary. Axis: Number of crimes, 0 to 32,000.]

Scatter Plot

[Figure: Scatterplot - Crimes in WI, 1997-2012. Horizontal axis: Year, 1996 to 2014. Vertical axis: Crimes, 0 to 32,000. Series: Burglary, Motor vehicle theft.]

Histogram

[Figure: Histogram - Motor Vehicle Thefts in WI, 1997-2012. Horizontal axis: Crimes. Vertical axis: Frequency, 0 to 6.]

[Figure: Histogram - Burglary in WI, 1997-2012. Horizontal axis: Crimes. Vertical axis: Frequency, 0 to 12.]

What did you choose for your scatter plot axis and why?
The response should mention choosing an axis with increments that clearly show all the
data. With values in the thousands, the increment should be some multiple of 1,000.
What did you choose for your bins for your histogram and why?
The response should mention choosing a bin size where the majority of bins hold at least 1 value, but no
single bin holds all the data.
If you change the bin size, does this give a different representation of the data?
If the bin size were increased, there would be fewer bins with data in them, and the histogram
would lump all of the data together. If it were decreased, the histogram would end up showing every
value separately and not grouping the data at all.
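
The effect of bin size can be seen by counting the same values with several bin widths. The sketch below is an illustration under assumptions: the yearly Wisconsin motor vehicle theft counts are not listed in this document, so they are approximated from the standard scores in the Bonus section (x = mean + z * sd, rounded). The helper `bin_counts` and the three widths are made up for the example.

```python
from collections import Counter

# Wisconsin motor vehicle theft mean, standard deviation, and standard
# scores for 1997-2012, copied from this worksheet.
MEAN, SD = 12212.44, 2542.16
Z_SCORES = [1.348286, 0.785773, 0.631967, 0.953347, 0.987176, 0.489961,
            0.059225, -0.32981, 0.143406, 0.71418, 0.489961, -0.26412,
            -1.29041, -1.59724, -1.54964, -1.57206]

# Assumption: approximate the yearly counts by reversing the standard score.
counts = [round(MEAN + z * SD) for z in Z_SCORES]


def bin_counts(values, width):
    """Count values per bin; a bin starting at s covers s up to s + width."""
    return Counter((v // width) * width for v in values)


for width in (8000, 1000, 1):
    bins = bin_counts(counts, width)
    fullest = max(bins.values())
    print(f"width {width:>4}: {len(bins)} non-empty bins, "
          f"fullest bin holds {fullest} of {len(counts)} values")
```

A very wide bin puts all 16 years in a single bar, and a width of 1 gives nearly every year its own bar, so neither shows the shape of the data. A width in between is the useful choice.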
Does the distribution of your histogram look like a normal distribution?
This should mention that it is/is not a normal distribution.
Why or why not? What does this tell us?
The answer could mention symmetry, skewness, or the number of peaks. The learner could also
mention the amount of variation (how spread out the values are).
Part 3 - Conclusion
Using your statistical data and charts, make conclusions about the state you chose. Include any
observations about overall trends in the data, any skewness or variation in your graphs, and if
there are any outliers in your data. Explain whether you think the mean, median, or mode is the
best representation of the data you collected and any other considerations that this data does not
account for when considering the rate of crime in a state.
For this project, the scatter plot is likely to be the most useful graph for describing the trends
in crime. Students could talk about increases and decreases in crime, years when it peaked, and
comparisons between the kinds of crime. Students should mention the reason why there is not
likely to be a mode for this kind of data, and any difference between the mean and the median.
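
To support a statement about outliers, one common convention (not one this project prescribes) is to flag values more than 1.5 times the interquartile range beyond the quartiles. The sketch below applies that rule to the Wisconsin minimum, maximum, and quartiles from the Part 1 tables. The function name `iqr_fences` and the multiplier 1.5 are assumptions for illustration.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return (low, high) cutoffs; values outside them count as outliers."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Wisconsin values copied from the Part 1 tables.
WISCONSIN = {
    "Burglary": {"min": 23854, "q1": 26317.75, "q3": 27899, "max": 29740},
    "Motor vehicle theft": {"min": 8152, "q1": 10763.5, "q3": 14073.5,
                            "max": 15640},
}

flags = {}
for crime, s in WISCONSIN.items():
    low, high = iqr_fences(s["q1"], s["q3"])
    # Checking only the extremes shows whether each tail has any outlier.
    flags[crime] = {"low_outlier": s["min"] < low,
                    "high_outlier": s["max"] > high}
    print(f"{crime}: fences {low} to {high}, {flags[crime]}")
```

Under this rule the lowest Wisconsin burglary year falls below the lower cutoff, while motor vehicle theft has no outliers in either direction. A different rule could give a different answer, so the response should state which rule was used.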

BONUS
Calculate the standard score for the data for each year of your chosen state (HINT: see
p. 388 to determine how to do this in Excel).
What does this tell you about your state and the data?
Students should mention how the standard scores were distributed and what that
means.
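
The standard score measures how many standard deviations a value lies from the mean: z = (value - mean) / standard deviation. As a minimal sketch (the function name `standard_score` is made up, and this is not the Excel procedure from the textbook), the Wisconsin burglary maximum and minimum from Part 1 can be scored like this:

```python
def standard_score(value, mean, sd):
    """Return how many standard deviations value lies from the mean."""
    return (value - mean) / sd


# Wisconsin burglary figures copied from the Part 1 table.
WI_BURGLARY_MEAN = 26919.25
WI_BURGLARY_SD = 1600.208

z_max = standard_score(29740, WI_BURGLARY_MEAN, WI_BURGLARY_SD)
z_min = standard_score(23854, WI_BURGLARY_MEAN, WI_BURGLARY_SD)

print(f"maximum year: z = {z_max:.5f}")
print(f"minimum year: z = {z_min:.5f}")
```

The two results, about 1.76 and -1.92, match the largest and smallest burglary standard scores in the table below (1998 and 2004), so every year in this data set lies within two standard deviations of the mean.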

Would you expect this kind of data to have a normal distribution? Why or why not?
Most likely not, due to the nature of the data. Crime counts do not typically follow a normal
distribution the way IQ scores do.

Year    Population    Standard Score (Burglary)    Standard Score (MV Thefts)
1997    5170000       1.614634                     1.348286
1998    5224000       1.76274                      0.785773
1999    5250446       -0.8038                      0.631967
2000    5363675       -1.08502                     0.953347
2001    5405947       0.004843                     0.987176
2002    5439692       0.629137                     0.489961
2003    5474290       -0.23325                     0.059225
2004    5503533       -1.91553                     -0.32981
2005    5527644       -1.54433                     0.143406
2006    5556506       0.074209                     0.71418
2007    5601640       0.60664                      0.489961
2008    5627967       0.3373                       -0.26412
2009    5654774       -0.0664                      -1.29041
2010    5691659       -0.17701                     -1.59724
2011    5709843       0.154824                     -1.54964
2012    5726398       0.64101                      -1.57206