
Eric Chang

4/17/15
IEMS 303 HOMEWORK 2
1.

How many motor vehicle thefts are recorded in the dataset? How many variables are in the
dataset? (Beat is the smallest regional division defined by the Chicago Police Department. District is a
larger regional division. Each district contains several beats.)

The dataset records 191,641 motor vehicle thefts and contains 13 variables.
2.

Which month sees the fewest motor vehicle thefts? Which weekday sees the most motor vehicle
thefts?

February sees the fewest MVTs while Friday sees the most MVTs.
3.

For what proportion of motor vehicle thefts in 2001 was an arrest made? What about 2007 and
2012? Can you provide any reason for the trend?

In 2001, 10.4% of MVTs resulted in an arrest. In 2007, 8.4% of MVTs resulted in an arrest. In
2012, 3.9% of MVTs resulted in an arrest. This could be due to the increasing efficacy of police
activity lowering the crime rate over time.
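
The proportion for each year is the number of thefts with an arrest divided by the number of thefts in that year. A minimal Python sketch of that calculation follows, assuming each record has been reduced to a (year, arrested) pair. The `records` list is made-up toy data, not the Chicago dataset, and `arrest_rate_by_year` is a hypothetical helper name.

```python
from collections import defaultdict

def arrest_rate_by_year(records):
    """Return {year: fraction of records in that year with an arrest}."""
    totals = defaultdict(int)
    arrests = defaultdict(int)
    for year, arrested in records:
        totals[year] += 1
        if arrested:
            arrests[year] += 1
    return {year: arrests[year] / totals[year] for year in totals}

# Made-up toy records (year, arrest made?), for illustration only.
records = [
    (2001, True), (2001, False), (2001, False), (2001, False),
    (2012, True), (2012, False), (2012, False), (2012, False), (2012, False),
]
rates = arrest_rate_by_year(records)
```

With the real data, the same grouping would be applied to the year and arrest columns of the dataset.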
4.

What are the top five locations for motor vehicle thefts, excluding the Other category?

The top 5 locations excluding Other are: Street, Parking Lot/Garage (Non-residential), Alley,
Gas Station, and Driveway (Residential).
5.

On which day of the week do the most motor vehicle thefts at gas stations happen? What about
residential driveways?

Most motor vehicle thefts at gas stations occur on Saturdays. Most motor
vehicle thefts on residential driveways occur on Thursdays.

Section 1.2 Qn 22:


From the histogram, it seems that the runners tend to slow down near the end of the race as a vast
majority of the values in the graph are positive. It is an interesting feature that the histogram tapers off
slowly to the right (high time difference) but climbs more rapidly at the left (low time difference). A typical
value for the time difference would be 100. Only a very small portion of the runners ran the later part of the race faster (a
negative value), perhaps about 2-3%.
Section 1.2 Qn 25 (mark frequency on the vertical axis):

Qn 25 (cont.):
The transformation makes the data look less
skewed and more uniformly distributed.

Section 1.3 Qn 33:


a)

Mean=640.5
Median=582.5

b)

Mean=610.5
Median=582.5
The mean decreases by 30. The median does not change.
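
To see why the mean moves and the median does not, here is a minimal Python sketch. The ten values are made up for illustration (they are not the textbook data), and it assumes part (b) amounts to lowering the largest observation by 300, which is what a drop of 30 in the mean of 10 values would imply.

```python
from statistics import mean, median

# Hypothetical sample of 10 observations (not the textbook data).
before = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1500]

# Assumed change: the largest observation is lowered by 300.
after = before[:-1] + [before[-1] - 300]

mean_shift = mean(before) - mean(after)      # 300 / 10
median_shift = median(before) - median(after)  # middle values are untouched
```

The mean shifts by the change divided by n, while the median depends only on the two middle order statistics, which the change does not reach.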

c) Eliminate 10/5 = 2 observations from each side.

20% trimmed mean=591.17

d) Need to perform linear interpolation between 10% and 20% trimmed mean since 10*.15 is not an
integer.

tMean_10 = 596.25

tMean_20 = 591.17

tMean_15 = (596.25 + 591.17) / 2 = 593.71
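
The interpolation step can be sketched in Python as follows. `trimmed_mean` and `interpolate` are hypothetical helper names, and the short list passed to `trimmed_mean` is toy data. Only the two trimmed means 596.25 and 591.17 come from the answer above.

```python
def trimmed_mean(values, k):
    """Mean after dropping the k smallest and k largest values."""
    ordered = sorted(values)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

def interpolate(x, x0, y0, x1, y1):
    """Linear interpolation of y at x between (x0, y0) and (x1, y1)."""
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

# Toy check of the trimming itself: drop 1 value from each end.
toy = trimmed_mean([1, 2, 3, 4, 100], 1)

# Trimmed means reported above for 10% and 20% trimming.
t10, t20 = 596.25, 591.17

# 15% lies halfway between 10% and 20%, so the result is the midpoint.
t15 = interpolate(15, 10, t10, 20, t20)
```

Because 15% is exactly halfway between the two trimming percentages, the interpolated value equals the simple average of the two trimmed means.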

Section 1.3 Qn 43:


Only medians and trimmed means can be calculated; the 100+ value is not numerical and cannot be
factored in.

Median=68
tMean_20 = 66.2
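
A small Python sketch of why a censored "100+" entry blocks the mean but not the median or a trimmed mean. The data below are hypothetical, and `math.inf` is used only as a stand-in for the unknown value above 100.

```python
import math
from statistics import median

def trimmed_mean(values, k):
    """Mean after dropping the k smallest and k largest values."""
    ordered = sorted(values)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

# Hypothetical sample; math.inf stands in for the censored "100+" entry.
data = [48, 55, 62, 66, 68, 70, 73, 79, 85, math.inf]

med = median(data)                  # uses only the two middle values
tm20 = trimmed_mean(data, 2)        # the censored value is trimmed away
plain_mean = sum(data) / len(data)  # not finite: depends on the unknown value
```

The median and the trimmed mean give the same result whatever the censored value actually is, as long as it stays among the largest observations.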

Section 1.4 Qn 51:


a)

s^2 = 1264.77
s = 35.6

b) Multiply the variance by (1/60)^2 and the standard deviation by 1/60:

s^2 = .351
s = .592
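
The unit conversion follows from the rule that rescaling every observation by a constant c rescales the variance by c^2 and the standard deviation by |c|. A minimal Python check, using made-up times in minutes rather than the textbook data:

```python
import math
from statistics import stdev, variance

# Hypothetical times in minutes (not the textbook data).
minutes = [30, 60, 90, 120, 150]
hours = [m / 60 for m in minutes]

# Variance scales by (1/60)^2, standard deviation by 1/60.
var_ok = math.isclose(variance(hours), variance(minutes) / 60**2)
sd_ok = math.isclose(stdev(hours), stdev(minutes) / 60)

# Applying the same rule to the variance reported in part (a).
var_hours = round(1264.77 / 60**2, 3)
```

Here `var_hours` reproduces the .351 reported for the variance in hours.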

Section 1.5 Qn 53:


a)

The mean and median of expense ratio are greater for Growth funds than for Balanced funds, but
variance is smaller for Growth than Balanced.
b) Construct a comparative boxplot for the two types of funds, and comment on interesting features.

Typical values for both types of funds are very close. However, the variability is
larger for Balanced (Bl) funds than for Growth (Gr) funds.
