
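PAGE NO. 90 SUM 3.28

A) Prepare a stem and leaf plot
Stem | Leaf values (COL_LAC)
0 | 0 0 0 1 1 1 1 1 1 1 1 1 2 2 3 3 3 3 3 3 3 4 4 4 5 5 5 6 6 7 7 7 7 7 7 7 7 7 7 7 8 8 8 8 8 8 9 9 9 9
1 | 0 0 0 0 0 1 1 1 2 3 3 3 3 4 4 4 4 4 4 6 6 7 7 7 8 8
2 | 0 0 0 1 1 3 4 5 6 7 8 8 9 9
3 | 1 2 3 3 5 6 6 9
4 | 9 9

B) Prepare a frequency distribution and Histogram
bins Frequency
0-9 50
10-19 26
20-29 14
30-39 8
40-49 2

C) Describe the distribution, based on these displays.
With the given 100 data values for the Rose Bowl games played from 1902 to 2016, we have taken the number of bins as 5 and the interval between bins as 10, each bin having a lower limit and an upper limit, and have prepared the histogram. The frequencies fall from 50 in the 0-9 bin to 2 in the 40-49 bin, so the distribution is skewed right.

Numbers Stem rows Leaf values COL_LAC
0 0 0 1
0 0 0 2
0 0 0 3
1 0 1 4
1 0 1 5
1 0 1 6
1 0 1 7
1 0 1 8
1 0 1 9
1 0 1 10
1 0 1 11
1 0 1 12
2 0 2 13
2 0 2 14
3 0 3 15
3 0 3 16
3 0 3 17
3 0 3 18
3 0 3 19
3 0 3 20
3 0 3 21
4 0 4 22
4 0 4 23
4 0 4 24
5 0 5 25
5 0 5 26
5 0 5 27
6 0 6 28
6 0 6 29
7 0 7 30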
7 0 7 31
7 0 7 32
7 0 7 33
7 0 7 34
7 0 7 35
7 0 7 36
7 0 7 37
7 0 7 38
7 0 7 39
7 0 7 40
8 0 8 41
8 0 8 42
8 0 8 43
8 0 8 44
8 0 8 45
8 0 8 46
9 0 9 47
9 0 9 48
9 0 9 49
9 0 9 50
10 1 0 1
10 1 0 2
10 1 0 3
10 1 0 4
10 1 0 5
11 1 1 6
11 1 1 7
11 1 1 8
12 1 2 9
13 1 3 10
13 1 3 11
13 1 3 12
13 1 3 13
14 1 4 14
14 1 4 15
14 1 4 16
14 1 4 17
14 1 4 18
14 1 4 19
16 1 6 20
16 1 6 21
17 1 7 22
17 1 7 23
17 1 7 24
18 1 8 25
18 1 8 26
20 2 0 1
20 2 0 2
20 2 0 3
21 2 1 4
21 2 1 5
23 2 3 6
24 2 4 7
25 2 5 8
26 2 6 9
27 2 7 10
28 2 8 11
28 2 8 12
29 2 9 13
29 2 9 14
31 3 1 1
32 3 2 2
33 3 3 3
33 3 3 4
35 3 5 5
36 3 6 6
36 3 6 7
39 3 9 8
49 4 9 1
49 4 9 2

[Figure: HISTOGRAM of FREQUENCY by BINS (0-9, 10-19, 20-29, 30-39, 40-49), frequency axis 0 to 60]

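
The stem-and-leaf rows and the bin frequencies for SUM 3.28 can be rebuilt from the Numbers column. The following is a minimal Python sketch, assuming the 100 values are transcribed exactly as listed in that column; the function names `stem_and_leaf` and `bin_frequencies` are illustrative and are not part of the original workbook.

```python
from collections import defaultdict

# Values transcribed from the "Numbers" column (100 observations).
values = (
    [0] * 3 + [1] * 9 + [2] * 2 + [3] * 7 + [4] * 3 + [5] * 3 + [6] * 2
    + [7] * 11 + [8] * 6 + [9] * 4
    + [10] * 5 + [11] * 3 + [12] + [13] * 4 + [14] * 6 + [16] * 2
    + [17] * 3 + [18] * 2
    + [20] * 3 + [21] * 2 + [23, 24, 25, 26, 27] + [28] * 2 + [29] * 2
    + [31, 32, 33, 33, 35, 36, 36, 39]
    + [49, 49]
)


def stem_and_leaf(data):
    """Map each stem (tens digit) to its sorted list of leaves (units digit)."""
    rows = defaultdict(list)
    for v in sorted(data):
        stem, leaf = divmod(v, 10)
        rows[stem].append(leaf)
    return dict(rows)


def bin_frequencies(data, width=10):
    """Count the values falling in each bin lower..lower+width-1."""
    counts = defaultdict(int)
    for v in data:
        lower = (v // width) * width
        counts[f"{lower}-{lower + width - 1}"] += 1
    return dict(counts)


for stem, leaves in stem_and_leaf(values).items():
    print(stem, "|", " ".join(map(str, leaves)))
print(bin_frequencies(values))
```

Using integer division for the bin index keeps the bin limits (0-9, 10-19, and so on) aligned with the stems, which is why the stem counts and the bin frequencies agree.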
PAGE NO.90
SUM 3.29

A) Prepare a dot plot


B) Prepare a frequency distribution and histogram

bins frequency
0-4 45
5-9 8
10-14 3
15-19 1
20-24 1
25-29 2
Total 60

[Figure: HISTOGRAM of FREQUENCY by BINS (0-4, 5-9, 10-14, 15-19, 20-24, 25-29), frequency axis 0 to 50]


C) Describe the distribution, based on these displays.


With the given 60 data values for the calls initiated during July, we have taken the number of bins as 6 and the interval between bins as 5, each bin having a lower limit and an upper limit, and have prepared the histogram. The 0-4 bin holds 45 of the 60 values, so the distribution is strongly skewed right.

PAGE NO.90 SUM 3.30
Batting Averages for the 2006 New York Yankees
Let us multiply the averages by 1000.

Average *1000
0.343 343
0.285 285
0.29 290
0.342 342
0.277 277
0.28 280
0.253 253
0.281 281
0.24 240
0.239 239
0.33 330
0.302 302
0.298 298
0.212 212
0.207 207
0.256 256
0.228 228
0.24 240
0.143 143
0.167 167
0.3 300
0.417 417
0.25 250
0.167 167
0 0

A) Construct a frequency distribution. Explain how you chose the number of bins and bin limits.
bins Frequency
0-49 1
50-99 0
100-149 1
150-199 2
200-249 6
250-299 9
300-349 5
350-399 0
400-449 1
Total 25

After we multiplied the averages by 1000 we got whole-number data, which was helpful in choosing the bins and their limits. We have used 9 bins.

B) Make a Histogram and describe its appearance.
[Figure: HISTOGRAM of Frequency by Bins (0-49, 50-99, 100-149, 150-199, 200-249, 250-299, 300-349, 350-399, 400-449), frequency axis 0 to 10]
This histogram is moderately skewed left, where we can see a longer tail to the left.

C) Repeat using a different number of bins and different bin limits.
bins Frequency
0-99 1
100-199 3
200-299 15
300-399 5
400-499 1
Total 25

[Figure: HISTOGRAM of Frequency by Bins (0-99, 100-199, 200-299, 300-399, 400-499), frequency axis 0 to 16]
Moderately right skewed, with a longer tail to the right.

D) Did your visual impression of the data change when you changed the bins? Explain.
Yes. The first histogram was moderately skewed left, whereas the second one, after changing the number of bins and the bin limits, changed into a moderately right-skewed histogram. We could see a longer tail to the right here.
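
The two frequency distributions for SUM 3.30 can be checked by counting the scaled averages under each set of bin limits. This is a minimal Python sketch, assuming the 25 values of the *1000 column as listed; `frequency_table` is a hypothetical helper name, not something from the original workbook.

```python
# Batting averages already multiplied by 1000 (the "*1000" column).
scaled = [343, 285, 290, 342, 277, 280, 253, 281, 240, 239, 330, 302, 298,
          212, 207, 256, 228, 240, 143, 167, 300, 417, 250, 167, 0]


def frequency_table(values, width, upper):
    """Return counts for equal-width bins covering 0 .. upper - 1."""
    counts = [0] * (upper // width)
    for v in values:
        counts[v // width] += 1  # integer division gives the bin index
    return counts


nine_bins = frequency_table(scaled, width=50, upper=450)   # 0-49 ... 400-449
five_bins = frequency_table(scaled, width=100, upper=500)  # 0-99 ... 400-499
print(nine_bins)
print(five_bins)
```

Working with the whole-number values avoids floating-point edge cases at bin limits such as 0.250 or 0.300.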
PAGE NO.95 SUM 3.45

A) Use Excel to prepare an appropriate type of chart.

Airline Percentage
Alaska 4.5
American 18.4
Delta 16.9
Frontier 2
Hawaiian 1.7
Jet blue 5.4
Sky west 2.4
Southwest 18.3
Spirit 2.7
United 14.5
Others 13.2
Total 100

[Figure: pie chart "Percentage of Domestic Market Share of Airlines"; legend: Alaska, American, Delta, Frontier, Hawaiian, Jet blue, Sky west, Southwest, Spirit, United, Others]

B) Would more than one kind of display be acceptable? Why or Why not?
Representing the data in a single chart is more acceptable. There is no need to present the data in more than one type of chart.
The data here show the domestic market share of the ten largest U.S. airlines, and as it is simple data it can be represented in a pie chart for better understanding.
MINI CASE:4.1

3 10 15 15 20 20 20 22 23 25 26 26
30 30 35 35 36 39 40 40 40 40 47 50
50 50 50 53 55 60 60 60 67 75 78 86
90 96 100 100 100 100 100 103 105 118 125 125
130 131 139 140 145 150 150 153 153 156 160 163
170 176 185 198 200 200 200 220 232 237 252 259
260 268 270 279 295 309 345 350 366 375 431 433
450 450 474 484 495 553 600 720 777 855 960 987
1,020 1,050 1,200 1,341

Inference
1 We can see that the majority of the ATM deposits are small amounts ranging from 3 to 200.
2 The most frequently deposited amount is 100.
3 Amounts ranging from 200 to 1341 are very rarely deposited.
4 Amounts ranging from 200 to 1341 are mostly deposited only once.

Deposit Count
3 1
10 1
15 2
20 3
22 1
23 1
25 1
26 2
30 2
35 2
36 1
39 1
40 4
47 1
50 4
53 1
55 1
60 3
67 1
75 1
78 1
86 1
90 1
96 1
100 5
103 1
105 1
118 1
125 2
130 1
131 1
139 1
140 1
145 1
150 2
153 2
156 1
160 1
163 1
170 1
176 1
185 1
198 1
200 3
220 1
232 1
237 1
252 1
259 1
260 1
268 1
270 1
279 1
295 1
309 1
345 1
350 1
366 1
375 1
431 1
433 1
450 2
474 1
484 1
495 1
553 1
600 1
720 1
777 1
855 1
960 1
987 1
1020 1
1050 1
1200 1
1341 1

[Figure: chart of Count against ATM Deposit, horizontal axis 0 to 1500]
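
The points under Inference can be checked by tallying the deposit amounts. This is a minimal Python sketch, assuming the 100 deposits as listed in the data grid; the variable names are illustrative only.

```python
from collections import Counter

# The 100 ATM deposits, transcribed row by row from the data grid.
deposits = [
    3, 10, 15, 15, 20, 20, 20, 22, 23, 25, 26, 26,
    30, 30, 35, 35, 36, 39, 40, 40, 40, 40, 47, 50,
    50, 50, 50, 53, 55, 60, 60, 60, 67, 75, 78, 86,
    90, 96, 100, 100, 100, 100, 100, 103, 105, 118, 125, 125,
    130, 131, 139, 140, 145, 150, 150, 153, 153, 156, 160, 163,
    170, 176, 185, 198, 200, 200, 200, 220, 232, 237, 252, 259,
    260, 268, 270, 279, 295, 309, 345, 350, 366, 375, 431, 433,
    450, 450, 474, 484, 495, 553, 600, 720, 777, 855, 960, 987,
    1020, 1050, 1200, 1341,
]

counts = Counter(deposits)                            # deposit amount -> count
mode_value, mode_count = counts.most_common(1)[0]     # most frequent amount
up_to_200 = sum(1 for d in deposits if d <= 200)      # deposits of 200 or less
large_once = sum(1 for amount, n in counts.items() if amount > 200 and n == 1)

print(mode_value, mode_count)
print(up_to_200, "of", len(deposits), "deposits are 200 or less")
print(large_once, "distinct amounts above 200 occur exactly once")
```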
MINI CASE :4.3
Peak trough duration in months S&P loss(%)
Sep 1929 Jun-32 34 83.4
Jun 1946 Apr-47 11 21
Aug 1956 Feb-57 7 10.2
Aug 1957 Dec-57 5 15
Jan 1962 Jun-62 6 22.3
Feb 1966 Sep-66 8 15.6
Dec 1968 Jun-70 19 29.3
Jan 1973 Sep-74 21 42.6
Jan 1977 Feb-78 14 14.1
Dec 1980 Jul-82 20 16.9
Sep 1987 Nov-87 3 29.5
Jun 1990 Oct-90 5 14.7
Jul 1998 Aug-98 2 15.4
Sep 2000 Mar-03 31 42
Oct 2007 Mar-09 17 50.6

Inference
The investor has to wait about 14 months, as the statistics indicate the mean duration to be 13.5 months.
With the mean S&P loss of 28.17% also, we can infer that we have to wait more than 14 months when the loss is above 8.17%.

Statistic Duration S&P Loss %
Count 15 15
Mean 13.53 28.17
Median 11 21
Sample SD 9.96 19.58
Minimum 2 10.2
Maximum 34 83.4
Coeff. of variation 73.60% 69.50%
Mean absolute deviation 8.17 14.45
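
The S&P loss column of the statistics table can be reproduced with Python's standard `statistics` module. This is a minimal sketch, assuming the 15 loss percentages as listed in the table; it uses the sample standard deviation (n - 1 divisor) and takes the mean absolute deviation as the average distance from the mean, which is an assumption about how the workbook computed it.

```python
import statistics

# S&P loss (%) for the 15 peak-to-trough periods in the table.
losses = [83.4, 21, 10.2, 15, 22.3, 15.6, 29.3, 42.6,
          14.1, 16.9, 29.5, 14.7, 15.4, 42, 50.6]

mean = statistics.mean(losses)
median = statistics.median(losses)
sd = statistics.stdev(losses)        # sample SD (divides by n - 1)
cv = sd / mean * 100                 # coefficient of variation, in percent
mad = sum(abs(x - mean) for x in losses) / len(losses)  # mean absolute deviation

print(f"Count {len(losses)}")
print(f"Mean {mean:.2f}  Median {median}  Sample SD {sd:.2f}")
print(f"Min {min(losses)}  Max {max(losses)}")
print(f"Coeff. of variation {cv:.1f}%  Mean absolute deviation {mad:.2f}")
```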
MINI CASE 4.4
Year N Mean SD Coeff. Var Min Median Max
2004 37 125.4 22.89 18.3 87 121 173
2006 37 134.5 24.94 18.5 91 132 204
2008 36 125.2 19.7 15.8 87 121.5 167
2010 33 114.7 18.07 15.8 83 113 170
2012 34 105.5 19.62 18.6 73 106.5 151
2014 32 119.7 22.89 19.1 74 116 206
2016 33 114.8 25.97 22.6 83 112 216

Inference
From the year 2010 the number of defects per 100 vehicles has decreased on average (mean and median).
The coefficient of variation and the extremes (Min and Max) do not show strong differences, although a little difference is seen in 2016.

[Figure: "Defect rate" chart by year, 2004 to 2016, vertical axis 80 to 200; legend: N, Mean, SD, Coeff. Var, Min, Median, Max]


MINI CASE:4.6
Year Min Q1 Q2 Q3 Max
2004 87 105 121 148.5 173
2006 91 119.5 132 148 204
2008 87 112.3 121.5 140.3 167
2010 83 106 113 126 170
2012 73 94 106.5 117.3 151
2014 74 108 116 130 206
2016 83 96 112 122.5 216

Inference
The box plot shows that there is a decline in the defect rates over the years.
The spread narrowed after 2004 but widened again in 2016.
High outliers are commonly seen, in 2006, 2010, 2014 and 2016.
A low outlier is seen only in 2014.
There are high extremes in 2014 and 2016.

[Figure: box plots of defect rates by Year]
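
The outlier statements can be checked against the five-number summaries in the table. This is a minimal Python sketch, assuming the common fence convention (outliers beyond 1.5 × IQR from the quartiles, extremes beyond 3 × IQR); the source does not state which rule its box plot uses, so that convention and the helper name `fences` are assumptions.

```python
# Year: (Min, Q1, Q2, Q3, Max), copied from the table.
five_number = {
    2004: (87, 105, 121, 148.5, 173),
    2006: (91, 119.5, 132, 148, 204),
    2008: (87, 112.3, 121.5, 140.3, 167),
    2010: (83, 106, 113, 126, 170),
    2012: (73, 94, 106.5, 117.3, 151),
    2014: (74, 108, 116, 130, 206),
    2016: (83, 96, 112, 122.5, 216),
}


def fences(q1, q3, k):
    """Lower and upper fences at k times the interquartile range."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


high_outliers, low_outliers, high_extremes = [], [], []
for year, (lowest, q1, _q2, q3, highest) in five_number.items():
    inner_low, inner_high = fences(q1, q3, 1.5)  # assumed outlier rule
    _, outer_high = fences(q1, q3, 3.0)          # assumed extreme rule
    if highest > inner_high:
        high_outliers.append(year)
    if lowest < inner_low:
        low_outliers.append(year)
    if highest > outer_high:
        high_extremes.append(year)

print("High outliers:", high_outliers)
print("Low outliers:", low_outliers)
print("High extremes:", high_extremes)
```

Only the minimum and maximum are compared with the fences, which is enough to tell whether a year has at least one outlier on that side.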
PAGE NO.196 SUM 4.73

P(A) People suffering from schizophrenia 0.015


P(B) People not suffering from schizophrenia 0.985

Let D be the event that people have brain atrophy


P(D/A) People having brain atrophy given that they also suffer from schizophrenia 0.3
P(D/B) People having brain atrophy given that they do not suffer from schizophrenia 0.02

A) P(A/D) People suffering from schizophrenia given that they have brain atrophy
P(A/D) = P(D/A)*P(A) / [P(D/A)*P(A) + P(D/B)*P(B)]
= (0.3)*(0.015)/[(0.3)*(0.015)+(0.02)*(0.985)]
= 0.0045/0.0242
= 0.1860

B) It hurts the case because P(A/D) is smaller than P(D/A)

C) It can be argued that 0.015 is not a reasonable prior probability


Hence, using 0.10 as the prior probability of schizophrenia, P(A/D)=?

P(A) People suffering from schizophrenia 0.1


P(B) People not suffering from schizophrenia 0.9

Let D be the event that people have brain atrophy


P(D/A) People having brain atrophy given that they also suffer from schizophrenia 0.3
P(D/B) People having brain atrophy given that they do not suffer from schizophrenia 0.02

P(A/D) People suffering from schizophrenia given that they have brain atrophy
P(A/D) = P(D/A)*P(A) / [P(D/A)*P(A) + P(D/B)*P(B)]
= (0.3)*(0.1)/[(0.3)*(0.1)+(0.02)*(0.9)]
= 0.625

D) If the prior probability is 0.10 it helps the case, because P(A/D) = 0.625 > P(D/A) = 0.3

E) Hence, using 0.25 as the prior probability of schizophrenia, P(A/D)=?

P(A) People suffering from schizophrenia 0.25


P(B) People not suffering from schizophrenia 0.75

Let D be the event that people have brain atrophy


P(D/A) People having brain atrophy given that they also suffer from schizophrenia 0.3
P(D/B) People having brain atrophy given that they do not suffer from schizophrenia 0.02

P(A/D) People suffering from schizophrenia given that they have brain atrophy
P(A/D) = P(D/A)*P(A) / [P(D/A)*P(A) + P(D/B)*P(B)]
= (0.3)*(0.25)/[(0.3)*(0.25)+(0.02)*(0.75)]
= 0.8333

On comparing P(A/D) = 0.8333 with P(D/A) = 0.3, we find that this gives very strong support to the case.
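
The three posterior probabilities in SUM 4.73 all come from the same Bayes' theorem expression with a different prior P(A). This is a minimal Python sketch of that calculation; the function name `posterior` and its parameter names are illustrative, and P(B) is taken as 1 - P(A) as in the worked solution.

```python
def posterior(prior, p_d_given_a=0.3, p_d_given_b=0.02):
    """P(A/D) from Bayes' theorem, with P(B) = 1 - P(A)."""
    numerator = p_d_given_a * prior                       # P(D/A) * P(A)
    denominator = numerator + p_d_given_b * (1 - prior)   # + P(D/B) * P(B)
    return numerator / denominator


for prior in (0.015, 0.10, 0.25):
    print(f"P(A) = {prior:.3f}  ->  P(A/D) = {posterior(prior):.4f}")
```

Raising the prior from 0.015 to 0.25 moves the posterior from below P(D/A) = 0.3 to well above it, which is the comparison parts B, D and E rely on.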
