
CHAPTER 2

Descriptive Statistics
Excel was used to create the following bar graphs for the data.

[Bar graph: PhD Degrees Awarded by Year; x-axis: Year]

[Bar graph: Doctoral Enrollment; x-axis: Year]

We can see from the graph that there has been little change in the number of doctoral degrees awarded during these years. We also see that the enrollment has increased dramatically over the same years. These bar graphs support the statements made by the American Society of Engineering Education.


2.3

a. Beach hotspot will be one of several locations. Therefore it is a qualitative variable.

Beach condition will be one of several conditions. Therefore it is a qualitative variable.

Nearshore bar condition will be one of several conditions. Therefore it is a qualitative variable.

Long-term erosion rate will be a number. Therefore it is a quantitative variable.


b. Excel was used to form a pie chart for beach condition of the six hotspots.

[Pie chart: Beach Condition]

c. Excel was used to form a pie chart for nearshore bar condition of the six hotspots.

[Pie chart: Nearshore Bar Condition]

d. These pie charts describe the conditions of these six beach hotspots. They should not be used to make inferences about beach hotspots around the country.


2.5

a. Excel was used to create a bar graph for the sources of unauthorized computer use in 1999.

[Bar graph: Unauthorized Computer Use in 1999]

Excel was used to create a bar graph for the sources of unauthorized computer use in 2001.

[Bar graph: Unauthorized Computer Use in 2001]

The percentage of attacks from unknown sources has increased pretty dramatically between 1999 and 2001.


2.7

a. Excel was used to create a Pareto diagram for the first digit results.

[Pareto diagram: First Significant Digit]

b.

The results do not support Benford's Law. Benford's Law states that the value "1" should occur first most often, around 30% of the time. We see in these results that the value "6" occurred most often. In addition, the value "1" occurred much less than 30% of the time. In fact, it occurred just 14.67% (109/743) of the time.
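The comparison above can be checked numerically. The sketch below assumes the usual statement of Benford's Law, in which the expected proportion of first digit d is log10(1 + 1/d); the function name `benford_expected` is illustrative and does not come from the text. The observed count 109 out of 743 is taken from the solution.

```python
import math

def benford_expected(digit: int) -> float:
    """Expected proportion of a first significant digit under Benford's Law."""
    if not 1 <= digit <= 9:
        raise ValueError("first significant digit must be between 1 and 9")
    return math.log10(1 + 1 / digit)

# Observed proportion of leading 1s reported in the solution (109 of 743).
observed_ones = 109 / 743

print(f"expected P(first digit = 1): {benford_expected(1):.3f}")
print(f"observed P(first digit = 1): {observed_ones:.4f}")
```

The gap between the two printed proportions is what the solution means when it says the value "1" occurred much less than 30% of the time.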

2.9

Using MINITAB, a bar chart for the Extinct status versus flight capability is:

[Bar chart: Extinct status versus flight capability]

It appears that extinct status is related to flight capability. For birds that do have flight capability, most of them are present. For those birds that do not have flight capability, most are extinct.

The bar chart for Extinct status versus Nest Density is:

[Bar chart: Extinct status versus Nest Density]

It appears that extinct status is not related to nest density. The proportion of birds present, absent, and extinct appears to be very similar for high nest density and low nest density.

The bar chart for Extinct status versus Habitat is:

[Bar chart: Extinct status versus Habitat]

It appears that the extinct status is related to habitat. For those in aerial terrestrial (TA), most species are present. For those in ground terrestrial (TG), most species are extinct. For those in aquatic, most species are present.


2.11

a.

Using MINITAB, the dot plot for the 9 measurements is:


[Dot plot of Cesium]

b.

Using MINITAB, the stem-and-leaf display is:

Character Stem-and-Leaf Display

Stem-and-leaf of Cesium   N = 9
Leaf Unit = 0.10
1 2 4 (3) 2 -5 0 -55 -500 -4 865 -47r

c. Using MINITAB, the histogram is:

[Histogram of Cesium]

d.

The stem-and-leaf display appears to be more informative than the other graphs. Since there are only 9 observations, the histogram and dot plot have very few observations per category.

There are 4 observations with radioactivity level of -5.00 or lower. The proportion of measurements with a radioactivity level of -5.0 or lower is 4/9 = .444.

2.13

a. To estimate the percentage of the aftershocks that measure between 1.5 and 2.5 on the Richter scale, we must sum the percentages on the corresponding bars of the bar graph. Doing so, we estimate this percentage as 65%.

b. To estimate the percentage of the aftershocks that measure greater than 3.0 on the Richter scale, we must sum the percentages on the corresponding bars of the bar graph. Doing so, we estimate this percentage as 12%.

2.15

a. Statistix was used to generate the following stem-and-leaf display:

Stem and Leaf Plot of Sanitation

Leaf Digit Unit = 1
7 4 represents 74

Minimum 74.000   Median 95.000   Maximum 100.00

Sten Leaves t'74 l1 3199 B 11 5 58 685 B 66777"7 L2 I 999 15 9 000000000001"1-11111111 36 g 22222222223333333333333333 62 (31) g 444444444444444455s55555555555s 7'7 "1111 g aae e e e a666666667'7 7'7 7 71117 7't 1'717 81 g eBBeeeeEBBBBSBBgg 9999999999999999 44 11. 10 00000000000
The stems are the tens digit and the leaves are the ones digit of the sanitation score for the ship.
b. 168 of the 174 ships, or 96.6% of the ships, registered a sanitation score of 86 or higher. We estimate the proportion of ships that have an accepted sanitation standard to be .966.

c. 74 is the lowest score on the stem-and-leaf display.


2.17

a. Statistix was used to construct the following histogram for the pH levels of the sampled wells:

[Histogram: pH Level]

It appears that roughly 58 of the 223 pH levels, or 58/223 = .26, have a pH value less than 7.0.

b.

Statistix was used to construct the following histogram for the MTBE values of the wells with detectable levels of MTBE:

[Histogram: MTBE values; 70 cases plotted, 185 missing cases]

It appears that roughly 6 of the 70 MTBE levels, or 6/70 = .086, have a MTBE value that exceeds 5.


2.19

a. From the frequency histogram, the number of copper particles with diameters ranging from 5 to 7 nanometers is 130.

b. To convert a frequency histogram to a relative frequency histogram, we must first divide each of the frequencies by the sum of all the frequencies, which is 35 + 130 + 70 + 15 + 5 = 255. The relative frequency table is:

Class Interval   Frequency   Relative Frequency
3-5              35          35/255 = .137
5-7              130         130/255 = .510
7-9              70          70/255 = .275
9-11             15          15/255 = .059
11-13            5           5/255 = .020

The relative frequency histogram is:

[Relative frequency histogram]

c. The proportion of copper particles exceeding 9 nanometers in diameter is approximately 20/255 = .078.
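The conversion from frequencies to relative frequencies can be sketched in a few lines. This is only an illustration using the class frequencies quoted in the solution; the helper name `relative_frequencies` and the dictionary layout are assumptions made for the example.

```python
def relative_frequencies(freqs):
    """Divide each class frequency by the total number of observations."""
    total = sum(freqs.values())
    return {label: count / total for label, count in freqs.items()}

# Class frequencies for the copper particle diameters (nanometers).
copper = {"3-5": 35, "5-7": 130, "7-9": 70, "9-11": 15, "11-13": 5}
rel = relative_frequencies(copper)

for interval, proportion in rel.items():
    print(f"{interval:>6}  {proportion:.3f}")

# Proportion of particles exceeding 9 nanometers: the last two classes.
prop_over_9 = rel["9-11"] + rel["11-13"]
print(f"over 9 nm: {prop_over_9:.3f}")
```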

2.21

a. First, rank the data in order: 3, 4, 5, 8, 10

mean: $\bar{y} = \frac{\sum y}{n} = \frac{30}{5} = 6$

median: $m$ = middle (3rd) measurement = 5

mode: Most frequently occurring value

Since all values occur only once, all are modes.

b. First, rank the data in order: 2, 4, 4, 5, 6, 6, 9, 12

mean: $\bar{y} = \frac{\sum y}{n} = \frac{48}{8} = 6$

median: $m$ = average of the middle (4th and 5th) measurements $= \frac{5 + 6}{2} = \frac{11}{2} = 5.5$

mode: Most frequently occurring value

Mode = 4 and 6 (both occur twice)

2.23

First, arrange the data in order:

mean: $\bar{y} = \frac{\sum y}{n} = \frac{97}{40} = 2.425$

The arithmetic average of the 40 observations is 2.425.

median: $m$ = average of the middle two (20th and 21st) measurements $= \frac{2 + 2}{2} = 2$

Half of the 40 observations fall below 2 and half fall above 2.

mode: Most frequently occurring value

Mode = 2

Because there is only a slight degree of skewness in the data, the preferred measure of central tendency would be the mean.

2.25

First, order the data.

mean: $\bar{y}_{old} = \frac{\sum y}{n} = \frac{294.11}{30} = 9.804$, $\bar{y}_{new} = \frac{\sum y}{n} = \frac{282.67}{30} = 9.422$

The arithmetic averages of the thirty observations for the old and new processes are 9.804 and 9.422, respectively.

median: $m$ = average of the middle two observations (15th and 16th)

$m_{old} = \frac{9.97 + 9.98}{2} = 9.975$, $m_{new} = \frac{9.43 + 9.48}{2} = 9.455$

Half the values in the old process fall below 9.975. Half the values in the new process fall below 9.455.

mode: Most frequently occurring value

Mode_old: The following values all occurred twice: 8.72, 9.80, 9.84, 9.87, 9.98, 10.15, 10.26

Mode_new = 8.82

Because of the symmetry in the data, the preferred measure of central tendency is the mean.

2.27

First, arrange the data in order:

.112, .205, .225, .239, .241, .270, .270, .330, .375, .523, .591, .618

mean: $\bar{y} = \frac{\sum y}{n} = \frac{3.999}{12} = .333$

median: $m$ = average of the middle two (6th and 7th) measurements $= \frac{.270 + .270}{2} = .270$

mode: Most frequently occurring value

Mode = .270

Although the data set is slightly skewed to the right, the mean would probably be preferred to the median and mode because it has nicer properties.
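The three measures of central tendency used in Exercises 2.21 through 2.27 can be reproduced with Python's standard `statistics` module. The sketch below uses the twelve ordered values listed for Exercise 2.27; the variable names are illustrative only.

```python
import statistics

# Ordered measurements from Exercise 2.27.
data = [.112, .205, .225, .239, .241, .270, .270,
        .330, .375, .523, .591, .618]

mean = statistics.mean(data)         # sum of the values divided by n
median = statistics.median(data)     # average of the 6th and 7th values (n is even)
modes = statistics.multimode(data)   # every value tied for the highest frequency

print(f"mean   = {mean:.3f}")
print(f"median = {median:.3f}")
print(f"modes  = {modes}")

# When every value occurs exactly once, multimode returns all of them,
# which matches the "all are modes" statement in Exercise 2.21a.
print(statistics.multimode([3, 4, 5, 8, 10]))
```

`multimode` is used instead of `mode` because several exercises have ties for the most frequent value.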
2.29

a. The mean cylinder power measurement for student #11 is:

$\bar{y} = \frac{\sum y}{n} = \frac{-.08 + (-.06) + \cdots + (-.16)}{25} = \frac{-3.86}{25} = -.1544$

The median is the middle number once the data have been arranged in order:

-1.07, -.21, -.20, -.17, -.17, -.17, -.16, -.16, -.16, -.15, -.12, -.12, -.11, -.10, -.09, -.09, -.09, -.08, -.08, -.07, -.07, -.06, -.06, -.06, -.04

The median is -.11.

The mode is the value with the highest frequency. Since -.17, -.16, -.09, and -.06 each occur 3 times, all are modes.

From the printout, the mean is -0.1544 and the median is -0.11. (All the numbers in the table are negative numbers.) Since these two values are very close together, both are good representatives of the middle of the data set.

b. The outlier is -1.07.

After deleting -1.07, the mean is:

$\bar{y} = \frac{\sum y}{n} = \frac{-2.79}{24} = -.11625$

The median is the average of the middle two numbers once the data are arranged in order. The middle two numbers are -.10 and -.11. The median is $\frac{(-.10) + (-.11)}{2} = -.105$.

The mean changes from -.1544 to -.11625 while the median changes from -.11 to -.105. The mean changes much more than the median when the outlier is removed.
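The effect of the outlier in Exercise 2.29 can be checked directly. This sketch assumes the 25 ordered cylinder power measurements listed in the solution and simply recomputes the mean and median with and without the value -1.07.

```python
import statistics

# The 25 ordered cylinder power measurements from Exercise 2.29.
powers = [-1.07, -.21, -.20, -.17, -.17, -.17, -.16, -.16, -.16, -.15,
          -.12, -.12, -.11, -.10, -.09, -.09, -.09, -.08, -.08, -.07,
          -.07, -.06, -.06, -.06, -.04]

# Same data with the single outlier removed.
trimmed = [y for y in powers if y != -1.07]

mean_all, median_all = statistics.mean(powers), statistics.median(powers)
mean_trim, median_trim = statistics.mean(trimmed), statistics.median(trimmed)

print(f"with outlier:    mean = {mean_all:.4f}, median = {median_all:.3f}")
print(f"without outlier: mean = {mean_trim:.5f}, median = {median_trim:.3f}")
print(f"shift in mean   = {abs(mean_trim - mean_all):.4f}")
print(f"shift in median = {abs(median_trim - median_all):.4f}")
```

The shift in the mean is several times larger than the shift in the median, which is the sense in which the median is the more resistant measure.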
2.31

a. Range = 1.55 - 1.37 = .18

b. $s^2 = \frac{\sum y^2 - \frac{(\sum y)^2}{n}}{n - 1} = \frac{17.3453 - \frac{(11.77)^2}{8}}{8 - 1} = .0041$

c. $s = \sqrt{.0041} = .064$

d. If the standard deviation of the daily ammonia levels during the morning drive-time is 1.45 ppm (compared to .064 ppm in the afternoon drive-time), then the morning drive-time has more variable ammonia levels.
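The shortcut variance formula used in Exercise 2.31 needs only the sample size, the sum of the observations, and the sum of their squares. The sketch below plugs in the sums quoted in the solution (11.77 and 17.3453 with n = 8); the function name is hypothetical.

```python
import math

def sample_variance_from_sums(sum_y: float, sum_y_sq: float, n: int) -> float:
    """Shortcut formula: s^2 = (sum(y^2) - (sum(y))^2 / n) / (n - 1)."""
    if n < 2:
        raise ValueError("need at least two observations")
    return (sum_y_sq - sum_y ** 2 / n) / (n - 1)

s2 = sample_variance_from_sums(sum_y=11.77, sum_y_sq=17.3453, n=8)
s = math.sqrt(s2)

print(f"s^2 = {s2:.4f}")
print(f"s   = {s:.3f}")
```

The shortcut form is convenient for hand calculation, although it can lose precision on a computer when the two terms in the numerator are nearly equal.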
2.33
a. The data appear to be mound-shaped in nature. We use the Empirical Rule to describe this data set. We expect approximately 68% of the observations to fall within the interval $\bar{y} \pm s$. We expect approximately 95% of the observations to fall within the interval $\bar{y} \pm 2s$. We expect approximately 100% of the observations to fall within the interval $\bar{y} \pm 3s$.

b. From the printout, we see $\bar{y} = 0.8425$ and $s = 0.3455$.

c. $\bar{y} \pm 2s \Rightarrow 0.8425 \pm 2(0.3455) \Rightarrow (0.1515, 1.534)$

Based on the Empirical Rule, we would expect approximately 95% of the observations to fall within the interval.

d. From the histogram, it appears that roughly 91% of the observations fall in the interval. This is very close to the expected 95%.

2.35

We use the Empirical Rule to describe this data set. We expect approximately 68% of the observations to fall within the interval $\bar{y} \pm s$. We expect approximately 95% of the observations to fall within the interval $\bar{y} \pm 2s$. We expect approximately 100% of the observations to fall within the interval $\bar{y} \pm 3s$. We get

$\bar{y} \pm 2s \Rightarrow 1.000 \pm 2(0.950) \Rightarrow (-0.900, 2.900)$

We use the Empirical Rule to describe this data set. We expect approximately 68% of the observations to fall within the interval $\bar{y} \pm s$. We expect approximately 95% of the observations to fall within the interval $\bar{y} \pm 2s$. We expect approximately 100% of the observations to fall within the interval $\bar{y} \pm 3s$. We get

$\bar{y} \pm 2s \Rightarrow 4.560 \pm 2(10.390) \Rightarrow (-16.220, 25.340)$
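The intervals in Exercises 2.33 and 2.35 all have the same form, mean plus or minus k standard deviations. A minimal sketch follows, assuming the means and standard deviations quoted in the solutions; the helper `k_sd_interval` and the `EMPIRICAL_RULE` lookup are illustrative names, and the rule applies only to mound-shaped data.

```python
# Approximate coverage for mound-shaped data under the Empirical Rule.
EMPIRICAL_RULE = {1: 0.68, 2: 0.95, 3: 1.00}

def k_sd_interval(mean: float, sd: float, k: int):
    """Return the interval (mean - k*sd, mean + k*sd)."""
    return (mean - k * sd, mean + k * sd)

# (mean, standard deviation) pairs quoted in Exercises 2.33 and 2.35.
summaries = [(0.8425, 0.3455), (1.000, 0.950), (4.560, 10.390)]

for mean, sd in summaries:
    low, high = k_sd_interval(mean, sd, 2)
    print(f"{mean} +/- 2({sd}) -> ({low:.4f}, {high:.4f}), "
          f"expected coverage about {EMPIRICAL_RULE[2]:.0%}")
```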

2.37

We find the summary information and the stem-and-leaf display for the data set from Statistix:

Variable   N    Mean     SD       Minimum   Maximum
FRP        10   234.74   9.9075   215.70    248.90

Stem Leaves

r2t5 L22 2225 5 23 52358 3240 2 24

013 78

Since the shape is not mound-shaped, we use Chebyshev's Rule to describe where data falls. We expect at least 0% of the observations to fall within the interval $\bar{y} \pm s$. We expect at least 75% of the observations to fall within the interval $\bar{y} \pm 2s$. We expect at least 89% of the observations to fall within the interval $\bar{y} \pm 3s$.

We expect most of the bearing strengths to fall in the interval

$\bar{y} \pm 3s \Rightarrow 234.74 \pm 3(9.9075) \Rightarrow (205.018, 264.463)$
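The percentages quoted from Chebyshev's Rule come from the bound 1 - 1/k^2, which holds for any data set regardless of shape. The sketch below is a small illustration; the function name is an assumption made for the example.

```python
def chebyshev_lower_bound(k: float) -> float:
    """Minimum fraction of observations within k standard deviations of the mean."""
    if k <= 0:
        raise ValueError("k must be positive")
    # For k <= 1 the formula is not informative, so the bound is floored at 0.
    return max(0.0, 1.0 - 1.0 / k ** 2)

for k in (1, 2, 3, 4):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.1%}")
```

Because the bound makes no assumption about shape, it is weaker than the Empirical Rule percentages for mound-shaped data.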

2.39

a. From the histogram, the data do not follow the true mound-shape very well. The intervals in the middle are much higher than they should be. In addition, there are some extremely large velocities and some extremely small velocities. Because the data do not follow a mound-shaped distribution, the Empirical Rule would not be appropriate.

b. Using Chebyshev's Rule, at least $1 - 1/4^2$ or $1 - 1/16$ or $15/16$ or 93.8% of the velocities will fall within 4 standard deviations of the mean. This interval is:

$\bar{y} \pm 4s \Rightarrow 27,117 \pm 4(1,280) \Rightarrow 27,117 \pm 5,120 \Rightarrow (21,997, 32,237)$

At least 93.75% of the velocities will fall between 21,997 and 32,237 km per second.

c. Since the data look approximately symmetric, the mean would be a good estimate for the velocity of galaxy cluster A2142. Thus, this estimate would be 27,117 km per second.

2.41

The 75th percentile is the point in the distribution in which 75% of the data values fall below and 25% of the data values fall above. 75% of the TP concentrations at the 28 Everglades sites had levels that fell below 10 micrograms per liter. The 75th percentile was chosen most likely to identify the upper quarter of the sites that the DEP wanted to label as unsafe.

2.43

The mean cyanide concentration is 84.0 and the median is 28.8. Since the mean is much greater than the median, the data are skewed to the right. Since the data are not mound-shaped, the Empirical Rule does not apply. The variance is 6,400, so the standard deviation is 80. The upper quartile is 88.5. Thus, 75% of all the measurements are less than 88.5.

2.45

From Exercise 2.15, $\bar{y} = 94.42$ and $s = 4.380$.

a. The z-score for the Nautilus Explorer is:

$z = \frac{y - \bar{y}}{s} = \frac{74 - 94.42}{4.380} = -4.66$

The score for the Nautilus Explorer is 4.66 standard deviations below the mean for all the cruise ships.

b. The z-score for the Rotterdam is:

$z = \frac{y - \bar{y}}{s} = \frac{93 - 94.42}{4.380} = -0.32$

The score for the Rotterdam is 0.32 standard deviations below the mean for all the cruise ships.
2.47

a. For the old location, $\bar{y} = 9.804$ and $s = .541$. The z-score for 10.50 is

$z = \frac{y - \bar{y}}{s} = \frac{10.50 - 9.804}{.541} = 1.29$

b. For the new location, $\bar{y} = 9.422$ and $s = .479$. The z-score for 10.50 is

$z = \frac{y - \bar{y}}{s} = \frac{10.50 - 9.422}{.479} = 2.25$

The closer the z-score is to 0, the more likely it is to occur. Thus, a voltage reading of 10.50 is more likely to occur at the old location because the z-score is closer to 0.
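The z-score comparison in Exercise 2.47 can be reproduced as follows. This is a sketch using the means and standard deviations quoted in the solution; `z_score` is a hypothetical helper name.

```python
def z_score(y: float, mean: float, sd: float) -> float:
    """Number of standard deviations that y lies above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (y - mean) / sd

reading = 10.50
z_old = z_score(reading, mean=9.804, sd=0.541)   # old location
z_new = z_score(reading, mean=9.422, sd=0.479)   # new location

print(f"old location: z = {z_old:.2f}")
print(f"new location: z = {z_new:.2f}")

# The reading is less unusual at whichever location has the z-score closer to 0.
more_likely = "old" if abs(z_old) < abs(z_new) else "new"
print(f"a reading of {reading} is more likely at the {more_likely} location")
```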
2.49
a. The median is the value in which half the data fall above and half fall below. 50% of the sampled clinkers have measurements below 170 mg/kg.

b. The lower quartile is the point in the distribution in which 25% of the data values fall below and 75% of the data values fall above. 25% of the sampled clinkers have measurements below 115 mg/kg.

c. The upper quartile is the point in the distribution in which 75% of the data values fall below and 25% of the data values fall above. 75% of the sampled clinkers have measurements below 260 mg/kg.

d. $IQR = Q_U - Q_L = 260 - 115 = 145$

e. Inner fences are found 1.5 IQR's above the upper quartile and below the lower quartile.

$Q_L - 1.5\,IQR = 115 - 1.5(145) = -102.5$
$Q_U + 1.5\,IQR = 260 + 1.5(145) = 477.5$

f. Since no clinkers were found beyond the inner fences, no outliers were detected for this data set.
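The fence calculation in Exercise 2.49 generalizes to any pair of quartiles. The sketch below uses the quartiles quoted in the solution (115 and 260 mg/kg); the `fences` helper and its `k` parameter are assumptions made for illustration, with k = 1.5 for inner fences and k = 3 for outer fences.

```python
def fences(q_lower: float, q_upper: float, k: float = 1.5):
    """Return (lower fence, upper fence) located k IQRs beyond the quartiles."""
    iqr = q_upper - q_lower
    return (q_lower - k * iqr, q_upper + k * iqr)

def is_outlier(y: float, q_lower: float, q_upper: float, k: float = 1.5) -> bool:
    """True when y lies outside the fences for the given multiplier k."""
    low, high = fences(q_lower, q_upper, k)
    return y < low or y > high

inner = fences(115, 260)          # 1.5 IQRs beyond each quartile
outer = fences(115, 260, k=3.0)   # 3 IQRs beyond each quartile

print("inner fences:", inner)
print("outer fences:", outer)
print("is 170 mg/kg an outlier?", is_outlier(170, 115, 260))
```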

a. To construct a box plot, we first must make some preliminary calculations. The median is the average of the middle two observations, once the data have been arranged in ascending order, because there are an even number of observations, 30. The average of the 15th and 16th observations is (9.97 + 9.98)/2 = 9.975. Next, we compute $Q_L$ and $Q_U$. The lower quartile, $Q_L$, is the data point with the rank of (1/4)(n + 1) = (1/4)(30 + 1) = 7.75 ≈ 8. The 8th ranked data point is 9.80. The upper quartile, $Q_U$, is the data point with the rank of (3/4)(n + 1) = (3/4)(30 + 1) = 23.25 ≈ 23. The 23rd data point is 10.05. The interquartile range, IQR, is $Q_U - Q_L$ = 10.05 - 9.80 = .25. The inner fences are located 1.5(IQR) = 1.5(.25) = .375 below $Q_L$ and above $Q_U$. The inner fences are 9.80 - .375 = 9.425 and 10.05 + .375 = 10.425. The outer fences are located 3(IQR) = 3(.25) = .75 below $Q_L$ and above $Q_U$. The outer fences are 9.80 - .75 = 9.05 and 10.05 + .75 = 10.80. The box plot is shown below.

[Box plot: $m = 9.975$, $Q_L = 9.80$, $Q_U = 10.05$, with lower and upper inner and outer fences marked]

There are four highly suspect outliers in the data set: 8.05, 8.72, 8.72, and 8.80. All lie outside the outer fences. In addition, there is one suspect outlier: 10.55. It lies between the inner and outer fences.

b. To use the z-score method to detect outliers, we must first calculate the mean and standard deviation for the data.

mean: $\bar{y} = \frac{\sum y}{n} = \frac{294.11}{30} = 9.804$

variance: $s^2 = \frac{\sum y^2 - \frac{(\sum y)^2}{n}}{n - 1} = \frac{2891.8415 - \frac{(294.11)^2}{30}}{30 - 1} = \frac{8.4851}{29} = .2926$

standard deviation: $s = \sqrt{.2926} = .5409$

The data point 8.05 has a z-score of

$z = \frac{8.05 - 9.804}{.5409} = -3.24$

and the data point 8.72 has a z-score of

$z = \frac{8.72 - 9.804}{.5409} = -2.00$

These are the only data points with z-scores of 2 or more in absolute value. These data points, 8.05, 8.72, and 8.72, are suspect outliers.


c. The median of the new data is the average of the middle two observations, once the data have been arranged in ascending order, because there are an even number of observations, 30. The average of the 15th and 16th observations is (9.43 + 9.48)/2 = 9.455. Next, we compute $Q_L$ and $Q_U$. The lower quartile, $Q_L$, is the data point with the rank of (1/4)(n + 1) = (1/4)(30 + 1) = 7.75 ≈ 8. The 8th ranked data point is 9.14. The upper quartile, $Q_U$, is the data point with the rank of (3/4)(n + 1) = (3/4)(30 + 1) = 23.25 ≈ 23. The 23rd data point is 9.75. The interquartile range, IQR, is $Q_U - Q_L$ = 9.75 - 9.14 = .61. The inner fences are located 1.5(IQR) = 1.5(.61) = .915 below $Q_L$ and above $Q_U$. The inner fences are 9.14 - .915 = 8.225 and 9.75 + .915 = 10.665. The outer fences are located 3(IQR) = 3(.61) = 1.83 below $Q_L$ and above $Q_U$. The outer fences are 9.14 - 1.83 = 7.31 and 9.75 + 1.83 = 11.58.

The box plot is shown as follows:

[Box plot for the new process]

There are no points beyond the inner fences, so there are no outliers or suspected outliers.
d. mean: $\bar{y} = \frac{\sum y}{n} = \frac{282.67}{30} = 9.422$

variance: $s^2 = \frac{\sum y^2 - \frac{(\sum y)^2}{n}}{n - 1} = \frac{2670.0613 - \frac{(282.67)^2}{30}}{30 - 1} = \frac{6.650337}{29} = .2293$

standard deviation: $s = \sqrt{.2293} = .4789$

The data point 8.51 (minimum data point) has a z-score of

$z = \frac{8.51 - 9.422}{.4789} = -1.90$

and the data point 10.12 (maximum data point) has a z-score of

$z = \frac{10.12 - 9.422}{.4789} = 1.46$

No data point has a z-score of 2 or more in absolute value. There are no outliers or suspected outliers.

First, compute the z-score for 1.80:

$z = \frac{y - \bar{y}}{s} = \frac{1.80 - 2.0}{.08} = -2.5$

If the data are actually mound-shaped, it would be very unusual (less than 2.5%) to observe a batch with 1.80% zinc phosphide if the true mean is 2.0%. Thus, if we did observe 1.80%, we would conclude that the mean percent of zinc phosphide in today's production is probably less than 2.0%.

2.55

a. From the pie chart, more than three-fourths of the tires (77.6%) are dumped, 10.7% are burned for fuel, and 6.7% are recycled into new products. The remaining 5% are exported.

b. To convert the pie chart into a relative frequency bar chart, we first need to convert the percents to relative frequencies by dividing the percents by 100%. The relative frequency table is:

Tire Fate         Percent   Relative Frequency
Dumped            77.6      .776
Burned for fuel   10.7      .107
Recycled          6.7       .067
Exported          5.0       .050

The relative frequency bar chart is:

[Relative frequency bar chart: Tire Fate]

c. To find the frequencies for each of the categories, we multiply the relative frequencies by the total sample size, 242 million. The frequency table is:

Tire Fate         Relative Frequency   Frequency (millions)
Dumped            .776                 187.792
Burned for fuel   .107                 25.894
Recycled          .067                 16.214
Exported          .050                 12.100
Total                                  242.000

The frequency bar chart is:

[Frequency bar chart: Tire Fate]

2.57

The Pareto diagram is:

[Pareto diagram: Poor Road Condition]

The poor road condition that caused the most accidents was "road repairs/under construction" with 39 accidents. The poor road condition that caused the next most accidents was "standing water" with 25 accidents. The poor road condition that caused the least number of accidents was "worn road surface" with 6 accidents.

2.59

The Empirical Rule states that approximately 95% of the data will fall within the $\bar{y} \pm 2s$ interval. We find the following summary information for the data set:

Variable    N    Mean     SD
Roughness   20   1.8810   0.5239

$\bar{y} \pm 2s \Rightarrow 1.8810 \pm 2(0.5239) \Rightarrow (0.8332, 2.9288)$

2.61

The z-score is: $z = \frac{y - \bar{y}}{s} = -1.06$

A head-injury rating of 408 is less than the mean head-injury rating. It is a little more than one standard deviation below the mean.
2.63

[Bar chart: relative abundance by Compound]

Compound C,H, has the highest relative abundance (.354), with compound CH: the next highest with .210. Compounds H, C3H.1, C7H15, C'oH,,, and others all have relative abundance less than .1.

2.65

a. The population is the percentage of iron in all possible specimens of Chilean iron ore.

b. One possible objective of the sampling procedure would be to determine the average percentage of iron in all possible specimens.

c. First, calculate the range by subtracting the smallest observation from the largest.

Range = 64.34 - 61.68 = 2.66

Next, decide on the number of classes. Suppose we pick 8 classes. To find the class width, divide the range by the number of classes:

Class width = 2.66/8 = .3325 ≈ .34

The first class will begin at 61.675, just below the smallest percentage. The resulting eight class intervals are shown as follows:

Class   Class Interval     Class Frequency   Class Relative Frequency
1       61.675 - 62.015    4                 .061
2       62.015 - 62.355    8                 .121
3       62.355 - 62.695    6                 .091
4       62.695 - 63.035    18                .273
5       63.035 - 63.375    15                .227
6       63.375 - 63.715    9                 .136
7       63.715 - 64.055    3                 .045
8       64.055 - 64.395    3                 .045
Totals                     n = 66            .999

For each class, count the number of observations. This gives the class frequency. The class relative frequency is then calculated by dividing the class frequency by the number of observations, n = 66. For class 1, the class relative frequency is 4/66 = .061. The other class relative frequencies are computed in a similar fashion and are listed in the table. The data above are then used to construct the relative frequency histogram.

[Relative frequency histogram]

d. mean: $\bar{y} = \frac{\sum y}{n} = \frac{4155.59}{66} = 62.963$

variance: $s^2 = \frac{\sum y^2 - \frac{(\sum y)^2}{n}}{n - 1} = \frac{261674.4983 - \frac{(4155.59)^2}{66}}{65} = \frac{24.0703}{65} = .3703$

standard deviation: $s = \sqrt{.3703} = .6085$

e. $\bar{y} \pm 2s \Rightarrow 62.963 \pm 2(.6085) \Rightarrow 62.963 \pm 1.217 \Rightarrow (61.746, 64.180)$

Yes. The percentage of observations that appear in this interval is 64/66 × 100% = 96.97%. This is close to the 95% given by the Empirical Rule.
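The class construction described in part c of Exercise 2.65 can be sketched as below. The starting point 61.675, the width .34, and the eight classes come from the solution; the class frequencies 8 and 6 for classes 2 and 3 are an assumption, inferred here from the listed relative frequencies (.121 and .091) and n = 66. `Decimal` is used only so the class boundaries print exactly.

```python
from decimal import Decimal

def class_boundaries(start: str, width: str, n_classes: int):
    """Return the n_classes + 1 boundaries of equal-width class intervals."""
    first, step = Decimal(start), Decimal(width)
    return [first + i * step for i in range(n_classes + 1)]

bounds = class_boundaries("61.675", "0.34", 8)

# Class frequencies; the values for classes 2 and 3 are inferred (see lead-in).
freqs = [4, 8, 6, 18, 15, 9, 3, 3]
n = sum(freqs)
rel = [f / n for f in freqs]

for i, (f, r) in enumerate(zip(freqs, rel)):
    print(f"{i + 1}  {bounds[i]} - {bounds[i + 1]}  {f:>3}  {r:.3f}")
```

Starting the first class slightly below the smallest observation, with one more decimal place than the data, keeps any observation from falling exactly on a class boundary.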

2.67

From the computer, we find:

Variable   N    Mean    SD
Scrams     56   4.036   3.027

To determine if 11 unplanned scrams is likely, we will create a z-score for the observation.

$z = \frac{y - \bar{y}}{s} = \frac{11 - 4.036}{3.027} = 2.301$

Most of the observations will fall within two standard deviations of the mean. Since 11 has a z-score of 2.301, it is not a very likely value to observe.

2.69

a. For each transect, three variables were measured. The number of seabirds found is quantitative. The length of the transect is also quantitative. Whether or not the transect was in an oiled area is qualitative.

b. The experimental unit is the transect.

c. A pie chart of the oiled and unoiled areas is:

[Pie chart: oiled and unoiled areas]

d. Using MINITAB, a scattergram of the data is:

[Scattergram; x-axis: Length]

e. The mean density for the unoiled area is 3.27, while the mean for the oiled area is 3.495. The median for the unoiled area is .89 and is .70 for the oiled area. These are both fairly similar.

f. Using Chebyshev's Theorem, at least 75% of the observations will fall within 2 standard deviations of the mean. This interval for unoiled areas would be:

$\bar{y} \pm 2s \Rightarrow 3.27 \pm 2(6.7) \Rightarrow 3.27 \pm 13.4 \Rightarrow (-10.13, 16.67)$

g. Using Chebyshev's Theorem, at least 75% of the observations will fall within 2 standard deviations of the mean. This interval for oiled areas would be:

$\bar{y} \pm 2s \Rightarrow 3.495 \pm 2(5.968) \Rightarrow 3.495 \pm 11.936 \Rightarrow (-8.441, 15.431)$

h. From the above two intervals, we know that at least 75% of the observations for the unoiled area will fall between -10.13 and 16.67 and at most 25% of the observations will fall above 15.431 for the oiled areas. Thus, the unoiled areas would be more likely to have a seabird density of 16.

2.71

a. We will use a frequency bar graph to describe the data. First, we must add up the number of spills under each category. These values are summarized in the following table:

Cause of Spillage   Frequency
Collision           11
Grounding           13
Fire/Explosion      12
Hull Failure        12
Unknown             2
Total               50

The frequency bar graph is:

[Frequency bar graph: Cause of Spillage]

Because each of the bars are about the same height, it does not appear that one cause is more likely to occur than any other.

b. $\bar{y} = \frac{\sum y}{n} = \frac{2991}{50} = 59.82$

$s^2 = \frac{\sum y^2 - \frac{(\sum y)^2}{n}}{n - 1} = \frac{318,477 - \frac{(2991)^2}{50}}{50 - 1} = \frac{139,555.38}{49} = 2,848.06898$

$s = \sqrt{2,848.06898} = 53.367$

Since the data are not mound-shaped, we must use Chebyshev's Rule to describe the data. We know that at least $1 - \frac{1}{k^2} = 1 - \frac{1}{3^2} = \frac{8}{9}$ of the observations will fall within 3 standard deviations of the mean.

$\bar{y} \pm 3s \Rightarrow 59.82 \pm 3(53.367) \Rightarrow 59.82 \pm 160.101 \Rightarrow (-100.281, 219.921)$

At least 8/9 of the observations will fall between -100.281 and 219.921.

2.73

a. The histogram portrays quantitative data.

b. A frequency distribution or histogram is being used to describe the data.

c. The frequency of rods with diameters between 1.0025 and 1.0045 centimeters can be approximated by summing the frequencies associated with the two classes 1.0025-1.0035 and 1.0035-1.0045.

Frequency = 81 + 65 = 146

Proportion = Frequency/Total = 146/500 = .292

d. Yes. Since no rod diameters were reported at the interval centered at 0.999, it appears these rods were being recorded at the higher diameter value.
