
Statistics I

MTH160

Chapter 2

Descriptive Analysis And Presentation of Single-Variable Data

2.2 Graphs, Pareto Diagrams, (and Stem-and-Leaf Displays)
2.3 Frequency Distribution and Histograms
2.4 Measures of Central Tendency
2.5 Measures of Dispersion
2.6 Measures of Position
2.7 Interpreting and Understanding Standard Deviation
2.8 The Art of Statistical Deception (to read)

MTH 160 Brigitte Martineau

Statistics I Chapter 2

2.2 Graphs, Pareto Diagrams, (and Stem-and-Leaf Displays)


Why graphing?

Graphs for qualitative data: Circle graph and Bar graph

1. Circle Graph (also known as _________ _____________ )

[Figure: Pie Chart of Smokers? Categories: n = 40 (74.1%), y = 14 (25.9%)]

In Minitab: Graph > Pie Chart > select a variable. You can also select various options within the pie chart menu.


2. Bar Graph

[Figure: Chart of Living Arrangements. Vertical axis: Count; horizontal axis: Living Arrangements.]

1 - Living with Parents
2 - Living with Others
3 - Living with Spouse
4 - Living Alone

In Minitab: Graph > Bar Chart > select the type > OK > select a variable. You can also select various options within the bar chart menu.


Pareto Diagram (a special type of bar graph)

Example: The final daily inspection defect report for a cabinet manufacturer is given in the table below:

Defect    Number    Defect    Number
Dent      5         Chip      25
Stain     12        Scratch   40
Blemish   43        Others    10

Management has given the cabinet production line the goal of reducing their defects by 50%. Which defects should they give special attention to in working toward this goal?

[Figure: Pareto chart, Daily Defect Inspection Report. Left axis: Count; right axis: Percent.]

Defect    Count   Percent   Cum %
Blemish   43      31.9      31.9
Scratch   40      29.6      61.5
Chip      25      18.5      80.0
Stain     12      8.9       88.9
Others    10      7.4       96.3
Dent      5       3.7       100.0

In Minitab: Stat > Quality Tools > Pareto Chart > select a variable
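The Percent and Cum % rows of a Pareto chart can also be computed by hand. The following is a minimal Python sketch, not part of the Minitab workflow; the function name `pareto_table` is a hypothetical helper, and the counts are the ones from the defect report above.

```python
# Defect counts from the daily inspection report in the notes.
defects = {"Dent": 5, "Chip": 25, "Stain": 12,
           "Scratch": 40, "Blemish": 43, "Others": 10}

def pareto_table(counts):
    """Return rows (category, count, percent, cumulative percent), largest count first."""
    total = sum(counts.values())
    rows = []
    cumulative = 0
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    for name, count in ordered:
        cumulative += count
        percent = round(100 * count / total, 1)
        cum_percent = round(100 * cumulative / total, 1)
        rows.append((name, count, percent, cum_percent))
    return rows

for row in pareto_table(defects):
    print(row)
```

Under these counts the first two categories, Blemish and Scratch, already account for 61.5% of the 135 defects, which is the kind of information the Pareto chart is meant to make visible.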


Graphs for quantitative data: dotplots (and stem-and-leaf)

Graphs for quantitative data are useful to display the distribution. But what is a distribution?

1. Dotplot

[Figure: Dotplot of Height. Horizontal axis: Height (60 to 78).]

[Figure: Dotplot of Height vs Gender. Groups: f, m. Horizontal axis: Height (60 to 78).]

In Minitab: Graph > Dotplot > select a variable


2.3 Frequency Distribution and Histograms


Frequency distributions and histograms are used to summarize large data sets. What is a frequency distribution?

Grouped versus Ungrouped:

Example of an ungrouped frequency distribution
Variable: # of kids
Data: 2 2 0 0 0 1

Examples of a grouped frequency distribution

Grouping Rules:

Procedure for Grouping


Example: A video store has computed the number of movies rented for every day of the last month: 74 142 179 127 198 105 98 87 189 154 189 207 76 95 108 163 205 96 149 174 123 147 108 101 185 125 87 119 138 162

Classes    Frequency    Midpoint



Total

What is a histogram?
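One possible way to build the grouped frequency table for the video store data is sketched below in Python. The class boundaries are an assumption made only for illustration (7 classes of width 20, starting at 70); the notes leave the choice of classes to the grouping procedure. The function name `grouped_frequencies` is hypothetical.

```python
# Daily rental counts from the video store example in the notes.
rentals = [74, 142, 179, 127, 198, 105, 98, 87, 189, 154,
           189, 207, 76, 95, 108, 163, 205, 96, 149, 174,
           123, 147, 108, 101, 185, 125, 87, 119, 138, 162]

def grouped_frequencies(data, start, width, n_classes):
    """Count values per class [lower, upper) and report each class midpoint."""
    table = []
    for i in range(n_classes):
        lower = start + i * width
        upper = lower + width
        frequency = sum(1 for value in data if lower <= value < upper)
        table.append((lower, upper, frequency, (lower + upper) / 2))
    return table

# Assumed classes: 70 to under 90, 90 to under 110, ..., 190 to under 210.
for lower, upper, frequency, midpoint in grouped_frequencies(rentals, 70, 20, 7):
    print(f"{lower} to under {upper}: frequency {frequency}, midpoint {midpoint}")
```

A histogram of these data would draw one bar per class, with the bar height equal to the class frequency. A different class width or starting point gives a different table and a slightly different picture.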


Histograms can have different shapes:

Symmetrical

Uniform (rectangular)

Skewed to right

Skewed to left

J-Shaped

Bimodal


2.4 Measures of Central Tendency


Some examples of measures of Central tendency are:

What do they measure?

What is an average?

THE MEAN

Symbol for the mean of the population:
Symbol for the mean of the sample:
Formula: $\dfrac{\sum x}{n}$ = ____________________

What is the mean of: 10, 20, 30?

What is the mean of 10, 20, 300?

What happened?

Example: You scored 70, 80, 65 and 90 on your first 4 tests. What score do you need on your fifth test in order to have a mean of 75?
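The fifth-test question can be worked backwards from the definition of the mean: the five scores must add up to 5 times the target mean. A short Python sketch of that reasoning follows; `score_needed` is a hypothetical helper name.

```python
def score_needed(previous_scores, target_mean):
    """Score required on the next test so the overall mean equals target_mean."""
    n_total = len(previous_scores) + 1
    # mean = (sum of all scores) / n_total, so the required total is target_mean * n_total.
    required_total = target_mean * n_total
    return required_total - sum(previous_scores)

print(score_needed([70, 80, 65, 90], 75))  # 75 * 5 - 305 = 70
```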


THE MEDIAN

What is it?

Symbol

How to find the median?
1.
2.
3.

Examples: Find the median of 4, 8, 3, 8, 2, 9, 2, 11, 3

Find the median of 4, 8, 3, 8, 2, 9, 2, 11, 3, 15
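As a check on the two examples, here is a small Python sketch of the usual procedure: rank the data, then take the middle value, or the mean of the two middle values when the number of data is even. The function is written out by hand for illustration rather than taken from a library.

```python
def median(values):
    """Middle value of the ranked data; mean of the two middle values when n is even."""
    ranked = sorted(values)
    n = len(ranked)
    middle = n // 2
    if n % 2 == 1:
        return ranked[middle]
    return (ranked[middle - 1] + ranked[middle]) / 2

print(median([4, 8, 3, 8, 2, 9, 2, 11, 3]))      # 9 values, prints 4
print(median([4, 8, 3, 8, 2, 9, 2, 11, 3, 15]))  # 10 values, prints 6.0
```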

THE MODE

What is it?

Symbol

Bimodal or no mode

Example:


THE MIDRANGE

What is it?

Symbol

Formula: $\text{midrange} = \dfrac{\text{lowest value} + \text{highest value}}{2}$

Examples

Consider the following 2 sets of data. Calculate the averages for each.

Data Set                                 Mean   Median   Mode   Midrange   Best Measure
1, 2, 3, 4, 90
5, 10, 15, 20, 50, 10, 20, 30, 40, 10
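The four averages for these two data sets can be checked with a short Python sketch. It uses `mean` and `median` from the standard `statistics` module; `modes`, `midrange` and `averages` are hypothetical helper names. The sketch assumes the convention that a data set in which no value repeats has no mode. Deciding which measure is "best" is left to the reader.

```python
from collections import Counter
from statistics import mean, median

def modes(values):
    """Most frequent value(s); an empty list when no value repeats (no mode)."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []
    return sorted(v for v, c in counts.items() if c == highest)

def midrange(values):
    """Halfway between the lowest and the highest value."""
    return (min(values) + max(values)) / 2

def averages(values):
    return {"mean": mean(values), "median": median(values),
            "mode": modes(values), "midrange": midrange(values)}

print(averages([1, 2, 3, 4, 90]))
print(averages([5, 10, 15, 20, 50, 10, 20, 30, 40, 10]))
```

In the first set the single large value 90 pulls the mean (20) and the midrange (45.5) well above the median (3), which is worth keeping in mind when filling in the last column.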

The results of 3 tests of MTH 160 are shown in the following table. Explain what is going on.

          Mean   Median   Explanation
Test 1    74     73
Test 2    73     80
Test 3    65     60


2.5 Measures of Dispersion


Every day you buy a can of Sprite. Does it mean that you drink the EXACT same quantity of Sprite every day? WHY?

Can you think of any experiment or action in life where there is absolutely no variability?

Variability can be found _________________

Measures of Dispersion are used to measure the ______________________________

What are the common measures of dispersion?

Why do we need measures of dispersion?

Small variation: Large variation:

RANGE

What does the range measure?

Formula: Range = Highest Value - Lowest Value

VARIANCE

Variance of a population

Variance of a sample

Formula: $s^2 = \dfrac{\sum (X - \bar{X})^2}{n - 1}$

STANDARD DEVIATION

What is the standard deviation?

Standard deviation of a population

Standard deviation of a sample

Formula: $s = \sqrt{\dfrac{\sum (X - \bar{X})^2}{n - 1}}$


Example

Find the range, the variance and the standard deviation of the following set of numbers.

Range =
Variance =
Standard Deviation =

Step 1: Find the mean $\bar{X}$

Step 2: Fill in the table below:

Data Set ($x$)   $x - \bar{X}$   $(x - \bar{X})^2$
2
5
4
8
4
3
6
8
                 Sum =           Sum =

$s^2 = \dfrac{\sum (X - \bar{X})^2}{n - 1}$

SHORTCUT FORMULA

$s^2 = \dfrac{\sum x^2 - \dfrac{(\sum x)^2}{n}}{n - 1}$
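The two routes to the sample variance (the definition and the shortcut formula) should give the same number. The Python sketch below checks this, assuming the data set for this example is the eight values 2, 5, 4, 8, 4, 3, 6, 8 listed in the table; the function names are hypothetical.

```python
from math import sqrt

# Assumed reading of the "Data Set" column in the example.
data = [2, 5, 4, 8, 4, 3, 6, 8]

def sample_variance(values):
    """Definition: sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)

def sample_variance_shortcut(values):
    """Shortcut: (sum of x^2 - (sum of x)^2 / n) divided by n - 1."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)

data_range = max(data) - min(data)
variance = sample_variance(data)
print("range:", data_range)
print("variance:", round(variance, 2), round(sample_variance_shortcut(data), 2))
print("standard deviation:", round(sqrt(variance), 2))
```

With these eight values the mean is 5, the squared deviations add up to 34, and both formulas give 34/7 for the variance.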


Find the range, the variance and the standard deviation of the following data: 8, 8, 12, 14, 6, 6

Range:
Variance:
Standard Deviation:

Data Set ($x$)   $x - \bar{X}$   $(x - \bar{X})^2$



                 Sum =           Sum =

$s^2 = \dfrac{\sum (X - \bar{X})^2}{n - 1}$

SHORTCUT FORMULA

$s^2 = \dfrac{\sum x^2 - \dfrac{(\sum x)^2}{n}}{n - 1}$

The results of 2 tests of MTH 160 are shown in the following table. Explain.

          Mean   Standard Deviation   Explanation
Test 1    74
Test 2    73     22


2.6 Measures of Position


Measures of position are used to:

Most popular measures of position are:

QUARTILES

PERCENTILES

Relationship between quartiles and percentiles:
1st quartile is equivalent to
2nd quartile is equivalent to
3rd quartile is equivalent to


How to find percentiles and quartiles

1. Rank the data in ascending order.
2. Compute $A = \dfrac{nk}{100}$, where n is the number of data values and k is the percentile sought.
3. If A is an integer: the position of the percentile is at A + 0.5 = A.5. The percentile is halfway between the value of the data in the Ath position and the value of the next data.
   If A is a fraction or a decimal: the position of the percentile is at the next larger integer after A. The percentile is the value of the data at that position.

Examples

The following data represent the pH levels of a random sample of swimming pools in a California town:

5.6 5.6 5.8 5.9 6.0 6.0 6.1 6.2 6.3 6.4
6.7 6.8 6.8 6.8 6.9 7.0 7.3 7.4 7.4 7.5

Find the 34th and the 60th percentiles as well as the 1st and 3rd quartiles.
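The three-step procedure above can be sketched in Python as follows. This is an illustration of the rule as stated in these notes; statistical software often uses a different interpolation convention and may report slightly different values. The function name `percentile` is hypothetical, and k is assumed to be a whole number from 1 to 99.

```python
ph_levels = [5.6, 5.6, 5.8, 5.9, 6.0, 6.0, 6.1, 6.2, 6.3, 6.4,
             6.7, 6.8, 6.8, 6.8, 6.9, 7.0, 7.3, 7.4, 7.4, 7.5]

def percentile(values, k):
    """k-th percentile (k a whole number from 1 to 99) using A = n*k/100."""
    ranked = sorted(values)                 # Step 1: rank the data
    n = len(ranked)
    whole, remainder = divmod(n * k, 100)   # Step 2: A = whole + remainder/100
    if remainder == 0:
        # A is an integer: halfway between the A-th and the next ranked value.
        return (ranked[whole - 1] + ranked[whole]) / 2
    # A is not an integer: value at the next larger position (index `whole`, 0-based).
    return ranked[whole]

for k in (34, 60, 25, 75):
    print(f"P{k} =", round(percentile(ph_levels, k), 2))
```

For the pH data, n = 20, so the 34th percentile has A = 6.8 (7th value, 6.1), the 60th has A = 12 (halfway between the 12th and 13th values, 6.8), the 1st quartile has A = 5 (6.0) and the 3rd quartile has A = 15 (6.95).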


THE MIDQUARTILE

The midquartile is another measure of ________________________________

Formula:

The mean, median, mode, midrange and midquartile are all measures of central tendency. Are they all equal in value? Can you find an example where they would be?

5-NUMBER SUMMARY

The 5-number summary indicates how much the data are spread in each _______________

1.
2.
3.
4.
5.

The Box-and-Whisker display, also called boxplot, displays the 5-number summary.

Vertical or horizontal

The box is used to depict

The whiskers are line segments used to depict

The line through the box represents ______________________

One line segment represents

The other line segment represents


[Figure: Boxplot of Shoe Size. Vertical axis: Shoe Size (5.0 to 15.0). One point is labeled as the outlier.]


EXAMPLE

A random sample of students in a sixth grade class was selected. Their weights are given in the table below. Find the 5-number summary for these data and construct a boxplot.

63  64  76  76  81  83  85  86  88  89  90  91  92
93  93  93  94  97  99  99  99  101  108  109  112
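A possible check of the 5-number summary for the 25 weights is sketched below in Python. It assumes the quartiles and the median are found with the percentile-position rule from these notes (A = nk/100); Minitab and other software may use a different convention and draw a slightly different box. The helper names are hypothetical.

```python
weights = [63, 64, 76, 76, 81, 83, 85, 86, 88, 89, 90, 91, 92,
           93, 93, 93, 94, 97, 99, 99, 99, 101, 108, 109, 112]

def percentile(values, k):
    """k-th percentile using the position rule A = n*k/100 from the notes."""
    ranked = sorted(values)
    whole, remainder = divmod(len(ranked) * k, 100)
    if remainder == 0:
        return (ranked[whole - 1] + ranked[whole]) / 2
    return ranked[whole]

def five_number_summary(values):
    """Lowest value, 1st quartile, median, 3rd quartile, highest value."""
    return (min(values), percentile(values, 25), percentile(values, 50),
            percentile(values, 75), max(values))

print(five_number_summary(weights))  # (63, 85, 92, 99, 112)
```

In the boxplot, the box would run from 85 to 99 with a line at 92, and the whiskers would extend toward 63 and 112.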

Z-SCORE (also called standard score)

What is a z-score?

Formula:

Be careful:
1. The calculated value of z is rounded to the nearest ____________
2. The z-score measures the number of ____________ ____________ above or below the ______________.
3. z-scores range from _____________ to _____________
4. z-scores may be used to make comparisons of ______ _____________.


EXAMPLES

A certain data set has mean 76 and standard deviation 10. Find the z-scores for 90 and 60.

Bill and Joe both got 79% on their statistics test. Bill is in section 1 where the mean was 75 and the standard deviation was 10. Joe is in section 5 where the mean was 77 and standard deviation was 21. Who has the best relative score?
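Both examples can be checked with the standard z-score formula, z = (value - mean) / standard deviation. A short Python sketch follows; `z_score` is a hypothetical helper name.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev

# First example: mean 76, standard deviation 10.
print(round(z_score(90, 76, 10), 2))  # 1.4
print(round(z_score(60, 76, 10), 2))  # -1.6

# Second example: same raw score, different sections.
bill = z_score(79, 75, 10)
joe = z_score(79, 77, 21)
print(round(bill, 2), round(joe, 2))  # 0.4 0.1
```

Bill's score is 0.4 standard deviations above his section mean, while Joe's is only about 0.1 above his, so Bill has the better relative score even though the raw scores are equal.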


2.7 Interpreting and Understanding Standard Deviation


Standard deviation is a measure of ______________________

There are 2 rules to describe data that rely on the standard deviation:
1.
2.

EMPIRICAL RULE

If a variable is normally distributed:
1. Approximately _____ % of the data lie within ___ standard deviation of the mean
2. Approximately _____ % of the data lie within __ standard deviations of the mean
3. Approximately _____ % of the data lie within __ standard deviations of the mean

Note:

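[Figure: normal curve marked at $\bar{x} - 3s$, $\bar{x} - 2s$, $\bar{x} - s$, $\bar{x} + s$, $\bar{x} + 2s$ and $\bar{x} + 3s$, with 68% of the data within $\bar{x} \pm s$, 95% within $\bar{x} \pm 2s$ and 99.7% within $\bar{x} \pm 3s$.]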


EXAMPLE

A random sample of plum tomatoes was selected from a local grocery store and their weights recorded. The mean weight was 6.5 ounces with a standard deviation of 0.4 ounces. If the weights are normally distributed:
a) What percentage of weights falls between 5.7 and 7.3?
b) What percentage of weights falls above 7.7?
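The tomato example reduces to counting how many standard deviations each boundary lies from the mean and then reading the empirical rule percentages (68%, 95%, 99.7%). A minimal Python sketch under that assumption follows; the names `EMPIRICAL`, `percent_within` and `percent_above` are hypothetical.

```python
# Empirical rule: approximate percent of data within k standard deviations of the mean.
EMPIRICAL = {1: 68.0, 2: 95.0, 3: 99.7}

def percent_within(k):
    """Approximate percent of data within k standard deviations of the mean."""
    return EMPIRICAL[k]

def percent_above(k):
    """Approximate percent more than k standard deviations above the mean.
    By symmetry, half of what falls outside the central interval."""
    return (100.0 - EMPIRICAL[k]) / 2

mean, s = 6.5, 0.4
k_a = round((7.3 - mean) / s)  # 5.7 and 7.3 are 2 standard deviations from the mean
k_b = round((7.7 - mean) / s)  # 7.7 is 3 standard deviations above the mean

print("a)", percent_within(k_a), "%")           # about 95 %
print("b)", round(percent_above(k_b), 2), "%")  # about 0.15 %
```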

The Empirical rule can be used to find out whether or not a distribution is approximately normal.
1. Find the _________ and the ________________ ________________
2. Compute the actual proportion of data within 1, 2 and 3 standard deviations of the mean.
3. Compare with the empirical rule.
4. If the proportions found are reasonably close to those of the empirical rule, then the data are approximately normally distributed.
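The four steps above can be sketched in Python. As an illustration only, the sketch reuses the 25 sixth-grade weights from the boxplot example in section 2.6; the choice of data set and the function name `proportions_within` are assumptions, and `stdev` from the standard `statistics` module computes the sample standard deviation.

```python
from statistics import mean, stdev

# Sixth-grade weights from the 5-number summary example (used here only as sample data).
weights = [63, 64, 76, 76, 81, 83, 85, 86, 88, 89, 90, 91, 92,
           93, 93, 93, 94, 97, 99, 99, 99, 101, 108, 109, 112]

def proportions_within(values, max_k=3):
    """Proportion of the data within 1, 2, ..., max_k standard deviations of the mean."""
    x_bar = mean(values)   # Step 1: mean ...
    s = stdev(values)      # ... and sample standard deviation
    result = {}
    for k in range(1, max_k + 1):
        # Step 2: count the values inside x_bar - k*s to x_bar + k*s.
        inside = sum(1 for v in values if x_bar - k * s <= v <= x_bar + k * s)
        result[k] = inside / len(values)
    return result

# Step 3: compare with 0.68, 0.95 and 0.997.
for k, proportion in proportions_within(weights).items():
    print(f"within {k} standard deviation(s): {proportion:.2f}")
```

How close is "reasonably close" in step 4 is a judgment call, especially for a sample as small as 25 values.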


CHEBYSHEV'S THEOREM

The proportion of any distribution that lies within k standard deviations of the mean is at least $1 - \dfrac{1}{k^2}$, where k is any positive number larger than 1. This theorem applies to ALL distributions of data.

Notes:
1.
2.
3.

Illustration:

[Figure: distribution with at least $1 - \dfrac{1}{k^2}$ of the data between $\bar{x} - ks$ and $\bar{x} + ks$.]

EXAMPLE

At the close of trading, a random sample of 35 technology stocks was selected. The mean selling price was 67.75 and the standard deviation was 12.3. Use Chebyshev's theorem (with k = 2, 3) to describe the distribution.
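For the stock example, Chebyshev's bound of at least 1 - 1/k^2 can be evaluated directly for each k. A short Python sketch follows; `chebyshev_interval` is a hypothetical helper name.

```python
def chebyshev_interval(mean, std_dev, k):
    """Minimum proportion within k standard deviations, and that interval's endpoints."""
    if k <= 1:
        raise ValueError("k must be larger than 1")
    proportion = 1 - 1 / k ** 2
    return proportion, mean - k * std_dev, mean + k * std_dev

for k in (2, 3):
    proportion, low, high = chebyshev_interval(67.75, 12.3, k)
    print(f"k = {k}: at least {proportion:.1%} of prices between {low:.2f} and {high:.2f}")
```

With k = 2 this says at least 75% of the selling prices lie between 43.15 and 92.35; with k = 3, at least about 88.9% lie between 30.85 and 104.65. These are lower bounds that hold for any distribution, so the actual proportions may be higher.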
