CHAPTER 2

TABULAR & GRAPHICAL METHODS

Summarizing qualitative data

Exploratory data analysis: the stem-and-leaf display

Crosstabulation and scatter diagrams

Frequency Distribution

A frequency distribution is a tabular summary of a set of data showing the frequency (or number) of items in each of several non-overlapping classes.

DATA FROM A SAMPLE OF 50 SOFT DRINK PURCHASES

Coke Classic    Sprite          Pepsi-Cola
Diet Coke       Coke Classic    Coke Classic
Pepsi-Cola      Diet Coke       Coke Classic
Diet Coke       Coke Classic    Coke Classic
Coke Classic    Diet Coke       Pepsi-Cola
Coke Classic    Coke Classic    Dr. Pepper
Dr. Pepper      Sprite          Coke Classic
Diet Coke       Pepsi-Cola      Diet Coke
Pepsi-Cola      Coke Classic    Pepsi-Cola
Pepsi-Cola      Coke Classic    Pepsi-Cola
Coke Classic    Coke Classic    Pepsi-Cola
Dr. Pepper      Pepsi-Cola      Pepsi-Cola
Sprite          Coke Classic    Coke Classic
Coke Classic    Sprite          Dr. Pepper
Diet Coke       Dr. Pepper      Pepsi-Cola
Coke Classic    Pepsi-Cola      Sprite
Coke Classic    Diet Coke
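
As a minimal sketch (not part of the original slides), the counting step behind a frequency distribution can be reproduced with Python's standard-library `collections.Counter`. The list transcribes the 50 purchases row by row; the variable names are illustrative.

```python
from collections import Counter

# The 50 soft drink purchases, transcribed row by row from the sample data.
purchases = [
    "Coke Classic", "Sprite", "Pepsi-Cola",
    "Diet Coke", "Coke Classic", "Coke Classic",
    "Pepsi-Cola", "Diet Coke", "Coke Classic",
    "Diet Coke", "Coke Classic", "Coke Classic",
    "Coke Classic", "Diet Coke", "Pepsi-Cola",
    "Coke Classic", "Coke Classic", "Dr. Pepper",
    "Dr. Pepper", "Sprite", "Coke Classic",
    "Diet Coke", "Pepsi-Cola", "Diet Coke",
    "Pepsi-Cola", "Coke Classic", "Pepsi-Cola",
    "Pepsi-Cola", "Coke Classic", "Pepsi-Cola",
    "Coke Classic", "Coke Classic", "Pepsi-Cola",
    "Dr. Pepper", "Pepsi-Cola", "Pepsi-Cola",
    "Sprite", "Coke Classic", "Coke Classic",
    "Coke Classic", "Sprite", "Dr. Pepper",
    "Diet Coke", "Dr. Pepper", "Pepsi-Cola",
    "Coke Classic", "Pepsi-Cola", "Sprite",
    "Coke Classic", "Diet Coke",
]

# Count how many purchases fall into each non-overlapping class (brand).
frequency = Counter(purchases)

for brand in sorted(frequency):
    print(f"{brand:<14}{frequency[brand]:>3}")
print(f"{'Total':<14}{sum(frequency.values()):>3}")
```

Each purchase belongs to exactly one brand, so the class frequencies add up to the sample size of 50.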


SUMMARIZING QUALITATIVE DATA

FREQUENCY DISTRIBUTION OF SOFT DRINK PURCHASES

Soft Drink      Frequency
Coke Classic    19
Diet Coke        8
Dr. Pepper       5
Pepsi-Cola      13
Sprite           5
Total           50

Relative Frequency and Percent Frequency Distribution

Relative frequency distribution: A tabular summary of a set of data showing the relative frequency (that is, the fraction or proportion) of the total number of items in each of several non-overlapping classes.

Relative frequency of a class = (frequency of the class) / n

Percent frequency = relative frequency × 100

Relative Frequency and Percent Frequency Distribution (cont.)

Percent frequency distribution: A tabular summary of a set of data showing the percentage of the total number of items in each of several non-overlapping classes.

RELATIVE FREQUENCY AND PERCENT FREQUENCY DISTRIBUTIONS OF SOFT DRINK PURCHASES

Soft Drink      Relative Frequency    Percent Frequency
Coke Classic     .38                   38
Diet Coke        .16                   16
Dr. Pepper       .10                   10
Pepsi-Cola       .26                   26
Sprite           .10                   10
Total           1.00                  100
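
The two formulas can be checked against the table with a short Python sketch. It starts from the class frequencies of the soft drink frequency distribution; the dictionary and variable names are illustrative assumptions, not part of the slides.

```python
# Class frequencies from the soft drink frequency distribution.
frequency = {
    "Coke Classic": 19,
    "Diet Coke": 8,
    "Dr. Pepper": 5,
    "Pepsi-Cola": 13,
    "Sprite": 5,
}

n = sum(frequency.values())  # total number of items (50)

# Relative frequency of a class = frequency of the class / n
relative = {brand: count / n for brand, count in frequency.items()}

# Percent frequency = relative frequency * 100
percent = {brand: rel * 100 for brand, rel in relative.items()}

for brand in frequency:
    print(f"{brand:<14}{relative[brand]:>6.2f}{percent[brand]:>6.0f}")
```

The relative frequencies always sum to 1.00 and the percent frequencies to 100, which is a quick check on any such table.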


Bar Graphs and Pie Charts

[Figure: BAR GRAPH OF SOFT DRINK PURCHASES. Horizontal axis: Soft Drink (Coke Classic, Diet Coke, Dr. Pepper, Pepsi-Cola, Sprite); vertical axis: Frequency (0 to 20).]

[Figure: PIE CHART OF SOFT DRINK PURCHASES. Coke Classic 38%, Pepsi-Cola 26%, Diet Coke 16%, Dr. Pepper 10%, Sprite 10%.]
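
As a rough stand-in for the bar graph, the sketch below draws one text bar per soft drink using only the Python standard library. The function name `text_bar_graph` is hypothetical; a plotting library would normally be used for a real chart.

```python
# Class frequencies from the soft drink frequency distribution.
frequency = {
    "Coke Classic": 19,
    "Diet Coke": 8,
    "Dr. Pepper": 5,
    "Pepsi-Cola": 13,
    "Sprite": 5,
}

def text_bar_graph(freq):
    """Return one line per class: label, a bar of '#' marks, and the count."""
    width = max(len(label) for label in freq)
    return [
        f"{label:<{width}} | {'#' * count} {count}"
        for label, count in freq.items()
    ]

lines = text_bar_graph(frequency)
print("\n".join(lines))
```

Each bar's length equals the class frequency, so the longest bar (Coke Classic) corresponds to the largest slice of the pie chart.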

Constructing a Frequency Distribution

A frequency distribution is a tabular summary of a set of data showing the frequency (or number) of items in each of several non-overlapping classes.

• Gather the sample data
• Determine the number of non-overlapping classes
• Determine the width of each class
• Determine the class limits
• Count the number of data values in each class
• Summarize the class frequencies in a frequency distribution table


SUMMARIZING QUANTITATIVE DATA

Number of classes (K): 5 ≤ K ≤ 20

Class width
• class width = (largest data value - smallest data value) / number of classes

Class limits
• Class limits are the smallest and largest numbers that could belong to a given class.
  • Lower class limit = Smallest number
  • Upper class limit = Largest number

Class boundaries
• Class boundaries are the dividing lines between the classes.

Class Midpoint (Class Mark)
• The class midpoint is the value halfway between the lower and upper class limits.

The difference between the lower class limits of adjacent classes provides the class width.
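
The class-width and midpoint definitions translate directly into code. The sketch below is illustrative only: the function names and the short `values` list are made up for the example, while the class limits 10-14 through 30-34 are the ones used for the audit-time data in this chapter.

```python
def approximate_class_width(values, num_classes):
    """(largest data value - smallest data value) / number of classes."""
    return (max(values) - min(values)) / num_classes

def class_midpoint(lower_limit, upper_limit):
    """Value halfway between the lower and upper class limits."""
    return (lower_limit + upper_limit) / 2

# Hypothetical data values, chosen only to illustrate the width formula.
values = [3, 8, 15, 21, 28]
width_estimate = approximate_class_width(values, 5)  # (28 - 3) / 5
print(width_estimate)

# Class limits (lower, upper) for five classes of whole-number data.
classes = [(10, 14), (15, 19), (20, 24), (25, 29), (30, 34)]

# Class width = difference between the lower limits of adjacent classes.
widths = [nxt[0] - cur[0] for cur, nxt in zip(classes, classes[1:])]
midpoints = [class_midpoint(lo, hi) for lo, hi in classes]
print(widths)
print(midpoints)
```

In practice the computed width is usually rounded to a convenient number before the class limits are chosen.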

YEAR-END AUDIT TIMES (IN DAYS)

15    15    18    17
20    27    22    23
22    21    33    28
14    18    16    13

FREQUENCY DISTRIBUTION FOR THE AUDIT-TIME DATA

Audit Time (days)    Frequency
10-14                 4
15-19                 8
20-24                 5
25-29                 2
30-34                 1
Total                20


Relative Frequency and Percent Frequency Distribution

Relative frequency of a class = (frequency of the class) / n

Percent frequency = Relative frequency × 100

RELATIVE FREQUENCY AND PERCENT FREQUENCY DISTRIBUTION FOR THE AUDIT-TIME DATA

Audit Time (days)    Relative Frequency    Percent Frequency
10-14                 .20                   20
15-19                 .40                   40
20-24                 .25                   25
25-29                 .10                   10
30-34                 .05                    5
Total                1.00                  100

Dot Plot

A horizontal axis shows the range of values for the data. Each data value is represented by a dot placed above the axis.

[Figure: Dot plot. Horizontal axis: Audit time in days (10 to 35).]

Histogram

A histogram is constructed by placing the variable of interest on the horizontal axis and the frequency, relative frequency, or percent frequency on the vertical axis.

A histogram describes the shape of the data.


[Figure: Histogram. Horizontal axis: Audit time in days (5 to 35).]

Cumulative Distributions

A cumulative frequency distribution shows the number of items less than or equal to the upper class limit of each class.

CUMULATIVE FREQUENCY, CUMULATIVE RELATIVE FREQUENCY, AND CUMULATIVE PERCENT FREQUENCY DISTRIBUTIONS FOR THE AUDIT-TIME DATA

Audit Time (days)           Cumulative    Cumulative            Cumulative
                            Frequency     Relative Frequency    Percent Frequency
Less than or equal to 14     4             .20                   20
Less than or equal to 19    12             .60                   60
Less than or equal to 24    17             .85                   85
Less than or equal to 29    19             .95                   95
Less than or equal to 34    20            1.00                  100

Ogive

An ogive is a graph of a cumulative distribution.

[Figure: Ogive. Horizontal axis: Audit Time in Days (5 to 35); vertical axis: 0 to 25.]
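
The cumulative columns are running totals of the class frequencies. A minimal Python sketch, assuming the audit-time class frequencies 4, 8, 5, 2, 1 from the frequency distribution and using illustrative variable names:

```python
from itertools import accumulate

# Upper class limits and class frequencies from the audit-time
# frequency distribution (classes 10-14, 15-19, 20-24, 25-29, 30-34).
upper_limits = [14, 19, 24, 29, 34]
frequencies = [4, 8, 5, 2, 1]

n = sum(frequencies)  # total number of audits (20)

# Running total: number of items <= the upper limit of each class.
cumulative = list(accumulate(frequencies))
cumulative_relative = [c / n for c in cumulative]
cumulative_percent = [100 * c / n for c in cumulative]

for limit, cf, crf, cpf in zip(
    upper_limits, cumulative, cumulative_relative, cumulative_percent
):
    print(f"Less than or equal to {limit}: {cf:>2}  {crf:.2f}  {cpf:.0f}")
```

The last cumulative frequency always equals n, and the pairs of class limit and cumulative frequency are the kind of points an ogive connects.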


EXPLORATORY DATA ANALYSIS: THE STEM-AND-LEAF DISPLAY

Stem-and-leaf display: An exploratory data analysis technique that simultaneously rank orders quantitative data and provides insight about the shape of the distribution.

NUMBER OF QUESTIONS ANSWERED CORRECTLY ON AN APTITUDE TEST

112     72     69     97    107
 73     92     76     86     73
126    128    118    127    124
 82    104    132    134     83
 92    108     96    100     92
115     76     91    102     81
 95    141     81     80    106
 84    119    113     98     75
 68     98    115    106     95
100     85     94    106    119
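
One way to build such a display, sketched in Python under the assumption that the last digit of each score is the leaf and the remaining digits form the stem. The function name `stem_and_leaf` is illustrative.

```python
from collections import defaultdict

# Aptitude test scores, transcribed row by row.
scores = [
    112, 72, 69, 97, 107,
    73, 92, 76, 86, 73,
    126, 128, 118, 127, 124,
    82, 104, 132, 134, 83,
    92, 108, 96, 100, 92,
    115, 76, 91, 102, 81,
    95, 141, 81, 80, 106,
    84, 119, 113, 98, 75,
    # Ninth row starts with 68, consistent with leaves 8 and 9 on stem 6.
    68, 98, 115, 106, 95,
    100, 85, 94, 106, 119,
]

def stem_and_leaf(values):
    """Map each stem (all digits but the last) to its sorted leaves."""
    display = defaultdict(list)
    for value in sorted(values):      # sorting rank-orders the leaves
        display[value // 10].append(value % 10)
    return dict(display)

display = stem_and_leaf(scores)
for stem in sorted(display):
    leaves = " ".join(str(leaf) for leaf in display[stem])
    print(f"{stem:>2} | {leaves}")
```

Because every leaf takes the same horizontal space, the length of each row shows the shape of the distribution in the same way a histogram would.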

THE STEM-AND-LEAF DISPLAY

 6 | 8 9
 7 | 2 3 3 5 6 6
 8 | 0 1 1 2 3 4 5 6
 9 | 1 2 2 2 4 5 5 6 7 8 8
10 | 0 0 2 4 6 6 6 7 8
11 | 2 3 5 5 8 9 9
12 | 4 6 7 8
13 | 2 4
14 | 1

CROSSTABULATIONS AND SCATTER DIAGRAMS

Crosstabulation

Crosstabulation is a tabular summary of data for two variables. The classes for one variable are represented by the rows; the classes for the other variable are represented by the columns.

Crosstabulation is widely used for examining the relationship between two variables.


CROSSTABULATION OF QUALITY RATING AND MEAL PRICE FOR 300 LOS ANGELES RESTAURANTS

                  Meal Price
Quality Rating    $10-19    $20-29    $30-39    $40-49    Total
Good                42        40         2         0        84
Very good           34        64        46         6       150
Excellent            2        14        28        22        66
Total               78       118        76        28       300

ROW PERCENTAGES FOR EACH RATING CATEGORY

                  Meal Price
Quality Rating    $10-19    $20-29    $30-39    $40-49    Total
Good               50.0      47.6       2.4       0.0      100
Very good          22.7      42.7      30.6       4.0      100
Excellent           3.0      21.2      42.4      33.4      100
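
The margins and row percentages can be recomputed from the cell counts. The following is a minimal Python sketch with illustrative variable names; each row percentage is taken as 100 × cell count / row total, rounded to one decimal.

```python
# Crosstabulation counts: rows are quality ratings,
# columns are the meal-price classes in the order listed.
price_classes = ["$10-19", "$20-29", "$30-39", "$40-49"]
counts = {
    "Good": [42, 40, 2, 0],
    "Very good": [34, 64, 46, 6],
    "Excellent": [2, 14, 28, 22],
}

# Row and column totals (the margins of the crosstabulation).
row_totals = {rating: sum(row) for rating, row in counts.items()}
column_totals = [sum(col) for col in zip(*counts.values())]

# Row percentage = 100 * cell count / row total.
row_percentages = {
    rating: [round(100 * c / row_totals[rating], 1) for c in row]
    for rating, row in counts.items()
}

for rating, row in row_percentages.items():
    print(f"{rating:<10}", *(f"{p:>6.1f}" for p in row))
```

Plain rounding gives 30.7 and 33.3 where the table prints 30.6 and 33.4; the printed values were most likely adjusted so that each row sums to exactly 100.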

Scatter Diagram

A scatter diagram is a graphical presentation of the relationship between two quantitative variables. One variable is shown on the horizontal axis and the other variable is shown on the vertical axis.

SAMPLE DATA FOR THE STEREO AND SOUND EQUIPMENT STORE

Week    No. of Commercials    Sales Volume ($100s)
        x                     y
 1      2                     50
 2      5                     57
 3      1                     41
 4      3                     54
 5      4                     54
 6      1                     38
 7      5                     63
 8      3                     48
 9      4                     59
10      2                     46
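
A crude text version of the scatter diagram can be produced with the standard library alone. In this sketch (the function name `text_scatter` is hypothetical) each distinct sales value gets one row, so the vertical spacing is not to scale; a plotting library would be the usual choice for a real diagram.

```python
# (number of commercials x, sales volume y in $100s) for the ten weeks.
data = [(2, 50), (5, 57), (1, 41), (3, 54), (4, 54),
        (1, 38), (5, 63), (3, 48), (4, 59), (2, 46)]

def text_scatter(points):
    """Return text rows, highest y first, with '*' in the column of each x."""
    max_x = max(x for x, _ in points)
    rows = []
    for y in sorted({y for _, y in points}, reverse=True):
        xs = {x for x, py in points if py == y}
        marks = "".join(" * " if x in xs else "   " for x in range(max_x + 1))
        rows.append(f"{y:>3} |{marks}")
    return rows

rows = text_scatter(data)
print("\n".join(rows))
print("    +" + "---" * 6)
print("     " + "".join(f" {x} " for x in range(6)))
```

The marks drift to the right as the rows go up, which is the pattern of a positive relationship between the number of commercials and sales.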


[Figure: Scatter diagram for the stereo and sound equipment store. Horizontal axis: Number of commercials (0 to 6); vertical axis: Sales ($100s) (35 to 65).]

[Figure: Types of relationships depicted by scatter diagrams. Panels: A Positive Relationship; No Apparent Relationship; A Negative Relationship.]

SUMMARIZING DATA

Data

Qualitative data
  Tabular methods
    • Relative frequency distribution
    • Percent frequency distribution
    • Crosstabulation
  Graphical methods
    • Pie chart

Quantitative data
  Tabular methods
    • Relative frequency distribution
    • Cumulative frequency distribution
    • Cumulative relative frequency distribution
    • Crosstabulation
  Graphical methods
    • Histogram
    • Ogive
    • Scatter diagram
    • Stem-and-leaf display

