
Chapter 2

Organizing and Summarizing Data
Raw Data
O Data recorded in the sequence in which they are
collected, before they are processed or ranked, are
called raw data.

Organizing and Graphing Data: Categorical Data
O Data are usually organized in the form of a
frequency table which shows the counts
(frequencies) of individual categories.
O Our understanding of the data is further
enhanced by calculation of proportion
(relative frequency) of observations in each
category.
Relative frequency = (frequency in the category) / (total number of observations)
The data below represent the color of M&Ms in a
bag.
brown, brown, yellow, red, red, red, brown, orange,
blue, green, blue, brown, yellow, yellow, brown, red,
red, brown, brown, brown, green, blue, green,
orange, orange, yellow, yellow, yellow, red, brown,
red, brown, orange, green, red, brown, yellow,
orange, red, green, yellow, yellow, brown, yellow,
orange
Tabulate the results and calculate the relative
frequencies for the six color categories.
Color     Frequency   Relative Frequency
Brown     12          12/45 ≈ 0.2667
Yellow    10          0.2222
Red       9           0.2
Orange    6           0.1333
Blue      3           0.0667
Green     5           0.1111
Total     45          1
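The tabulation can be checked with a short script. This is a minimal sketch using Python's standard library; the variable names are illustrative and not part of the original slides.

```python
from collections import Counter

# The 45 M&M colors listed in the example, in the order given.
colors = (
    "brown, brown, yellow, red, red, red, brown, orange, "
    "blue, green, blue, brown, yellow, yellow, brown, red, "
    "red, brown, brown, brown, green, blue, green, "
    "orange, orange, yellow, yellow, yellow, red, brown, "
    "red, brown, orange, green, red, brown, yellow, "
    "orange, red, green, yellow, yellow, brown, yellow, "
    "orange"
).split(", ")

frequency = Counter(colors)   # count per category
total = len(colors)           # total number of observations

# Relative frequency = frequency in the category / total number of observations
relative_frequency = {c: n / total for c, n in frequency.items()}

for color, n in frequency.most_common():
    print(f"{color:<8}{n:>4}{relative_frequency[color]:>9.4f}")
```

The relative frequencies sum to 1, which is a quick check that no observation was missed.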
Graphical Presentation
O A bar graph is constructed by labeling
each category of data on either the
horizontal or vertical axis and the
frequency or relative frequency of the
category on the other axis.
O A pie chart is a circle divided into sectors.
Each sector represents a category of data.
The area of each sector is proportional to
the frequency of the category.
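As a rough illustration of both definitions, the sketch below prints a text bar graph and computes the pie-chart sector angles for the M&M color counts. The text rendering and the names `frequency` and `sector_angle` are assumptions made for this example; a plotting library would normally draw the actual charts.

```python
# Frequencies of the six M&M colors, counted from the M&M data.
frequency = {"Brown": 12, "Yellow": 10, "Red": 9, "Orange": 6, "Green": 5, "Blue": 3}
total = sum(frequency.values())

# Bar graph: one bar per category, bar length equal to the frequency.
for color, n in frequency.items():
    print(f"{color:<7}| {'#' * n} {n}")

# Pie chart: each sector's central angle (and hence its area) is
# proportional to the category's frequency.
sector_angle = {color: 360 * n / total for color, n in frequency.items()}
for color, angle in sector_angle.items():
    print(f"{color:<7}{angle:6.1f} degrees")
```

Because the angles are proportional to the frequencies, they add up to the full 360 degrees of the circle.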

Bar Chart for M&M Color

Brown: 12, Yellow: 10, Red: 9, Orange: 6, Green: 5, Blue: 3
Organizing and Graphing Data: Quantitative Data
O Small sample: the sample size is small (n < 30)
O Sort the data in ascending order: x(1) ≤ x(2) ≤ ... ≤ x(n)
O Graph the data
O Calculate measures (see next lecture)
Example
O We measured the quantity of fat in 15 samples of milk (in g/l):
14.85 14.68 15.27 14.77
14.83 14.95 15.08 15.02
15.07 14.98 15.15 15.49
14.83 14.95 14.78
(Dot plot of the variable Fat.)
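The small-sample steps (sort, then graph) can be sketched for the milk data as follows. This is an illustrative sketch: it assumes that the decimal commas in the listing stand for decimal points, and the text dot plot is only a stand-in for a drawn figure.

```python
from collections import Counter

# Fat content (g/l) of the 15 milk samples; decimal commas in the
# listing are read as decimal points (an assumption).
fat = [14.85, 14.68, 15.27, 14.77,
       14.83, 14.95, 15.08, 15.02,
       15.07, 14.98, 15.15, 15.49,
       14.83, 14.95, 14.78]

ordered = sorted(fat)   # x(1) <= x(2) <= ... <= x(n)
print("n =", len(ordered))
print("min =", ordered[0], " max =", ordered[-1])

# Text dot plot: one row per distinct value, one dot per observation.
for value, count in sorted(Counter(fat).items()):
    print(f"{value:6.2f} | {'.' * count}")
```

Repeated measurements (here 14.83 and 14.95) show up as stacked dots, which is the main thing a dot plot makes visible in a small sample.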
Stem-and-Leaf Plot
O Uses place value to organize data
O Shows data in an organized way so it can be analyzed easily
O Organizes data so it is easier to find the median, mode, and range
O Stem-and-leaf plots are a convenient method to display every piece of
data by showing the digits of each number.
Stem-and-Leaf Plot
How to Draw One:

1. Put the first digits (the stems) of the data in numerical order down
the left-hand side

2. Go through each piece of data in turn and put the remaining digits
(the leaves) in the proper row

3. Redraw the diagram with the leaves in each row in numerical order

4. Add a key
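The four steps can be sketched in code for two-digit whole numbers. This is a minimal sketch, not a general implementation: the function name `stem_and_leaf` and the sample scores are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group two-digit whole numbers by tens digit (stem) and ones digit (leaf)."""
    plot = defaultdict(list)
    for v in values:
        stem, leaf = divmod(v, 10)   # e.g. 27 -> stem 2, leaf 7
        plot[stem].append(leaf)      # step 2: put the leaf in its row
    # Steps 1 and 3: stems in numerical order, leaves sorted within each row.
    return {stem: sorted(leaves) for stem, leaves in sorted(plot.items())}

# Hypothetical scores, used only for illustration.
scores = [27, 31, 25, 38, 31, 40, 19]

for stem, leaves in stem_and_leaf(scores).items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
print("Key: 2 | 7 means 27")         # step 4: add a key
```

Note that this sketch omits stems that have no data, whereas a hand-drawn plot usually lists every stem in the range so that gaps stay visible.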
Here are the scores for a freshman basketball team

O Here is the same data organized into a stem-and-leaf plot.


Can you tell how the stem-and-leaf plot was made?

The first number in the data is 27.
Here is the 2 in the tens place. Here is the 7 in the ones place.

The key shows us which place value the digits represent.

The stems all represent the tens place in this stem-and-leaf plot.
The leaves all represent ones.
Organizing and Graphing Data: Quantitative Data
O n > 30, discrete data
O Frequency table (n_i, p_i, N_i, F_i, i = 1, 2, ..., k, where k is the
number of variants, i.e. distinct values)
O Graph the data: line plot, histogram, box plot.
O Calculate measures (next lecture)
The following data represent the number of available
cars in a household based on a random sample of 50
households. Construct a frequency and relative
frequency distribution.

3 0 1 2 1 1 1 2 0 2
4 2 2 2 1 2 2 0 2 4
1 1 3 2 4 1 2 1 2 2
3 3 2 1 2 2 0 3 2 2
2 3 2 1 2 2 1 1 3 5
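A frequency and relative frequency distribution for these 50 households can be built as in the sketch below. The slides do not define the symbols n_i, p_i, N_i, and F_i; the code assumes the common reading of frequency, relative frequency, cumulative frequency, and cumulative relative frequency.

```python
from collections import Counter
from itertools import accumulate

# Number of available cars in each of the 50 sampled households.
cars = [3, 0, 1, 2, 1, 1, 1, 2, 0, 2,
        4, 2, 2, 2, 1, 2, 2, 0, 2, 4,
        1, 1, 3, 2, 4, 1, 2, 1, 2, 2,
        3, 3, 2, 1, 2, 2, 0, 3, 2, 2,
        2, 3, 2, 1, 2, 2, 1, 1, 3, 5]

n = len(cars)
counts = Counter(cars)
values = sorted(counts)               # the k distinct values
n_i = [counts[v] for v in values]     # frequency
p_i = [c / n for c in n_i]            # relative frequency
N_i = list(accumulate(n_i))           # cumulative frequency (assumed meaning)
F_i = [c / n for c in N_i]            # cumulative relative frequency (assumed meaning)

print(" x  n_i   p_i  N_i   F_i")
for x, a, b, c, d in zip(values, n_i, p_i, N_i, F_i):
    print(f"{x:>2}{a:>5}{b:>6.2f}{c:>5}{d:>6.2f}")
```

The last cumulative frequency equals the sample size and the last cumulative relative frequency equals 1, which serves as a consistency check on the table.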
Organizing and Graphing Data: Quantitative Data
O n > 30, continuous data
O Construct classes (their number, width, and starting point)
O Frequency table
O Graph the data: histogram, box plot, empirical distribution function
O Calculate measures (see next lecture)
Calculation of classes
Here are some race times in seconds from
the downhill racing event.

The times range from about 85 to about 110 seconds:
110 - 85 = 25 seconds.
Suppose we decide to use class intervals with a width of 5 seconds:
25 ÷ 5 = 5 class intervals.
Time in seconds    Frequency
85 ≤ t < 90        1
90 ≤ t < 95        5
95 ≤ t < 100       28
100 ≤ t < 105      19
105 ≤ t < 110      7
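The class construction and the counting rule (lower ≤ t < upper) can be sketched as below. The raw race times are not reproduced in the slides, so the list `times` is hypothetical and the helper names `make_classes` and `class_frequencies` are illustrative only.

```python
import math

def make_classes(start, stop, width):
    """Return half-open class intervals [lower, upper) covering start..stop."""
    count = math.ceil((stop - start) / width)   # e.g. (110 - 85) / 5 = 5 classes
    return [(start + i * width, start + (i + 1) * width) for i in range(count)]

def class_frequencies(data, classes):
    """Count the observations t with lower <= t < upper in each class."""
    return [sum(lower <= t < upper for t in data) for lower, upper in classes]

classes = make_classes(85, 110, 5)

# Hypothetical race times, used only for illustration.
times = [88.2, 91.5, 94.9, 95.0, 97.3, 99.9, 100.0, 103.4, 106.1, 109.7]

for (lower, upper), f in zip(classes, class_frequencies(times, classes)):
    print(f"{lower} <= t < {upper}: {f}")
```

Because the intervals are half-open, a time of exactly 95.0 is counted in 95 ≤ t < 100 and not in 90 ≤ t < 95, so every observation falls in exactly one class. Calling `make_classes(85, 110, 2.5)` gives ten narrower classes over the same range.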
(Histogram of the race times. Horizontal axis: Times in seconds; vertical axis: Frequency.)
Changing the class interval
When the class intervals are changed, the same data
produces the following graph:

(Histogram of the same race times with axis marks every 2.5 seconds from 85 to 107.5. Horizontal axis: Times in seconds; vertical axis: Frequency.)
Frequency polygon
Times in seconds    Midpoint
85 ≤ t < 90         87.5
90 ≤ t < 95         92.5
95 ≤ t < 100        97.5
100 ≤ t < 105       102.5
105 ≤ t < 110       107.5
(Frequency polygon of the race times. Horizontal axis: Times in seconds; vertical axis: Frequency.)
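The midpoints and the points of the polygon can be computed from the class table, as in this sketch. Closing the polygon with zero-frequency points one class width beyond each end is a common convention that is assumed here; the slides do not state it.

```python
# Classes and frequencies from the race-time frequency table.
classes = [(85, 90), (90, 95), (95, 100), (100, 105), (105, 110)]
frequencies = [1, 5, 28, 19, 7]

# Each class is represented by its midpoint.
midpoints = [(lower + upper) / 2 for lower, upper in classes]

# Polygon vertices: (midpoint, frequency) pairs joined by straight lines,
# with assumed zero-frequency end points one class width beyond each end.
width = classes[0][1] - classes[0][0]
points = ([(midpoints[0] - width, 0)]
          + list(zip(midpoints, frequencies))
          + [(midpoints[-1] + width, 0)])

for x, f in points:
    print(f"{x:6.1f} {f:3d}")
```

Plotting these points in order and joining them with line segments gives the frequency polygon for the same data as the histogram.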
Identify the shape of a distribution
Example: Describe the shape of the distribution.

(a) Skewed left (b) Skewed right (c) Symmetric (d) Not sure
