GRADUATE SCHOOL
Example:
A sample of 30 employees from large companies was selected, and these employees were asked how
stressful their jobs were. The responses of these employees are recorded below, where very represents
very stressful, somewhat means somewhat stressful, and none stands for not stressful at all.
Solution: Note that the variable in this example is how stressful an employee’s job is. This variable is
classified into three categories: very stressful, somewhat stressful, and not stressful at all. We record
these categories in the first column of Table 2.4. Then we read each employee’s response from the given
data and mark a tally, denoted by the symbol |, in the second column of Table 2.4 next to the
corresponding category. For example, the first employee’s response is that his or her job is somewhat
stressful. We show this in the frequency table by marking a tally in the second column next to the
category somewhat. Note that the tallies are marked in blocks of five for counting convenience. Finally,
we record the total of the tallies for each category in the third column of the table. This column is called
the column of frequencies and is usually denoted by f. The sum of the entries in the frequency column
gives the sample size or total frequency. In Table 2.4, this total is 30, which is the sample size.
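The tallying procedure can be sketched in a few lines of Python. This is only an illustration: the 30 raw responses are not reproduced in these notes, so the list below is rebuilt from category counts (10, 14, and 6) that are an assumption inferred from the sample size of 30 and the percentages quoted for this example. The order of the responses is arbitrary.

```python
from collections import Counter

# Assumption: counts of 10 / 14 / 6 are inferred from the quoted percentages;
# the real response list is not available here.
responses = ["very"] * 10 + ["somewhat"] * 14 + ["none"] * 6

frequency = Counter(responses)         # one tally total (f) per category
sample_size = sum(frequency.values())  # sum of the frequency column

for category in ("very", "somewhat", "none"):
    print(f"{category:<10}{frequency[category]:>3}")
print(f"{'Sum':<10}{sample_size:>3}")
```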
Example:
Determine the relative frequency and percentage distributions for the data of Table 2.4.
Solution: The relative frequencies and percentages from Table 2.4 are calculated and listed in Table 2.5.
Based on this table, we can state that .333, or 33.3%, of the employees said that their jobs are very
stressful. By adding the percentages for the first two categories, we can state that 80% of the employees
said that their jobs are very or somewhat stressful. The other numbers in Table 2.5 can be interpreted
the same way. Notice that the sum of the relative frequencies is always 1.00 (or approximately 1.00 if
the relative frequencies are rounded), and the sum of the percentages is always 100 (or approximately
100 if the percentages are rounded).
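A minimal sketch of the calculation, assuming category counts of 10, 14, and 6 (these are inferred from the percentages above, since Table 2.4 itself is not reproduced here). Relative frequency is the category frequency divided by the sum of all frequencies, and the percentage is the relative frequency times 100.

```python
# Assumed counts, inferred from the quoted 33.3% and 80% with n = 30.
frequency = {"very": 10, "somewhat": 14, "none": 6}

n = sum(frequency.values())
relative = {c: f / n for c, f in frequency.items()}      # f / sum of f
percentage = {c: 100 * r for c, r in relative.items()}   # relative frequency x 100

for c in frequency:
    print(f"{c:<10}{relative[c]:.3f}  {percentage[c]:.1f}%")
```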
1. Bar Graph
- A graph made of bars whose heights represent the frequencies of
respective categories.
- Instead of frequencies a bar graph might display the relative
frequencies or percentages of the categories.
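As a rough illustration, a bar graph can be approximated in plain text. The helper `text_bar_chart` is a hypothetical name, the counts are assumed (inferred from the percentages in the stress example), and the bars are drawn horizontally, so bar length plays the role of bar height.

```python
def text_bar_chart(values):
    """Return one line per category, with one '#' per unit of frequency."""
    lines = []
    for category, value in values.items():
        lines.append(f"{category:<10}{'#' * value} {value}")
    return "\n".join(lines)

# Assumed counts for the stress example.
frequency = {"very": 10, "somewhat": 14, "none": 6}
chart = text_bar_chart(frequency)
print(chart)
```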
Example:
2. Pie Chart
- A circle divided into portions that represent the relative frequencies or percentages
of a population or a sample belonging to different categories.
- The size of the slice representing a particular category is proportional to the
relative frequency of that category.
- Slice angle (in degrees) = category relative frequency × 360
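The slice rule can be checked with a short sketch. The function name `slice_angles` is hypothetical, and the counts are assumed values consistent with the percentages of the stress example.

```python
def slice_angles(frequency):
    """Angle in degrees of each pie slice: relative frequency times 360."""
    total = sum(frequency.values())
    return {c: 360 * f / total for c, f in frequency.items()}

# Assumed counts for the stress example (n = 30).
angles = slice_angles({"very": 10, "somewhat": 14, "none": 6})
print(angles)
```

The angles always add up to 360 degrees, just as the relative frequencies add up to 1.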
Example:
[Pie chart: Stress on Job. Very 33%, Somewhat 47%, None 20%]
Class Boundary
- The class boundary is given by the midpoint of the upper limit of one class and the
lower limit of the next class.
Class Width
- Or Class Size
- The difference between the two boundaries of a class
- Class width = Upper boundary - Lower boundary
- 1000.5 - 800.5 = 200
Class Midpoint or Mark
- The class midpoint or mark is obtained by dividing the sum of the two limits (or the two
boundaries) of a class by 2.
- Class midpoint or mark = (Lower limit + Upper limit) / 2
- (801 + 1000) / 2 = 900.5
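The three definitions above can be tied together in one small sketch for the class 801 to 1000. The function `class_summary` and its `gap` parameter are hypothetical; the sketch assumes that neighbouring classes are separated by a gap of 1 between one upper limit and the next lower limit (for example 800 and 801), so each boundary lies half a unit outside the class limits.

```python
def class_summary(lower_limit, upper_limit, gap=1):
    """Return (lower boundary, upper boundary, width, midpoint) of a class.

    gap is the assumed distance between the upper limit of one class and
    the lower limit of the next class.
    """
    lower_boundary = lower_limit - gap / 2
    upper_boundary = upper_limit + gap / 2
    width = upper_boundary - lower_boundary
    midpoint = (lower_limit + upper_limit) / 2
    return lower_boundary, upper_boundary, width, midpoint

summary = class_summary(801, 1000)
print(summary)
```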
Example:
The following data give the total number of iPods
sold by a mail order company on each of 30 days. Construct a frequency distribution table.
23 14 19 23 20 16 27 16 21 14
The minimum value is 5, and the maximum value is 29. Suppose we decide to group these data using
five classes of equal width. Then,
Approximate class width = (Largest value - Smallest value) / Number of classes = (29 - 5) / 5 = 4.8
Now we round this approximate width to a convenient number, say 5. The lower limit of the
first class can be taken as 5 or any number less than 5. Suppose we take 5 as the lower limit of the first
class. Then our classes will be 5 – 9, 10 – 14, 15 – 19, 20 – 24, and 25 – 29.
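The construction of the classes can be sketched as follows. The helper `make_classes` is hypothetical. Rounding the approximate width up to the next whole number is an assumption (the notes only say "a convenient number"), and the sketch assumes whole-number data so that each upper limit is one less than the next lower limit.

```python
import math

def make_classes(minimum, maximum, num_classes):
    """Return the approximate width and a list of (lower, upper) class limits."""
    approx_width = (maximum - minimum) / num_classes
    width = math.ceil(approx_width)  # assumption: round up to a whole number
    classes = []
    lower = minimum                  # first lower limit taken as the minimum
    for _ in range(num_classes):
        classes.append((lower, lower + width - 1))
        lower += width
    return approx_width, classes

approx_width, classes = make_classes(5, 29, 5)
print(approx_width, classes)
```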
Now we read each value from the given data and mark a tally in the second column of Table 2.9 next to
the corresponding class. The first value in our original data set is 8, which belongs to the 5–9 class. To
record it, we mark a tally in the second column next to the 5–9 class. We continue this process until all
the data values have been read and entered in the tally column. Note that tallies are marked in blocks of
five for counting convenience. After the tally column is completed, we count the tally marks for each
class and write those numbers in the third column. This gives the column of frequencies. Each
frequency is the number of days on which the count of iPods sold fell within that class. For example, on
8 of 30 days, 15 to 19 iPods were sold.
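The tallying step for grouped data can be sketched as below. Only the ten values that appear in these notes are used, so the resulting counts are for those ten days only and do not reproduce the frequencies of Table 2.9. The function name `class_frequencies` is hypothetical.

```python
classes = [(5, 9), (10, 14), (15, 19), (20, 24), (25, 29)]

# Only the ten values visible in these notes, not the full 30-day data set.
values = [23, 14, 19, 23, 20, 16, 27, 16, 21, 14]

def class_frequencies(values, classes):
    """Count how many values fall between the limits of each class."""
    counts = {c: 0 for c in classes}
    for v in values:
        for lower, upper in classes:
            if lower <= v <= upper:
                counts[(lower, upper)] += 1  # one tally mark for this class
                break
    return counts

counts = class_frequencies(values, classes)
for (lower, upper), f in counts.items():
    print(f"{lower}-{upper}: {f}")
```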
Calculating Relative Frequency and Percentage
Calculate the relative frequencies and percentages for Table 2.9.
Solution: The relative frequencies and
percentages for the data in Table 2.9 are calculated and listed in the third and fourth columns,
respectively, of Table 2.10. Note that the class boundaries are listed in the second column of Table 2
Using Table 2.10, we can make statements about the percentage of days with iPods sold within a certain
interval. For example, on 20% of the days, 10 to 14 iPods were sold. By adding the percentages for the
first two classes, we can state that 5 to 14 iPods were sold on 30% of the days. Similarly, by adding the
percentages of the last two classes, we can state that 20 to 29 iPods were sold on 43.4% of the days.
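The interval statements above can be reproduced with a short sketch. The frequencies 3, 6, and 8 for the first three classes follow from the figures quoted in the text. The split of the remaining 13 days into 8 and 5 for the last two classes is an assumption, chosen because it matches the quoted 43.4%.

```python
classes = ["5-9", "10-14", "15-19", "20-24", "25-29"]
frequencies = [3, 6, 8, 8, 5]  # last two values (8, 5) are an assumption

n = sum(frequencies)
# Percentages rounded to one decimal place, as they would appear in a table.
percentages = [round(100 * f / n, 1) for f in frequencies]

first_two = round(percentages[0] + percentages[1], 1)  # 5 to 14 iPods
last_two = round(percentages[3] + percentages[4], 1)   # 20 to 29 iPods
print(percentages, first_two, last_two)
```

Because the last two percentages are added after rounding, the total is 43.4 rather than the exact value 13/30, which is about 43.3%. For the same reason the rounded percentages sum to 100.1 rather than exactly 100.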
2. Frequency Polygons
3. Cumulative Frequency
4. Ogive
6. Dot Plots
Grouped (quantitative) data can be displayed in a histogram or a polygon. This section describes how to construct
such graphs. We can also draw a pie chart to display the percentage distribution for a quantitative data
set. The procedure to construct a pie chart is similar to the one for qualitative data explained in Section
2.2.3; it will not be repeated in this section.
1. Histogram
- A histogram is a graph in which classes are marked on the horizontal axis and the
frequencies, relative frequencies, or percentages are marked on the vertical axis.
The frequencies, relative frequencies, or percentages are represented by the heights
of the bars. In a histogram, the bars are drawn adjacent to each other.
Example:
2. Polygons
- A graph formed by joining the midpoints of the tops of successive bars in a
histogram with straight lines.
- A polygon with relative frequencies marked on the vertical axis is called a relative
frequency polygon. Similarly, a polygon with percentages marked on the vertical
axis is called a percentage polygon.
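The points that a frequency polygon joins can be computed as below: each point sits above a class midpoint at a height equal to the class frequency. The class limits are those of the iPod example. The frequencies of the last two classes (8 and 5) are an assumption, since the table is not reproduced in these notes.

```python
classes = [(5, 9), (10, 14), (15, 19), (20, 24), (25, 29)]
frequencies = [3, 6, 8, 8, 5]  # last two values are an assumption

# One (midpoint, frequency) point per class; joining them in order with
# straight lines gives the polygon.
points = [((lower + upper) / 2, f)
          for (lower, upper), f in zip(classes, frequencies)]
print(points)
```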
Example:
Example:
Using the frequency distribution of Table 2.9, reproduced here, prepare a cumulative frequency
distribution for the number of iPods sold by that company.
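A cumulative frequency is the running total of the class frequencies, which can be sketched with `itertools.accumulate`. The frequencies of the last two classes (8 and 5) are an assumption, because Table 2.9 is not reproduced in these notes. The first three follow from the figures quoted in the text.

```python
from itertools import accumulate

upper_limits = [9, 14, 19, 24, 29]
frequencies = [3, 6, 8, 8, 5]  # last two values are an assumption

# Running total: days on which at most `upper limit` iPods were sold.
cumulative = list(accumulate(frequencies))

for upper, cf in zip(upper_limits, cumulative):
    print(f"5-{upper}: {cf}")
```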
4. Ogive
- An ogive is a curve drawn for the cumulative frequency distribution by joining
with straight lines the dots marked above the upper boundaries of classes at heights
equal to the cumulative frequencies of respective classes.
Example:
When plotted on a diagram, the cumulative frequencies give a curve that is called an ogive (pronounced
o-jive). Figure 2.12 gives an ogive for the cumulative frequency distribution of Table 2.14. To draw the
ogive in Figure 2.12, the variable, which is total iPods sold, is marked on the horizontal axis and the
cumulative frequencies on the vertical axis. Then the dots are marked above the upper boundaries of
various classes at the heights equal to the corresponding cumulative frequencies. The ogive is obtained
by joining consecutive points with straight lines. Note that the ogive starts at the lower boundary of the
first class and ends at the upper boundary of the last class.
One advantage of an ogive is that it can be used to approximate the cumulative frequency for any
interval. For example, we can use Figure 2.12 to find the number of days for which 17 or fewer iPods
were sold. First, draw a vertical line from 17 on the horizontal axis up to the ogive. Then draw a
horizontal line from the point where this line intersects the ogive to the vertical axis. This point gives the
cumulative frequency of the class 5 to 17. In Figure 2.12, this cumulative frequency is (approximately) 13
as shown by the dashed line. Therefore, 17 or fewer iPods were sold on 13 days. We can draw an ogive
for cumulative relative frequency and cumulative percentage distributions the same way as we did for
the cumulative frequency distribution.
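Reading a value off the ogive amounts to linear interpolation between two neighbouring dots. The sketch below assumes class boundaries of 4.5, 9.5, ..., 29.5 and cumulative frequencies of 3, 9, 17, 25, and 30. The last two of these depend on an assumed split of the final two class frequencies, but they do not affect the reading at 17. The function name `ogive_value` is hypothetical.

```python
def ogive_value(x, boundaries, cumulative):
    """Approximate the cumulative frequency at x by linear interpolation.

    boundaries: lower boundary of the first class, then every upper boundary.
    cumulative: 0, then the cumulative frequency of each class.
    """
    for i in range(1, len(boundaries)):
        x0, x1 = boundaries[i - 1], boundaries[i]
        if x0 <= x <= x1:
            y0, y1 = cumulative[i - 1], cumulative[i]
            return y0 + (x - x0) / (x1 - x0) * (y1 - y0)
    raise ValueError("x lies outside the range covered by the ogive")

boundaries = [4.5, 9.5, 14.5, 19.5, 24.5, 29.5]
cumulative = [0, 3, 9, 17, 25, 30]

days = ogive_value(17, boundaries, cumulative)
print(days)
```

The result agrees with the value of approximately 13 days read from the figure.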
Example:
The following are the scores of 30 college students on a statistics test.
75 52 80 96 65 79 71 87 93 95
69 72 81 61 76 86 79 68 50 92
83 84 77 64 71 87 72 92 57 98
To construct a stem-and-leaf display for these scores, we split each score into two parts. The first part
contains the first digit, which is called the stem. The second part contains the second digit, which is
called the leaf. We observe from the data that the stems for all scores are 5, 6, 7, 8, and 9 because all
the scores lie in the range 50 to 98.
After we have listed the stems, we read the leaves for all scores and record them next to the
corresponding stems on the right side of the vertical line. The complete stem-and-leaf display for scores
is shown
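The splitting of each score into a stem and a leaf can be sketched with `divmod`, using the 30 test scores listed above. Leaves are recorded in the order in which the scores are read, not sorted.

```python
scores = [
    75, 52, 80, 96, 65, 79, 71, 87, 93, 95,
    69, 72, 81, 61, 76, 86, 79, 68, 50, 92,
    83, 84, 77, 64, 71, 87, 72, 92, 57, 98,
]

stems = {}
for score in scores:
    stem, leaf = divmod(score, 10)  # first digit is the stem, second the leaf
    stems.setdefault(stem, []).append(leaf)

for stem in sorted(stems):
    leaves = " ".join(str(leaf) for leaf in stems[stem])
    print(f"{stem} | {leaves}")
```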
6. Dot Plots
- Outliers or Extreme Values - Values that are very small or very large relative to the
majority of the values in a data set are called outliers or extreme values.
Example:
The table lists the lengths of the longest field goals (in yards) made by all kickers in the American
Football Conference (AFC) of the National Football League (NFL) during the 2008 season. Create a dot
plot for these data.
Step 1. The minimum and maximum values in this data set are 26 and 57 yards, respectively. First, we
draw a horizontal line (let us call this the numbers line) with numbers that cover the given data.
Note that the numbers line shows the values from 25 to 57.
Step 2. Place a dot above the value on the numbers line that represents each distance listed in the table.
For example, S. Hauschka’s longest successful field goal of the 2008 season was 54 yards, so we place a dot
above 54 on the numbers line. If there are two or more observations with
the same value, we stack dots vertically above each other to represent those values. For example, 53
yards was the distance of the longest field goals made by four players, so we stack four dots (one for each
player) above 53 on the numbers line.
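The stacking rule can be sketched as a plain-text dot plot. The full table of field goal distances is not reproduced in these notes, so the list below contains only the values mentioned in the text (26, 57, one 54, and four 53s) and is not the real data set. The function name `text_dot_plot` is hypothetical.

```python
from collections import Counter

def text_dot_plot(values, low, high):
    """Return rows of text, top row first, with one column per whole number."""
    counts = Counter(values)
    tallest = max(counts.values())
    rows = []
    for level in range(tallest, 0, -1):
        row = "".join("*" if counts[x] >= level else " "
                      for x in range(low, high + 1))
        rows.append(row)
    return rows

# Only the distances mentioned in the text, not the full table.
distances = [26, 53, 53, 53, 53, 54, 57]
rows = text_dot_plot(distances, 25, 57)

print("\n".join(rows))
print("-" * (57 - 25 + 1))  # the numbers line, running from 25 to 57
```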
Dot plots are also very useful for comparing two or more data sets. To do so, we create a dot
plot for each data set with numbers lines for all data sets on the same scale. We place these data sets on
top of each other, resulting in what are called STACKED DOT PLOTS. Example 2 shows this procedure.
Example 2:
Refer to Table , which gives the distances of longest completed field goals for all kickers
in the AFC during the 2008 NFL season. Table 2.17 provides the same information for the kickers in the
National Football Conference (NFC) of the NFL for the 2008 season. Make dot plots for both sets of data
and compare these two dot plots.