
Republic of the Philippines

RAMON MAGSAYSAY TECHNOLOGICAL UNIVERSITY


Iba, Zambales

GRADUATE SCHOOL

Statistics for Research 3:00-6:00 PM January 4, 2022


Name: Francheska G. Alviz    Course: MAED EA

Organizing and Graphing Data


WRITTEN REPORT
Organizing and Graphing Data
1. Qualitative or Categorical Variables
2. Quantitative or Numerical Variables
Raw Data - Data recorded in the sequence in which they are collected, before they are
processed or ranked, are called raw data.
Organizing and Graphing Qualitative Data
• Frequency Distribution for Qualitative Data
- A frequency distribution for qualitative data lists all categories and the number of elements
that belong to each of the categories.

- It exhibits how the frequencies are distributed over various categories.


A sample of 100 students enrolled at a university were asked what they intended to do after graduation.
Forty-four said they wanted to work for private companies/businesses, 16 said they wanted to work for
the federal government, 23 wanted to work for state or local governments, and 17 intended to start
their own businesses. Table 2.3 lists the types of employment and the number of students who intend to
engage in each type of employment. In this table, the variable is the type of employment, which is a
qualitative variable. The categories (representing the type of employment) listed in the first column are
mutually exclusive. In other words, each of the 100 students belongs to one and only one of these
categories. The number of students who belong to a certain category is called the frequency of that
category. A frequency distribution exhibits how the frequencies are distributed over various categories.
Table 2.3 is called a frequency distribution table or simply a frequency table.

Example:
A sample of 30 employees from large companies was selected, and these employees were asked how
stressful their jobs were. The responses of these employees are recorded below, where very represents
very stressful, somewhat means somewhat stressful, and none stands for not stressful at all.
Solution: Note that the variable in this example is how stressful an employee’s job is. This variable is
classified into three categories: very stressful, somewhat stressful, and not stressful at all. We record
these categories in the first column of Table 2.4. Then we read each employee’s response from the given
data and mark a tally, denoted by the symbol |, in the second column of Table 2.4 next to the
corresponding category. For example, the first employee’s response is that his or her job is somewhat
stressful. We show this in the frequency table by marking a tally in the second column next to the
category somewhat. Note that the tallies are marked in blocks of five for counting convenience. Finally,
we record the total of the tallies for each category in the third column of the table. This column is called
the column of frequencies and is usually denoted by f. The sum of the entries in the frequency column
gives the sample size or total frequency. In Table 2.4, this total is 30, which is the sample size.
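The tallying procedure can be sketched in Python. The 30 raw responses are not reproduced in this report, so the list below is an assumption: it is rebuilt from the frequencies reported for this example (very = 10, somewhat = 14, none = 6), in arbitrary order.

```python
from collections import Counter

# Assumption: the original responses are not listed in this report, so this
# list is rebuilt from the reported frequencies (10, 14 and 6).
# The order of the responses is arbitrary.
responses = ["very"] * 10 + ["somewhat"] * 14 + ["none"] * 6

# Counter plays the role of the tally column: one count per category.
frequency = Counter(responses)

for category in ("very", "somewhat", "none"):
    print(f"{category:<10} f = {frequency[category]}")

# The sum of the frequency column gives the sample size.
total = sum(frequency.values())
print("Sum of f =", total)
```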

• Relative Frequency and Percentage Distributions


- The relative frequency of a category is obtained by dividing the frequency of that
category by the sum of all frequencies. Thus, the relative frequency shows what
fractional part or proportion of the total frequency belongs to the corresponding
category. A relative frequency distribution lists the relative frequencies for all
categories.
- Relative frequency of a category = (Frequency of that category) / (Sum of all frequencies)
- The percentage for a category is obtained by multiplying its relative frequency by 100.
A percentage distribution lists the percentages for all categories.
- Percentage = (Relative frequency) × 100

Example:
• Determine the relative frequency and percentage distributions for the data of Table 2.4.
Solution: The relative frequencies and percentages from Table 2.4 are calculated and listed in Table 2.5.
Based on this table, we can state that .333, or 33.3%, of the employees said that their jobs are very
stressful. By adding the percentages for the first two categories, we can state that 80% of the employees
said that their jobs are very or somewhat stressful. The other numbers in Table 2.5 can be interpreted
the same way. Notice that the sum of the relative frequencies is always 1.00 (or approximately 1.00 if
the relative frequencies are rounded), and the sum of the percentages is always 100 (or approximately
100 if the percentages are rounded).
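As a quick check of these figures, the calculation can be sketched in Python. The frequencies are the ones reported for the job-stress example; the variable names are illustrative only.

```python
# Frequencies of the job-stress example (very, somewhat, none).
frequency = {"Very": 10, "Somewhat": 14, "None": 6}
total = sum(frequency.values())

# Relative frequency = frequency of the category / sum of all frequencies.
relative = {cat: f / total for cat, f in frequency.items()}

# Percentage = relative frequency x 100.
percentage = {cat: 100 * r for cat, r in relative.items()}

for cat in frequency:
    print(f"{cat:<10} {frequency[cat]:>3} {relative[cat]:.3f} {percentage[cat]:.1f}")

# Share of employees whose jobs are very or somewhat stressful.
very_or_somewhat = percentage["Very"] + percentage["Somewhat"]
print(f"Very or somewhat stressful: {very_or_somewhat:.1f}%")
```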

Graphical Presentation of Qualitative Data


• Bar Graphs
• Pie Charts

1. Bar Graph
- A graph made of bars whose heights represent the frequencies of
respective categories.
- Instead of frequencies a bar graph might display the relative
frequencies or percentages of the categories.
Example:

Stress on Job    Frequency    Relative Frequency    Percentage
Very             10           .333                  33.3
Somewhat         14           .467                  46.7
None             6            .200                  20.0
Total            30           1.000                 100

2. Pie Chart
- A circle divided into portions that represent the relative frequencies or percentages
of a population or a sample belonging to different categories.
- The size of the slice representing a particular category is proportional to the
relative frequency of that category.
- Slice angle (in degrees) = Relative frequency of the category × 360
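The slice-size rule can be illustrated with a short Python sketch that uses the frequencies of the job-stress example. The angle is computed as frequency × 360 / total, which is the same as relative frequency × 360.

```python
# Frequencies of the job-stress example.
frequency = {"Very": 10, "Somewhat": 14, "None": 6}
total = sum(frequency.values())

# Angle of each slice in degrees: relative frequency x 360.
angle = {cat: f * 360 / total for cat, f in frequency.items()}

for cat, degrees in angle.items():
    print(f"{cat:<10} {degrees:.0f} degrees")

# The slices together make up the full circle.
print("Sum of angles =", sum(angle.values()))
```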

Example:

[Pie chart: Stress on Job. Very 33%, Somewhat 47%, None 20%]

Organizing and Graphing Quantitative Data


• Frequency Distribution for Quantitative Data
The table gives the weekly earnings of 100 employees of a large company. The first column lists the
classes, which represent the (quantitative) variable weekly earnings. For quantitative data, an interval
that includes all the values that fall within two numbers—the lower and upper limits—is called a class.
Note that the classes always represent a variable. As we can observe, the classes are nonoverlapping;
that is, each value on earnings belongs to one and only one class. The second column in the table lists
the number of employees who have earnings within each class. For example, 9 employees of this
company earn $801 to $1000 per week. The numbers listed in the second column are called the
frequencies, which give the number of values that belong to different classes. The frequencies are
denoted by f.
A frequency distribution for quantitative data lists all the classes and the number of values that belong
to each class. Data presented in the form of a frequency distribution are called grouped data.

Class Boundary

- The class boundary is given by the midpoint of the upper limit of one class and the
lower limit of the next class.

Class Width

- Also called class size.
- The difference between the two boundaries of a class.
- Class width = Upper boundary - Lower boundary
- Example: for the class 801 to 1000, class width = 1000.5 - 800.5 = 200

Class Midpoint or Mark

- The class midpoint or mark is obtained by dividing the sum of the two limits (or the
two boundaries) of a class by 2.
- Class midpoint or mark = (Lower limit + Upper limit) / 2
- Example: for the class 801 to 1000, class midpoint = (801 + 1000) / 2 = 900.5
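The three definitions can be checked together for the class 801 to 1000. In this sketch the limits of the neighbouring classes (800 and 1001) are assumptions: the earnings table is not reproduced here, but these values are implied by the boundaries 800.5 and 1000.5 used in the text.

```python
# Class 801 to 1000 from the weekly-earnings example.
lower_limit, upper_limit = 801, 1000

# Assumption: the previous class ends at 800 and the next class starts at
# 1001, as implied by the boundaries 800.5 and 1000.5.
prev_upper_limit, next_lower_limit = 800, 1001

# A boundary is the midpoint between the upper limit of one class
# and the lower limit of the next class.
lower_boundary = (prev_upper_limit + lower_limit) / 2
upper_boundary = (upper_limit + next_lower_limit) / 2

class_width = upper_boundary - lower_boundary
class_midpoint = (lower_limit + upper_limit) / 2

print("Boundaries:", lower_boundary, "to", upper_boundary)
print("Class width:", class_width)
print("Class midpoint:", class_midpoint)
```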

Example:
The following data give the total number of iPods sold by a mail order company on each of 30 days.
Construct a frequency distribution table.

8  25  11  15  29  22  10  5  17  21


22  13  26  16  18  12  9  26  20  16

23 14 19 23 20 16 27 16 21 14

The minimum value is 5, and the maximum value is 29. Suppose we decide to group these data using
five classes of equal width. Then,

Approximate width of each class = (Largest value - Smallest value) / Number of classes = (29 - 5) / 5 = 4.8

Now we round this approximate width to a convenient number, say 5. The lower limit of the
first class can be taken as 5 or any number less than 5. Suppose we take 5 as the lower limit of the first
class. Then our classes will be 5–9, 10–14, 15–19, 20–24, and 25–29.

Now we read each value from the given data and mark a tally in the second column of Table 2.9 next to
the corresponding class. The first value in our original data set is 8, which belongs to the 5–9 class. To
record it, we mark a tally in the second column next to the 5–9 class. We continue this process until all
the data values have been read and entered in the tally column. Note that tallies are marked in blocks of
five for counting convenience. After the tally column is completed, we count the tally marks for each
class and write those numbers in the third column. This gives the column of frequencies. These
frequencies represent the number of days on which iPods indicated in classes are sold. For example, on
8 of 30 days, 15 to 19 iPods were sold.
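The tallying of the 30 daily values into the five classes can be reproduced with a short Python sketch. The data and the class limits are taken from the example; the variable names are illustrative.

```python
# Number of iPods sold on each of 30 days (data from the example).
sales = [8, 25, 11, 15, 29, 22, 10, 5, 17, 21,
         22, 13, 26, 16, 18, 12, 9, 26, 20, 16,
         23, 14, 19, 23, 20, 16, 27, 16, 21, 14]

# Five classes of width 5, starting at the lower limit 5.
classes = [(5, 9), (10, 14), (15, 19), (20, 24), (25, 29)]

# Tally each value into the class whose limits contain it.
frequency = {c: 0 for c in classes}
for value in sales:
    for lower, upper in classes:
        if lower <= value <= upper:
            frequency[(lower, upper)] += 1
            break

for (lower, upper), f in frequency.items():
    print(f"{lower}-{upper}: f = {f}")
print("Sum of f =", sum(frequency.values()))
```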

Calculating Relative Frequency and Percentage

Calculate the relative frequencies and percentages for Table 2.9.
Solution: The relative frequencies and percentages for the data in Table 2.9 are calculated and listed
in the third and fourth columns, respectively, of Table 2.10. Note that the class boundaries are listed
in the second column of Table 2.10.
Using Table 2.10, we can make statements about the percentage of days with iPods sold within a certain
interval. For example, on 20% of the days, 10 to 14 iPods were sold. By adding the percentages for the
first two classes, we can state that 5 to 14 iPods were sold on 30% of the days. Similarly, by adding the
percentages of the last two classes, we can state that 20 to 29 iPods were sold on 43.4% of the days.
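These percentages can be verified with a small Python sketch. The class frequencies (3, 6, 8, 8, 5) are not printed in this report; they are obtained by counting the 30 daily values of the iPod example. The figure 43.4% comes from adding two rounded percentages, while the unrounded share is about 43.3%.

```python
# Class frequencies counted from the 30 daily iPod sales values.
classes = ["5-9", "10-14", "15-19", "20-24", "25-29"]
frequency = [3, 6, 8, 8, 5]
total = sum(frequency)

relative = [f / total for f in frequency]
# Percentages rounded to one decimal place, as they would appear in a table.
percentage = [round(100 * r, 1) for r in relative]

for name, f, r, p in zip(classes, frequency, relative, percentage):
    print(f"{name:<6} {f:>2} {r:.3f} {p:.1f}")

first_two = round(percentage[0] + percentage[1], 1)   # 5 to 14 iPods
last_two = round(percentage[3] + percentage[4], 1)    # 20 to 29 iPods
print("5 to 14 iPods:", first_two, "% of the days")
print("20 to 29 iPods:", last_two, "% of the days")
```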

Graphing Quantitative Data


1. Histograms

2. Frequency Polygons

3. Cumulative Frequency

4. Ogive

5. Stem and Leaf Displays

6. Dot Plots

Grouped (quantitative) data can be displayed in a histogram or a polygon. This section describes how to
construct such graphs. We can also draw a pie chart to display the percentage distribution for a
quantitative data set. The procedure to construct a pie chart is the same as the one explained earlier
for qualitative data, so it is not repeated here.

1. Histogram
- A histogram is a graph in which classes are marked on the horizontal axis and the
frequencies, relative frequencies, or percentages are marked on the vertical axis.
The frequencies, relative frequencies, or percentages are represented by the heights
of the bars. In a histogram, the bars are drawn adjacent to each other.

Example:

2. Polygons
- A graph formed by joining the midpoints of the tops of successive bars in a
histogram with straight lines.
- A polygon with relative frequencies marked on the vertical axis is called a relative
frequency polygon. Similarly, a polygon with percentages marked on the vertical
axis is called a percentage polygon
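As an illustration, the points that a frequency polygon joins can be computed for the iPod example. This is only a sketch: the class frequencies (3, 6, 8, 8, 5) are counted from the 30 daily values, and each point sits above the class midpoint at a height equal to the class frequency.

```python
# Classes and frequencies of the iPod example.
classes = [(5, 9), (10, 14), (15, 19), (20, 24), (25, 29)]
frequency = [3, 6, 8, 8, 5]
total = sum(frequency)

# The midpoint of the top of each histogram bar lies above the class midpoint.
midpoints = [(lower + upper) / 2 for lower, upper in classes]

# Points of the frequency polygon: (class midpoint, frequency).
polygon = list(zip(midpoints, frequency))

# Points of the relative frequency polygon: (class midpoint, relative frequency).
relative_polygon = [(m, f / total) for m, f in polygon]

for point in polygon:
    print(point)
```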

Example:

3. Cumulative Frequency Distribution


- A cumulative frequency distribution gives the total number of values that fall below
the upper boundary of each class.

Example:
Using the frequency distribution of Table 2.9, reproduced here, prepare a cumulative frequency
distribution for the number of iPods sold by that company.
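A minimal Python sketch of this calculation is given below. The class frequencies (3, 6, 8, 8, 5) are counted from the 30 daily values, and the upper boundaries follow from the class limits 5–9, 10–14, and so on.

```python
from itertools import accumulate

# Upper class boundaries and class frequencies of the iPod example.
upper_boundaries = [9.5, 14.5, 19.5, 24.5, 29.5]
frequency = [3, 6, 8, 8, 5]

# Running total of the frequencies: the number of values that fall
# below each upper boundary.
cumulative = list(accumulate(frequency))

for boundary, cf in zip(upper_boundaries, cumulative):
    print(f"Less than {boundary}: {cf}")
```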

4. Ogive
- An ogive is a curve drawn for the cumulative frequency distribution by joining
with straight lines the dots marked above the upper boundaries of classes at heights
equal to the cumulative frequencies of respective classes.

Example:
When plotted on a diagram, the cumulative frequencies give a curve that is called an ogive (pronounced
o-jive). Figure 2.12 gives an ogive for the cumulative frequency distribution of Table 2.14. To draw the
ogive in Figure 2.12, the variable, which is total iPods sold, is marked on the horizontal axis and the
cumulative frequencies on the vertical axis. Then the dots are marked above the upper boundaries of
various classes at the heights equal to the corresponding cumulative frequencies. The ogive is obtained
by joining consecutive points with straight lines. Note that the ogive starts at the lower boundary of the
first class and ends at the upper boundary of the last class.

One advantage of an ogive is that it can be used to approximate the cumulative frequency for any
interval. For example, we can use Figure 2.12 to find the number of days for which 17 or fewer iPods
were sold. First, draw a vertical line from 17 on the horizontal axis up to the ogive. Then draw a
horizontal line from the point where this line intersects the ogive to the vertical axis. This point gives the
cumulative frequency of the class 5 to 17. In Figure 2.12, this cumulative frequency is (approximately) 13
as shown by the dashed line. Therefore, 17 or fewer iPods were sold on 13 days. We can draw an ogive
for cumulative relative frequency and cumulative percentage distributions the same way as we did for
the cumulative frequency distribution.
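Reading a value off the ogive amounts to linear interpolation between the plotted points. The sketch below assumes the points (boundary, cumulative frequency) of the iPod example, starting at the lower boundary 4.5 with a cumulative frequency of 0. The function name ogive_value is illustrative.

```python
# Points of the ogive: class boundaries and cumulative frequencies.
boundaries = [4.5, 9.5, 14.5, 19.5, 24.5, 29.5]
cumulative = [0, 3, 9, 17, 25, 30]


def ogive_value(x):
    """Approximate the cumulative frequency at x by linear interpolation."""
    if x <= boundaries[0]:
        return 0.0
    if x >= boundaries[-1]:
        return float(cumulative[-1])
    for i in range(1, len(boundaries)):
        if x <= boundaries[i]:
            x0, x1 = boundaries[i - 1], boundaries[i]
            y0, y1 = cumulative[i - 1], cumulative[i]
            # Straight line between the two neighbouring points.
            return y0 + (x - x0) * (y1 - y0) / (x1 - x0)


# Approximate number of days on which 17 or fewer iPods were sold.
print(ogive_value(17))
```

The result agrees with the value of about 13 read from the graph. It is an approximation: the ogive assumes that the values are spread evenly within each class.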

5. Stem and Leaf Displays


- A stem-and-leaf display is a graphical method of displaying data. It is particularly
useful when the data are not too numerous.
- In a stem-and-leaf display of quantitative data, each value is divided into two
portions: a stem and a leaf. The leaves for each stem are shown separately in a
display.

Example:
The following are the scores of 30 college students on a statistics test.

75 52 80 96 65 79 71 87 93 95

69 72 81 61 76 86 79 68 50 92

83 84 77 64 71 87 72 92 57 98
To construct a stem-and-leaf display for these scores, we split each score into two parts. The first part
contains the first digit, which is called the stem. The second part contains the second digit, which is
called the leaf. We observe from the data that the stems for all scores are 5, 6, 7, 8, and 9 because all
the scores lie in the range 50 to 98.

After we have listed the stems, we read the leaves for all scores and record them next to the
corresponding stems on the right side of the vertical line. The complete stem-and-leaf display for the
scores is shown below.

5 | 2 0 7
6 | 5 9 1 8 4
7 | 5 9 1 2 6 9 7 1 2
8 | 0 7 1 6 3 4 7
9 | 6 3 5 2 2 8
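The splitting of each score into a stem and a leaf can be sketched in Python. The scores are those of the example; the leaves are kept in the order in which the scores are read, without sorting.

```python
from collections import defaultdict

# Scores of 30 college students on a statistics test (data from the example).
scores = [75, 52, 80, 96, 65, 79, 71, 87, 93, 95,
          69, 72, 81, 61, 76, 86, 79, 68, 50, 92,
          83, 84, 77, 64, 71, 87, 72, 92, 57, 98]

# Split each score into a stem (first digit) and a leaf (second digit).
display = defaultdict(list)
for score in scores:
    stem, leaf = divmod(score, 10)
    display[stem].append(leaf)

# Print one row per stem, with the leaves to the right of the vertical line.
for stem in sorted(display):
    print(stem, "|", " ".join(str(leaf) for leaf in display[stem]))
```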

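6. Dot Plots
- A dot plot displays each data value as a dot above a number line. It can help us
detect outliers in a data set.
- Outliers or Extreme Values - Values that are very small or very large relative to the
majority of the values in a data set are called outliers or extreme values.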

Example:
The table lists the lengths of the longest field goals (in yards) made by all kickers in the American
Football Conference (AFC) of the National Football League (NFL) during the 2008 season. Create a dot
plot for these data.
Step 1. The minimum and maximum values in this data set are 26 and 57 yards, respectively. First, we
draw a horizontal line (let us call this the numbers line) with numbers that cover the given data.
Note that the numbers line shows the values from 25 to 57.

Step 2. Place a dot above the value on the numbers line that represents each distance listed in the table.
For example, S. Hauschka’s longest successful field goal of the 2008 season was 54 yards. Place a dot
above 54 on the numbers line. If there are two or more observations with the same value, we stack dots
vertically above each other to represent those values. For example, 53 yards was the distance of the
longest field goals made by four players. We stack four dots (one for each player) above 53 on the
numbers line.
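The two steps can be sketched as a plain-text dot plot in Python. The table of field-goal distances is not reproduced in this report, so the sketch reuses the iPod sales data from the earlier example as a stand-in. The procedure is the same for any list of values.

```python
from collections import Counter

# Stand-in data (assumption): the iPod sales values from the earlier example,
# because the field-goal table is not reproduced in this report.
sales = [8, 25, 11, 15, 29, 22, 10, 5, 17, 21,
         22, 13, 26, 16, 18, 12, 9, 26, 20, 16,
         23, 14, 19, 23, 20, 16, 27, 16, 21, 14]

# Number of dots to stack above each value.
stack_height = Counter(sales)
tallest = max(stack_height.values())

# Step 1: the numbers line covers every value from the minimum to the maximum.
values = range(min(sales), max(sales) + 1)

# Step 2: print the stacks from the top row down, then the numbers line.
for level in range(tallest, 0, -1):
    print("".join(" * " if stack_height[v] >= level else "   " for v in values))
print("".join(f"{v:^3}" for v in values))
```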

Dot plots are also very useful for comparing two or more data sets. To do so, we create a dot
plot for each data set with numbers lines for all data sets on the same scale. We place these dot plots on
top of each other, resulting in what are called stacked dot plots. Example 2 shows this procedure.

Example 2:
Refer to the table in the previous example, which gives the distances of the longest completed field
goals for all kickers in the AFC during the 2008 NFL season. Table 2.17 provides the same information
for the kickers in the National Football Conference (NFC) of the NFL for the 2008 season. Make dot plots
for both sets of data and compare these two dot plots.

References:
https://onlinestatbook.com/2/graphing_distributions/graphing_distributions.html
https://k101.unob.cz/~neubauer/pdf/Introductory%20Statistics-Mann.pdf
https://www.academia.edu/9547067/ORGANIZING_AND_GRAPHING_DATA
https://k101.unob.cz/~neubauer/pdf/descriptive_statistics1.pdf
https://academic.macewan.ca/burok/Stat141/notes/organize.pdf
https://www.math.arizona.edu/~jwatkins/statbook.pdf
