
Basic Statistics

Frequency Distributions & Graphs

Rowina M. Twaño, DBA


STRUCTURE OF STATISTICS

STATISTICS
• DESCRIPTIVE: TABULAR, GRAPHICAL, NUMERICAL/TEXTUAL
• INFERENTIAL: CONFIDENCE INTERVALS, TESTS OF HYPOTHESIS
PRESENTATION OF DATA
3 Ways to present data

1. Textual Presentation
2. Tabular Presentation
3. Graphical Presentation
TEXTUAL PRESENTATION OF DATA
Good statistical presentation makes it easy for
readers to understand and interpret the data, and
to identify key patterns or trends.
In textual presentation, data are presented in
paragraphs or sentences.
Ex. The math test scores of 15 students out of
50 items: 47, 48, 49, 42, 36, 38, 40, 35, 50, 44,
45, 45, , 50, 50.
Findings: The lowest score is 35, and the highest
score is 50. Three students got a perfect score of
50, and one student each got 35, 36, 38, 40, 47, and 48.
Conclusion: I therefore conclude that the students
performed well in the test.
TABULAR PRESENTATION OF DATA

Tables are useful for clear presentation and
comparison of large numbers of data items. They
also allow data to be presented at a level of
detail which cannot usually be determined from
a text.
Frequency Distribution Table
• A histogram is one way to depict a frequency
distribution
• Frequency is the number of times a variable takes
on a particular value
• Note that any variable has a frequency distribution
• E.g., roll a pair of dice several times and record the
resulting sums (constrained to being between 2 and
12), count the number of times each value occurs
(the frequency of that value), and take these counts
all together to form a frequency distribution
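
The dice example can be tried out with a short simulation. This is only a minimal sketch: the function name, the number of rolls, and the fixed seed are assumptions made for illustration, not part of the lecture.

```python
import random
from collections import Counter


def dice_frequency_distribution(n_rolls, seed=0):
    """Roll two dice n_rolls times and count how often each sum occurs."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    sums = [rng.randint(1, 6) + rng.randint(1, 6) for _ in range(n_rolls)]
    counts = Counter(sums)
    # Report every possible value from 2 to 12, including values never rolled.
    return {value: counts.get(value, 0) for value in range(2, 13)}


freq = dice_frequency_distribution(n_rolls=360)
for value, count in freq.items():
    # One star per two occurrences gives a rough text histogram.
    print(f"{value:>2}  {count:>3}  {'*' * (count // 2)}")
```

Each row pairs a value with its absolute frequency; the frequencies always add up to the number of rolls.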
Frequency & Distribution
• Frequencies can be absolute (when the frequency
provided is the actual count of the occurrences) or
relative (when they are normalized by dividing the
absolute frequency by the total number of
observations, so that each value lies in [0, 1])
• Relative frequencies are particularly useful if you
want to compare distributions drawn from two
different sources (e.g., when the numbers of
observations from each source are different)
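
As a rough illustration of why relative frequencies help when sample sizes differ, the sketch below normalizes two made-up samples of 8 and 20 observations. The data and the function name are hypothetical.

```python
from collections import Counter


def relative_frequencies(observations):
    """Divide each absolute frequency by the total number of observations."""
    counts = Counter(observations)
    total = len(observations)
    return {value: count / total for value, count in sorted(counts.items())}


# Hypothetical samples of different sizes (made up for illustration).
source_a = [1] * 2 + [2] * 1 + [3] * 3 + [4] * 2    # 8 observations
source_b = [1] * 2 + [2] * 4 + [3] * 10 + [4] * 4   # 20 observations

rel_a = relative_frequencies(source_a)
rel_b = relative_frequencies(source_b)

for value in sorted(set(rel_a) | set(rel_b)):
    print(f"value {value}: A = {rel_a.get(value, 0):.3f}, B = {rel_b.get(value, 0):.3f}")
```

The absolute counts for the value 3 (3 versus 10) are not directly comparable, but the relative frequencies (0.375 versus 0.5) are.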
Step 2
Scores of 100 college students on the self-concept questionnaire.
Step 3

A possible first step in organizing data for interpretation is to
arrange the scores by size, usually from highest to lowest.
RELATIVE FREQUENCY DISTRIBUTION

● The relative frequency of a class is obtained by
dividing the class frequency by the total frequency
(usually expressed in percent form)
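
A small sketch of this calculation for grouped data follows. The scores, the class width, and the function name are assumptions chosen for illustration; the sketch also assumes integer scores and at least one score inside the classes.

```python
def grouped_relative_frequency(scores, low, width, n_classes):
    """Tally integer scores into equal-width classes and express each tally in percent."""
    counts = [0] * n_classes
    for score in scores:
        index = (score - low) // width
        if 0 <= index < n_classes:  # scores outside the classes are ignored
            counts[index] += 1
    total = sum(counts)
    table = []
    for i, count in enumerate(counts):
        lower = low + i * width
        upper = lower + width - 1  # inclusive upper limit for integer scores
        table.append((lower, upper, count, 100 * count / total))
    return table


# Hypothetical scores (made up for illustration).
scores = [12, 15, 21, 22, 22, 27, 30, 31, 33, 38]
table = grouped_relative_frequency(scores, low=10, width=10, n_classes=3)

for lower, upper, count, percent in table:
    print(f"{lower}-{upper}: f = {count}, relative f = {percent:.0f}%")
```

The percentages of all classes add up to 100.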
Grouping and Loss of Information

Tradeoff: more usable/comprehensible information vs. precise information
Ease of communication vs. accuracy
GRAPHIC PRESENTATION OF A
FREQUENCY DISTRIBUTION

• Histogram vs. Bar Graph
• Polygons (Line Graphs)
• Frequency/Relative Frequency
HISTOGRAM
The histogram is a series of columns, each having as its
base one class interval and as its height the number of
cases, or frequency, in that class.

[Figure: histogram for water usage (1,000 gallons). Ordinate (vertical axis): frequency or percent, 5% to 25%. Abscissa (horizontal axis): score.]

A histogram is a graphing technique that is
appropriate for quantitative data.
To avoid having the figure appear too flat or too
steep, it is usually best to arrange the scales so
that the height of the histogram is
2/3 to 3/4 of its width.
[Figure: percent (vertical axis, 5% to 25%) by region (South, North, West), male vs. female.]

When one is comparing two distributions that are
based on unequal numbers of observations,
percentages are preferable.
FREQUENCY POLYGON
In the polygon a point is located above the midpoint of
each class interval to represent the frequency in that
class. These points are then joined by straight lines.

[Figure: frequency polygon. Vertical axis: frequency, 0 to 15. Horizontal axis: 5 to 40.]
The polygon is extended one class interval below the lowest and
above the highest class; these extra midpoints have zero frequencies.
Frequency polygons are therefore closed at both ends.
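
The points of a frequency polygon can be worked out as in the sketch below. The class limits and frequencies are hypothetical, and the function name is an assumption. The midpoint is taken as the lower limit plus half the class width, which assumes continuous class boundaries.

```python
def polygon_points(lower_limits, width, frequencies):
    """Return (midpoint, frequency) pairs, closed with a zero frequency at each end."""
    midpoints = [lower + width / 2 for lower in lower_limits]
    points = list(zip(midpoints, frequencies))
    # Add one empty class below the lowest and one above the highest class
    # so that the polygon touches the horizontal axis at both ends.
    first = (midpoints[0] - width, 0)
    last = (midpoints[-1] + width, 0)
    return [first] + points + [last]


# Hypothetical classes 10-15, 15-20, 20-25, 25-30 (made up for illustration).
points = polygon_points(lower_limits=[10, 15, 20, 25], width=5, frequencies=[3, 8, 12, 5])
for midpoint, frequency in points:
    print(f"midpoint {midpoint:>5}: frequency {frequency}")
```

Joining these points in order with straight lines gives the polygon.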
THE BAR GRAPH

A bar graph is used to present the frequencies
of the categories of a qualitative variable. A
conventional bar graph looks exactly like a
histogram except for the wider spaces between
the bars.

A bar chart can be used to depict any of the
levels of measurement (nominal, ordinal,
interval, or ratio).
THE LINE GRAPH
⮚A line graph is used to show a picture of the relationship
between two variables.
⮚A point on a line graph represents the value on the Y
variable that goes with the corresponding value on the X
variable.
PIE CHART
⮚ A pie chart is especially useful in displaying a
relative frequency (percentage) distribution.
⮚ A circle is divided proportionally to the relative
frequency (percentage) and portions of the
circle are allocated for the different groups.

EXAMPLE
A sample of 214 college students was asked to
indicate their favorite soft drink. The results of
the survey are given on the next slide. Draw a
pie chart for this information.
PIE CHART FOR THE TASTE TEST

[Figure: pie chart with slices for Coca-Cola, Pepsi, Dr. Pepper, Seven-Up, and Others.]
STATISTICAL DISTRIBUTIONS
Review of Previous Lecture
• Range
– The difference between the largest and smallest values
• Interquartile range
– The difference between the 25th and 75th percentiles
• Variance
– The sum of squared deviations from the mean divided by
the population size (population variance) or by the
sample size minus one (sample variance)
• Standard deviation
– The square root of the variance
• Z-scores
– The number of standard deviations an observation is
away from the mean
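
The measures reviewed above can be checked on a small data set with Python's standard `statistics` module. The data are hypothetical. Note that percentile conventions differ between textbooks and software, so the interquartile range printed here reflects the module's default method and may not match a hand calculation.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data (made up for illustration)

# Range: largest value minus smallest value.
data_range = max(data) - min(data)

# Interquartile range: 75th percentile minus 25th percentile.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

# Variance: sum of squared deviations divided by N (population) or n - 1 (sample).
population_variance = statistics.pvariance(data)
sample_variance = statistics.variance(data)

# Standard deviation: square root of the variance.
population_sd = statistics.pstdev(data)

# Z-score: number of standard deviations an observation lies from the mean.
mean = statistics.mean(data)
z_scores = [(x - mean) / population_sd for x in data]

print("range:", data_range)
print("IQR:", iqr)
print("population variance:", population_variance)
print("sample variance:", round(sample_variance, 3))
print("population SD:", population_sd)
print("z-scores:", [round(z, 2) for z in z_scores])
```

Here the z-scores use the population standard deviation; using the sample standard deviation instead is an equally common choice.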
Measures of Skewness and Kurtosis
• A fundamental task in many statistical analyses is
to characterize the location and variability of a
data set (Measures of central tendency vs.
measures of dispersion)
• Both measures tell us nothing about the shape of
the distribution
• A further characterization of the data includes
skewness and kurtosis
• The histogram is an effective graphical technique
for showing both the skewness and kurtosis of a
data set
Further Moments – Skewness
• Positive skewness
– There are more observations below the mean
than above it
– When the mean is greater than the median
• Negative skewness
– There are a small number of low observations
and a large number of high ones
– When the median is greater than the mean
Further Moments – Skewness

Source: http://library.thinkquest.org/10030/3smodsas.htm
Further Moments – Skewness

• Skewness measures the degree of asymmetry
exhibited by the data
• If skewness equals zero, the histogram is
symmetric about the mean
• Positive skewness vs negative skewness
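
The slides do not give a formula for skewness. One common choice, assumed here, is the moment coefficient: the third central moment divided by the second central moment raised to the power 1.5. The data sets below are hypothetical.

```python
import statistics


def skewness(data):
    """Moment coefficient of skewness, m3 / m2**1.5 (population form, an assumed definition)."""
    n = len(data)
    mean = statistics.fmean(data)
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in data) / n  # third central moment
    return m3 / m2 ** 1.5


symmetric = [1, 2, 3, 4, 5]
right_tail = [1, 1, 2, 2, 3, 10]       # a few high values pull the mean up
left_tail = [-x for x in right_tail]   # mirror image of right_tail

for name, sample in [("symmetric", symmetric), ("right_tail", right_tail), ("left_tail", left_tail)]:
    print(f"{name}: mean = {statistics.fmean(sample):.2f}, "
          f"median = {statistics.median(sample)}, skewness = {skewness(sample):.2f}")
```

In the right-tailed sample the mean exceeds the median and the coefficient is positive; in its mirror image the median exceeds the mean and the coefficient is negative.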
Further Moments – Kurtosis
• Kurtosis measures how peaked the
histogram is
• The kurtosis of a normal distribution is 0
(when measured as excess kurtosis, i.e., kurtosis
minus 3)
• Kurtosis characterizes the relative peakedness
or flatness of a distribution compared to the
normal distribution
Further Moments – Kurtosis
• Platykurtic – When the kurtosis < 0, the
frequencies throughout the curve are closer to being
equal (i.e., the curve is flatter and wider)
• Thus, negative kurtosis indicates a relatively flat
distribution
• Leptokurtic – When the kurtosis > 0, there are
high frequencies in only a small part of the curve
(i.e., the curve is more peaked)
• Thus, positive kurtosis indicates a relatively
peaked distribution
Further Moments – Kurtosis

[Figure: platykurtic and leptokurtic curves.]

Source: http://www.riskglossary.com/link/kurtosis.htm

• Kurtosis is based on the size of a distribution's tails.
• Negative kurtosis (platykurtic) – distributions with
short tails
• Positive kurtosis (leptokurtic) – distributions with
relatively long tails
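
As with skewness, the slides give no formula for kurtosis. The sketch below assumes the moment-based excess kurtosis, the fourth central moment divided by the squared second central moment, minus 3, so that a normal distribution scores 0. Both data sets are hypothetical.

```python
def excess_kurtosis(data):
    """Excess kurtosis, m4 / m2**2 - 3 (population form, an assumed definition)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment
    return m4 / m2 ** 2 - 3


short_tails = [1, 2, 3, 4, 5, 6]                    # evenly spread, no extreme values
long_tails = [-10, 0, 0, 0, 0, 0, 0, 0, 0, 10]      # concentrated center, two extreme values

print("short tails:", round(excess_kurtosis(short_tails), 3))  # negative: platykurtic
print("long tails:", round(excess_kurtosis(long_tails), 3))    # positive: leptokurtic
```

The evenly spread sample gives a negative value and the sample with extreme values gives a positive one, matching the platykurtic and leptokurtic cases.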
Why Do We Need Kurtosis?

• These two distributions have the same variance,
approximately the same skew, but differ markedly
in kurtosis.
Source: http://davidmlane.com/hyperstat/A53638.html
Functions of a Histogram
• The function of a histogram is to graphically
summarize the distribution of a data set
• The histogram graphically shows the following:
1. Center (i.e., the location) of the data
2. Spread (i.e., the scale) of the data
3. Skewness of the data
4. Kurtosis of the data
5. Presence of outliers
6. Presence of multiple modes in the data.
Functions of a Histogram

• The histogram can be used to answer the
following questions:
1. What kind of population distribution do the
data come from?
2. Where are the data located?
3. How spread out are the data?
4. Are the data symmetric or skewed?
5. Are there outliers in the data?
TO GOD
BE THE
GLORY!!!!
