DESCRIBING DATA
• Descriptive statistical methods are used to summarize data sets so that we can extract the relevant information.
• Bar charts, pie charts, and frequency distributions are employed to summarize single sets of nominal data.
• Interval Data
➢ Histogram – created by drawing rectangles whose bases are the intervals and whose heights are the frequencies.
➢ Classes – create a frequency distribution for interval data by counting the number of observations that fall into each of a series of intervals.
➢ Class Interval
- Sturges' formula: Number of class intervals = 1 + 3.3 log(n), where n is the number of observations and log is the base-10 logarithm.
• Number of Modal Classes – a mode is the observation that occurs with the greatest frequency. A modal class is the class with the largest number of observations.
• Unimodal histogram – is one with a single peak. A special type of symmetric unimodal histogram is one that is bell shaped.
• Bimodal histogram – is one with two peaks, not necessarily equal in height.
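Sturges' formula and the class counting described above can be sketched in a few lines of Python. This is only an illustration: the sample values are made up, the helper names are hypothetical, and rounding the formula to the nearest whole number is an assumption (some texts round up instead).

```python
import math

def sturges_class_count(n):
    """Number of class intervals suggested by Sturges' formula: 1 + 3.3 log10(n).

    Rounding to the nearest integer is an assumption of this sketch.
    """
    return round(1 + 3.3 * math.log10(n))

def frequency_distribution(data, num_classes):
    """Count observations in equal-width classes covering [min, max].

    Assumes the data are not all identical (otherwise the class width is zero).
    """
    low, high = min(data), max(data)
    width = (high - low) / num_classes
    counts = [0] * num_classes
    for x in data:
        # The maximum value is placed in the last class instead of opening a new one.
        index = min(int((x - low) / width), num_classes - 1)
        counts[index] += 1
    edges = [low + i * width for i in range(num_classes + 1)]
    return edges, counts

# Hypothetical sample of 20 observations (made up for illustration).
sample = [12, 15, 17, 18, 21, 22, 22, 24, 25, 27,
          28, 30, 31, 33, 35, 36, 38, 41, 44, 50]

k = sturges_class_count(len(sample))
edges, counts = frequency_distribution(sample, k)

# Index of the modal class: the class with the largest number of observations.
modal_class = counts.index(max(counts))
```

The `counts` list holds the heights of the histogram rectangles and `edges` holds the bases; a histogram whose counts rise to one peak and then fall is unimodal.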
• Cross-sectional Data – observations at the same point in time
• Time-series Data – represent measurements at successive points in time
➢ Line Chart – plot of the variable over time
Scatter Diagram
- To know the relationship between two interval variables
- The two most important characteristics are the strength and direction of the linear relationship.
- To determine the strength of the linear relationship: draw a straight line through the points in such a way that the line represents the relationship. If most points fall close to the line, there is a linear relationship.
- There are other types of relationships, such as quadratic or exponential ones.
MEASURING INFLATION
• Inflation – is the increase in the prices for goods and services.
• Consumer Price Index (CPI) – works with a basket of some 300 goods and services in the United States (also in other countries), including such diverse items as food, housing, clothing, transportation, health, and recreation.
• Basket – is defined for the “typical” or “average” middle-income family, and the set of items and their weights are revised periodically (10 years – United States; 7 years – Canada)
1. Compute the inflation adjusted values
➢ Use the CPI
2. Convert the CPI from months to year
➢ One year as base for the index
3. Compute the inflation adjusted values. Use 2012 as the base year
➢ Compute the 2012 base
➢ Compute the 2012 CPI
➢ Compute the inflation adjusted values
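The three steps above can be sketched as follows. The CPI readings and prices are invented for illustration (they are not real CPI data), the function names are hypothetical, and averaging the monthly readings to get a yearly figure is an assumption about what "convert the CPI from months to year" means.

```python
from collections import defaultdict

# Hypothetical monthly CPI readings keyed by (year, month).
# These numbers are made up and are NOT real CPI data.
monthly_cpi = {
    (2011, 1): 220.0, (2011, 7): 224.0,
    (2012, 1): 226.0, (2012, 7): 230.0,
    (2013, 1): 230.0, (2013, 7): 234.0,
}

# Hypothetical nominal values (for example, a price in current dollars) per year.
nominal = {2011: 3.50, 2012: 3.60, 2013: 3.50}

def annual_cpi(monthly):
    """Step 2: convert monthly CPI to one figure per year (assumed: simple average)."""
    by_year = defaultdict(list)
    for (year, _month), value in monthly.items():
        by_year[year].append(value)
    return {year: sum(values) / len(values) for year, values in by_year.items()}

def rebase(cpi_by_year, base_year):
    """Rescale the index so that the base year equals 100."""
    base = cpi_by_year[base_year]
    return {year: 100 * value / base for year, value in cpi_by_year.items()}

def inflation_adjusted(nominal_by_year, index_by_year):
    """Step 3: express each value in base-year dollars."""
    return {year: value * 100 / index_by_year[year]
            for year, value in nominal_by_year.items()}

yearly = annual_cpi(monthly_cpi)
index_2012 = rebase(yearly, 2012)          # 2012 is the base year, so its index is 100
real = inflation_adjusted(nominal, index_2012)
```

With 2012 as the base year, the 2012 value is unchanged, values from earlier (cheaper) years are scaled up, and values from later years are scaled down.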
OTHERS
- Direction of the linear relationship in a scatter diagram
➢ Positive – the dependent variable increases when the independent variable increases
➢ Negative – the dependent variable decreases when the independent variable increases
- In interpreting the results of a scatter diagram it is important to understand that if two variables are linearly related it does not mean that one is causing the other. We can express this more eloquently as: correlation is not causation.
GRAPHICAL EXCELLENCE
1. The graph represents large data sets concisely and coherently. Graphical techniques -> large data sets; small data sets -> table; one or two numbers -> sentence.
2. The ideas and concepts the statistics practitioner wants to deliver are clearly understood by the viewer. The chart is designed to describe what would otherwise be described in words.
3. The graph encourages the viewer to compare two or more variables. Graphs are often best used to depict relationships between two or more variables or to explain how and why the observed results occurred.
4. The display induces the viewer to address the substance of the data and not the form of the graph.
5. There is no distortion of what the data reveal.
GRAPHICAL DECEPTION
- Figure without scale. No y-axis scale.
- Graphs with different captions. For the same graph, the interpretation might be different due to the caption.
- Showing a big drop in your graph. For this, percentage form in the y-axis is preferred.
- The first chart shows almost no difference in scale. But when adjusted, an increase in sales can now be observed. When the scale is expanded, the axis is usually truncated (zigzag) to show that the vertical axis does not begin at zero.
➢ Ogive: Relative Frequency distribution
Time Series (measuring inflation)
➢ Line chart
➢ Scatter diagram
OGIVE
• An ogive is a graphical representation of the cumulative relative frequencies.
• A frequency distribution lists the number of observations that fall into each class interval.
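An ogive plots cumulative relative frequencies against the upper limit of each class. As a minimal sketch, the points of an ogive can be computed from a frequency distribution like this; the class limits and frequencies are hypothetical, and the function name is only illustrative.

```python
from itertools import accumulate

def cumulative_relative_frequencies(counts):
    """Running total of the class frequencies, divided by the number of observations."""
    total = sum(counts)
    return [running / total for running in accumulate(counts)]

# Hypothetical frequency distribution (made up for illustration):
# upper limit of each class interval and the number of observations in it.
class_upper_limits = [10, 20, 30, 40, 50]
frequencies = [4, 6, 4, 4, 2]

# Each pair (upper limit, cumulative relative frequency) is one point on the ogive.
ogive_points = list(zip(class_upper_limits,
                        cumulative_relative_frequencies(frequencies)))
```

The last point always has a cumulative relative frequency of 1, because every observation falls at or below the upper limit of the final class.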
CHEBYSHEV’S THEOREM
• A more general interpretation of the standard deviation, which applies to all shapes of histograms.
• The proportion of observations in any sample or population that lie within k standard deviations of the mean is at least 1 − 1/k^2 (for k > 1).
• For skewed histograms:
➢ When k=2, Chebyshev’s theorem states that at least three-quarters (75%) of all observations lie within two standard deviations of the mean.
➢ When k=3, Chebyshev’s theorem states that at least eight-ninths (88.9%) of all observations lie within three standard deviations of the mean.
• Mean absolute deviation (MAD) – the mean absolute deviation of a dataset is the average distance between each observation and the mean.
• Percentile – the Pth percentile is the value for which P percent of the observations are less than that value and (100 – P)% are greater than that value.
• Quartile – measures of relative standing that divide the dataset into quarters.
• Q1 – first/lower quartile; Q2 – second/middle quartile; Q3 – third/upper quartile
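The measures above can be checked on a small data set. The following is a sketch under stated assumptions: the sample is made up (deliberately right-skewed), the helper names are hypothetical, the sample standard deviation is used, and the quartiles come from Python's `statistics.quantiles` (Python 3.8+), whose default interpolation method may differ slightly from a textbook's.

```python
import statistics

def chebyshev_lower_bound(k):
    """Minimum proportion of observations within k standard deviations: 1 - 1/k^2."""
    if k <= 1:
        raise ValueError("Chebyshev's bound is only informative for k > 1")
    return 1 - 1 / k**2

def proportion_within(data, k):
    """Observed proportion of data within k sample standard deviations of the mean."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)  # sample standard deviation (assumption of this sketch)
    inside = [x for x in data if abs(x - mean) <= k * sd]
    return len(inside) / len(data)

def mean_absolute_deviation(data):
    """Average absolute distance between each observation and the mean."""
    mean = statistics.mean(data)
    return sum(abs(x - mean) for x in data) / len(data)

# Hypothetical right-skewed sample (made up for illustration).
sample = [1, 2, 2, 3, 3, 3, 4, 4, 5, 23]

bound_k2 = chebyshev_lower_bound(2)          # theoretical minimum for k = 2
observed_k2 = proportion_within(sample, 2)   # what this sample actually shows
mad = mean_absolute_deviation(sample)
q1, q2, q3 = statistics.quantiles(sample, n=4)  # cut points dividing the data into quarters
```

Chebyshev's theorem only gives a lower bound, so the observed proportion is usually well above it; the second quartile equals the median.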
DESCRIPTIVE TECHNIQUE
➢ Coefficient of Correlation
➢ Coefficient of Determination (r^2) – measures the amount of variation in the dependent variable that is explained by the variation in the independent variable.
– we calculate it by squaring the coefficient of correlation
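As a minimal sketch, the coefficient of correlation and the coefficient of determination can be computed as below. The notes do not give a formula for the coefficient of correlation, so the standard Pearson formula is assumed here; the data and the function name are hypothetical.

```python
import math

def correlation(x, y):
    """Pearson coefficient of correlation r between two equal-length lists.

    Assumes neither variable is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired observations (made up for illustration).
x = [1, 2, 3, 4, 5]   # independent variable
y = [2, 4, 5, 4, 5]   # dependent variable

r = correlation(x, y)
r_squared = r ** 2    # coefficient of determination: the square of r
```

The sign of `r` gives the direction of the linear relationship (positive or negative) and `r_squared` gives the share of the variation in `y` explained by the variation in `x`. A large value still does not show that one variable causes the other.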