Chapter Five: Analyses and Interpretation of Data
Percentage
Measure of dispersion
Data analysis…
(iii) Mode
Mean: the sum of all measurements divided by the number of observations in the data set.
Median: the middle value when the numbers in a set of data are arranged in ascending or descending order. With an even number of observations, it is the average of the two middle values.
Mode: the value that occurs most frequently in a set of data.
◦ This is the only measure of central tendency that can be used with nominal data, which have purely qualitative category assignments.
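The three measures above can be sketched with Python's standard `statistics` module. The data values and variable names below are hypothetical, chosen only for illustration.

```python
import statistics

# Hypothetical data, chosen only for illustration
scores = [4, 7, 7, 9, 3, 7, 5]                 # numerical data
blood_types = ["A", "O", "B", "O", "AB", "O"]  # nominal data

mean_score = statistics.mean(scores)      # sum of values / number of values
median_score = statistics.median(scores)  # middle value of the sorted data
mode_score = statistics.mode(scores)      # most frequent value

# The mode is still defined for nominal categories,
# where a mean or median would be meaningless
mode_blood_type = statistics.mode(blood_types)

print(mean_score, median_score, mode_score, mode_blood_type)
```

Only the mode is computed for `blood_types`, which mirrors the point that nominal data support no other measure of central tendency.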
Example: Measures of Central Tendency
(Arithmetic Mean)
The arithmetic mean is the average of all the values
under consideration
Branch Revenue
1 50,000,000
2 150,000,000
3 40,000,000
4 60,000,000
Total = 300,000,000
Arithmetic Mean = 300,000,000 / 4 = 75,000,000
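The same arithmetic can be checked in a few lines of Python. This is a minimal sketch that uses the four branch revenues from the table; the variable names are illustrative.

```python
# Branch revenues taken from the table in this example
revenue = {1: 50_000_000, 2: 150_000_000, 3: 40_000_000, 4: 60_000_000}

total = sum(revenue.values())           # sum of all measurements
arithmetic_mean = total / len(revenue)  # divided by the number of branches

print(f"Total = {total:,}")
print(f"Arithmetic mean = {arithmetic_mean:,.0f}")
```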
Example: Measures of Central Tendency
(Median)
The Median is the midpoint of the distribution of values
under consideration
[Figure: histogram; x-axis values 1 to 9, y-axis Frequency from 0 to 10]
The shape of the distribution is said to be skewed if the
observations are not symmetrically distributed around
the center.
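Positively Skewed Distribution
A positively skewed distribution (skewed to the right) has a tail that extends in the direction of positive values.
[Figure: histogram; x-axis values 1 to 9, y-axis Frequency from 0 to 12]
Negatively Skewed Distribution
A negatively skewed distribution (skewed to the left) has a tail that extends in the direction of negative values.
[Figure: histogram; x-axis values 1 to 9, y-axis Frequency from 0 to 12]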
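One common way to quantify the direction of skew is the moment coefficient of skewness, which is positive for a right tail, negative for a left tail, and zero for a symmetric distribution. The sketch below is an illustration under that assumption; the slides do not name a specific coefficient, and the sample data are hypothetical.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (population moments)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical samples on a 1-9 scale
right_tailed = [1, 1, 1, 2, 2, 2, 3, 3, 4, 5, 7, 9]
left_tailed = [10 - x for x in right_tailed]  # mirror image of the first sample
symmetric = [1, 2, 3, 4, 5, 6, 7, 8, 9]

for name, sample in [("right-tailed", right_tailed),
                     ("left-tailed", left_tailed),
                     ("symmetric", symmetric)]:
    print(f"{name}: skewness = {skewness(sample):.3f}")
```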
Measurement of Dispersion
Dispersion measures how the values of items are scattered around the average.
It indicates how far the values of the variable lie from the average value.
Important measures of dispersion are:
Range: the difference between the maximum and minimum values of an observed variable.
Mean deviation: the average of the absolute deviations of the observations from the mean value.
Standard deviation: the square root of the average of the squared deviations from the mean.
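The three measures of dispersion can be computed as follows. This is a minimal sketch on hypothetical observations; it uses the population form of the standard deviation (dividing by n), which matches the definition given here, whereas `statistics.stdev` would divide by n - 1 for a sample.

```python
import statistics

data = [4, 8, 6, 5, 3, 10]  # hypothetical observations

mean = statistics.mean(data)

# Range: maximum minus minimum
value_range = max(data) - min(data)

# Mean deviation: average absolute distance from the mean
mean_deviation = sum(abs(x - mean) for x in data) / len(data)

# Standard deviation: square root of the average squared deviation
std_dev = statistics.pstdev(data)

print(f"range = {value_range}, mean deviation = {mean_deviation:.2f}, "
      f"standard deviation = {std_dev:.2f}")
```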
When the distribution of item in a series
happens to be perfectly symmetrical
Data Presentation
• Data in raw form are usually not easy to use for decision making
• Some type of organization is needed:
◦ Table
◦ Graph
• Data presentation: the process of transforming a mass of raw data into tables and charts as part of making sense of the data.
• It refers to preparing data in a form that a general audience can use.
• Tables:
◦ They can be used with just about all types of numerical data.
• Graphs:
◦ The type of graph to use depends on the variable being summarized.
Data presentation: The Frequency Distribution Table
Summarizes data by category (variables are categorical)
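A frequency distribution table for a categorical variable can be built by counting how often each category occurs. The sketch below uses `collections.Counter`; the list of hospital units is hypothetical sample data, not figures from the slides.

```python
from collections import Counter

# Hypothetical categorical observations (one entry per patient)
units = ["Emergency", "Surgery", "Emergency", "Maternity",
         "Surgery", "Emergency", "Cardiac Care", "Surgery"]

freq = Counter(units)       # category -> number of occurrences
total = sum(freq.values())  # total number of observations

print(f"{'Category':<15}{'Frequency':>10}{'Percent':>10}")
for category, count in freq.most_common():
    print(f"{category:<15}{count:>10}{100 * count / total:>9.1f}%")
```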
Data presentation-Cont’d
Simple Bar Chart Example
Hospital Patients by Unit
[Figure: bar chart; axis from 0 to 5000; units: Cardiac Care, Emergency, Intensive Care, Maternity, Surgery]
A Multiple Bar Chart Example
Sales by quarter for three sales territories:
[Figure: grouped bar chart; series East, West, North; x-axis 1st Qtr to 4th Qtr; y-axis from 0 to 60]
A Component bar Chart Example
• Sales by quarter for three sales territories:
Pie chart:
[Figure: pie chart; recovered labels: Intensive Care 4%, Maternity 6%]
(Percentages are rounded to the nearest percent)
Inferential Analysis
Uses sample data to draw conclusions about the complete population.
Seek to determine the relationship between
variables and test statistical significance.
Two questions should be answered to determine the
relationship between variables:
(i) Does an association exist between the two (or more) variables? If yes, of what degree?
This question is answered by using correlation techniques.
In case of bivariate population, correlation can be found using
◦ Karl Pearson’s coefficient of correlation: the simple correlation coefficient, and the most commonly used
◦ Charles Spearman’s coefficient of correlation
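Both coefficients can be sketched in plain Python. Pearson's coefficient measures linear association on the raw values, while Spearman's coefficient is Pearson's coefficient applied to the ranks of the values. The function names and the data below are illustrative assumptions.

```python
from math import sqrt

def pearson(x, y):
    """Pearson's coefficient of correlation for two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman's coefficient: Pearson's coefficient computed on the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical bivariate data: y rises with x, but not linearly
advertising = [1, 2, 3, 4, 5]
sales = [1, 4, 9, 16, 25]

print(f"Pearson  r   = {pearson(advertising, sales):.4f}")
print(f"Spearman rho = {spearman(advertising, sales):.4f}")
```

The sign of each coefficient gives the direction of the relationship and its absolute value gives the magnitude. Because `sales` always increases with `advertising`, Spearman's coefficient reaches 1 even though the relationship is not a straight line.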
Measures of relationship:
Need to determine whether there is a
relationship between variables
Correlation
• Magnitude
• Direction
Tests of significance
There are two general classes of significance tests:
Parametric hypothesis testing
• When the data are interval- or ratio-scaled (gross national product, industry sales volume) and the sample size is large
• It assumes that the data in the study are drawn from populations with normal (bell-shaped) distributions and/or normal sampling distributions
Non-parametric hypothesis testing
• When the data are either ordinal or nominal
• Examples: Chi-square, Kolmogorov-Smirnov test
Example:
◦ Null hypothesis = no difference in brain activation between these 2 conditions
◦ Experimental (alternative) hypothesis = there is a difference in brain activation between these 2 conditions
T-test
A t-test helps to compare whether two groups have
different average values
◦ for example, whether men and women have different
average heights.
This analysis is appropriate whenever you want to
compare the means of two groups.
Figure 1. Idealized distributions for treated and comparison group posttest values.
One sample t-test
Impact of one independent variable on a dependent/response variable
◦ E.g., the effect of the number of patients on the weekly sales of the store
Paired samples
Paired samples t-tests typically consist of a sample
of matched pairs of similar units, or one group of
units that has been tested twice (a "repeated
measures" t-test)
◦ Paired sample t-test is used in ‘before-after’ studies.
A typical example of the repeated measures t-test would
be where subjects are tested prior to a treatment, say for
high blood pressure, and the same subjects are tested
again after treatment with a blood-pressure lowering
medication.
By comparing the same patient's numbers before and after treatment, each patient effectively serves as their own control.
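A paired-samples t statistic is computed from the per-subject differences: the mean difference divided by its standard error. The sketch below assumes hypothetical blood pressure readings for five patients; `paired_t` is an illustrative helper, not a library function.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """t statistic and degrees of freedom for a paired-samples t-test."""
    diffs = [a - b for b, a in zip(before, after)]  # after minus before
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)  # sample sd of the differences
    return mean(diffs) / standard_error, n - 1

# Hypothetical systolic blood pressure for the same five patients
before = [150, 160, 145, 155, 170]
after = [140, 152, 143, 147, 158]

t_stat, df = paired_t(before, after)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```

The resulting t value would then be compared with the critical value of the t distribution for the same degrees of freedom, or converted to a p-value with statistical software.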
Independent/unpaired samples/The two-sample
t-test
The independent samples t-test is used when two
separate sets of independent and identically
distributed samples are obtained, one from each of
the two populations being compared.
It tests for significant differences in the means of
two distinct populations.
◦ For example, we can use this test to see if there are
significant differences in how men and women score the
new concept
◦ E.g., two teachers who teach different sections of the same course
◦ Test the effect of the teacher on students’ grades/scores
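For two independent groups, one common form of the test is Student's t statistic with a pooled variance, which assumes the two populations have equal variances. The sketch below uses that form as an assumption (Welch's version would be used when variances differ); the exam scores and the `independent_t` helper are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def independent_t(sample_a, sample_b):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    df = na + nb - 2
    # Weighted average of the two sample variances
    pooled_var = ((na - 1) * variance(sample_a)
                  + (nb - 1) * variance(sample_b)) / df
    standard_error = sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(sample_a) - mean(sample_b)) / standard_error, df

# Hypothetical exam scores for two sections taught by different teachers
section_1 = [70, 75, 80, 85, 90]
section_2 = [60, 65, 70, 75, 80]

t_stat, df = independent_t(section_1, section_2)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```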
Analysis of variance/ANOVA
ANOVA is a technique from statistical inference that allows us to deal with several populations.
It is a hypothesis test that compares the means of more than two populations.
The ANOVA test assumes three things:
◦ Normality: each sample must be drawn from a normally distributed population
◦ Independence: the observations must be independent in each sample
◦ Homogeneity: the variances of the groups should be approximately equal.
These assumptions can be tested using statistical
software.
◦ The assumption of homogeneity of variance can be tested
using tests such as Levene’s test or the Brown-Forsythe Test.
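The comparison of several group means in a one-way ANOVA rests on the F statistic, the ratio of between-group variability to within-group variability. The sketch below is a minimal illustration on hypothetical scores; `one_way_anova_f` is an illustrative helper, and in practice statistical software would also report the p-value.

```python
from statistics import mean

def one_way_anova_f(*groups):
    """F statistic for a one-way ANOVA, with between and within df."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical scores for three groups
group_a = [4, 5, 6]
group_b = [6, 7, 8]
group_c = [8, 9, 10]

f_stat, df_between, df_within = one_way_anova_f(group_a, group_b, group_c)
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

A large F value suggests that at least one group mean differs from the others, provided the normality, independence, and homogeneity assumptions hold.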