
Analyzing Performance Data
(or how to convert your numbers to information)
Nickcy Mbuthia
Data Analysis

Even the greatest amount and best quality data mean
nothing if not properly analyzed, or if not analyzed at all
Data analysis.
What is data and what is information?
 Data refers to raw, unprocessed numbers,
measurements, or text.
 Information refers to data that are
processed, organized, structured, or
presented in a specific context.
 The process of transforming data into
information is data analysis.
Data Analysis

 Analysis does not mean using a computer software package
 Analysis is looking at the data in light of the
questions you need to answer
Answering programmatic
questions
 Question: Is my organization/department
meeting its objectives?
 Analysis: Compare departmental targets and
actual departmental performance to learn
how far you are from target.
 Interpretation: Why you have or have not
achieved the target and what this means for
your department.
 May require more information.
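The comparison described in the Analysis step can be sketched in a few lines of Python. The indicator names and all numbers below are hypothetical, invented only to illustrate the arithmetic; they do not come from any program data.

```python
# Hypothetical departmental targets and actual results (illustrative only).
targets = {"clients_seen": 1200, "hiv_tests_done": 800}
actuals = {"clients_seen": 1050, "hiv_tests_done": 860}

def gap_from_target(target, actual):
    """Return the gap (actual minus target) and the percent of target achieved."""
    gap = actual - target
    percent_achieved = actual / target * 100
    return gap, percent_achieved

for indicator, target in targets.items():
    gap, pct = gap_from_target(target, actuals[indicator])
    status = "met" if gap >= 0 else "not met"
    print(f"{indicator}: target {target}, actual {actuals[indicator]}, "
          f"gap {gap:+d}, {pct:.1f}% of target ({status})")
```

The output says only how far each indicator is from its target. Explaining why is the interpretation step.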
Types of Data
 Discrete Data: limited number of choices
 Binary: two choices (yes/no)
 Dead or alive
 Disease-free or not
 Categorical: more than two choices, not
ordered
 Race
 Age group
 Ordinal: more than two choices, ordered
 Stages of a cancer
 Likert scale for response
 E.g. strongly agree, agree, neither agree nor disagree
Types of data
 Continuous data
 Theoretically infinite possible values (within
physiologic limits), including fractional values
 Height, age, weight
 Can be interval
 Interval between measures has meaning.
 Ratio of two interval data points has no meaning
 E.g. temperature in Celsius, day of the year
 Can be ratio
 Ratio of the measures has meaning
 Weight, height
Types of Data: Why important?

 The type of data defines:


 The summary measures used
 Mean, Standard deviation for continuous
data
 Proportions for discrete data
 Statistics used for analysis:
 Examples:
 T-test for normally distributed continuous
data
Descriptive analysis

 Describes the sample/target population (demographic & clinic
characteristics)
 Does not define causality – tells you
what, not why
 Example – average number of
clients seen per month
Basic terminology and concepts

 Statistical terms
 Ratio
 Proportion
 Percentage
 Rate
 Mean
 Median
Ratio
• Comparison of two numbers expressed as:
– a to b, a per b, a:b
• Used to express such comparisons as clinicians
to patients or beds to clients
• Calculation a/b

• Example – In Hospital X, there are 600 nurses and 200 beds. What is the
ratio of nurses to beds?
600 ÷ 200 = 3 nurses per bed, a ratio of 3:1
Proportion
 A ratio in which all individuals in the numerator
are also in the denominator.
 Used to compare part of the whole, such as
proportion of all clients who are less than 15
years old
 Example: If 20 of 100 clients on treatment are
less than 15 years of age, what is the
proportion of young clients in the clinic?
 20/100 = 1/5
Percentage
 A way to express a proportion (proportion
multiplied by 100)
 Example: Males comprise 2/5 of the clients,
or 40% of the clients are male (0.40 x 100)
 Allows us to express a quantity relative to
another quantity. Can compare different
groups, facilities, countries that may have
different denominators
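The three measures above (ratio, proportion, percentage) can be checked with a short Python sketch. It uses the numbers from the slides; the variable names are illustrative, and `Fraction` is used only so that the results reduce to lowest terms automatically.

```python
from fractions import Fraction

# Ratio: 600 nurses to 200 beds (Hospital X example).
nurses, beds = 600, 200
ratio = Fraction(nurses, beds)              # reduces to 3/1
print(f"{ratio.numerator}:{ratio.denominator} nurses to beds")

# Proportion: 20 of 100 clients on treatment are under 15 years old.
proportion = Fraction(20, 100)              # reduces to 1/5
print(f"proportion of young clients = {proportion}")

# Percentage: a proportion multiplied by 100 (2/5 of the clients are male).
percent_male = Fraction(2, 5) * 100
print(f"{percent_male}% of clients are male")
```

In the proportion, everyone in the numerator is also counted in the denominator. That is not true of the nurse-to-bed ratio.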
Rate
 Measured with respect to another
measured quantity during the same time
period
 Used to express the frequency of specific
events in a certain time period (fertility
rate, mortality rate)
 Numerator and denominator must be from
same time period
 Often expressed as a ratio (per 1,000)
Infant Mortality Rate

 Calculation
 # of deaths ÷ population at risk in same time
period x 1,000
 Example – 75 infants (less than one year)
died out of 4,000 infants born that year
 75/4,000 = .0187 x 1,000 = 18.7

19 infants died per 1,000 live births
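The rate calculation can be written as a small helper. This is a sketch, not a standard function: the name `rate_per` and its `multiplier` parameter are assumptions made for illustration.

```python
def rate_per(events, population_at_risk, multiplier=1000):
    """Events divided by the population at risk in the same period, scaled by multiplier."""
    if population_at_risk <= 0:
        raise ValueError("population at risk must be positive")
    return events * multiplier / population_at_risk

# 75 infant deaths out of 4,000 infants born in the same year.
imr = rate_per(75, 4000)
print(f"Infant mortality rate: {imr:.2f} per 1,000 live births")
print(f"Rounded: about {round(imr)} per 1,000")
```

The numerator and denominator must cover the same time period, which the helper cannot check for you.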


Numeric Descriptive Statistics

 Measures of central tendency of data


 Mean
 Median
 Mode
 Measures of variability of data
 Standard Deviation
 Interquartile range
Mean

 The average of your dataset


 The value obtained by dividing the sum of a
set of quantities by the number of quantities
in the set
 Example: (22+18+30+19+37+33) ÷ 6 = 159 ÷ 6 = 26.5
 The mean is sensitive to extreme values
Median
 The middle of a distribution (when numbers are
in order: half of the numbers are above the
median and half are below the median)
 The median is not as sensitive to extreme values
as the mean
 Odd number of numbers, median = the
middle number
 Median of 2, 4, 7 = 4
 Even number of numbers, median = mean of
the two middle numbers
 Median of 2, 4, 7, 12 = (4+7) /2 = 5.5
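The mean and median can be compared using Python's standard `statistics` module. The first dataset is the one from the Mean slide; the extra value of 300 is a hypothetical extreme value added only to show how each measure reacts.

```python
from statistics import mean, median

values = [22, 18, 30, 19, 37, 33]     # dataset from the Mean example
with_extreme = values + [300]         # 300 is a hypothetical extreme value

for label, data in (("original", values), ("with extreme value", with_extreme)):
    print(f"{label}: mean = {mean(data):.1f}, median = {median(data):.1f}")

# The two cases from the Median slide: odd and even number of values.
print(median([2, 4, 7]))        # middle number
print(median([2, 4, 7, 12]))    # mean of the two middle numbers
```

Adding the extreme value moves the mean from 26.5 to about 65.6, while the median moves only from 26 to 30.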
Sample Mode

 Infrequently reported as a value in studies.

 Is the most common value

 More frequently used to describe the distribution of data
 Uni-modal, bi-modal, etc.
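A minimal sketch of finding the mode(s) with `statistics.multimode` (Python 3.8 or later). The ages are hypothetical, and labelling a dataset by the number of tied most-common values is a simplification: in practice, "bi-modal" usually describes two peaks in a histogram rather than an exact tie.

```python
from statistics import multimode

def describe_modes(values):
    """Return the most common value(s) and a rough label for how many there are."""
    modes = multimode(values)      # every value tied for most frequent
    labels = {1: "uni-modal", 2: "bi-modal"}
    return modes, labels.get(len(modes), "multi-modal")

# Hypothetical client ages, invented for illustration.
print(describe_modes([24, 31, 31, 40, 52]))
print(describe_modes([19, 19, 23, 35, 35, 41]))
```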
Standard Deviation
 This is a measure of how spread out numbers are, i.e. how far they
typically lie from the mean
 Using SD gives a ‘standard’ way of knowing
what is normal and what is “off”
 A low SD indicates the data points are very
close to the mean and vice versa
 It is used to measure confidence in statistical
conclusions
Standard Deviation

 E.g. three datasets (0, 0, 14, 14), (0, 6, 8, 14) & (6, 6, 8, 8) all
have a mean of 7. Their SDs are 7, 5 & 1 respectively. The 3rd dataset
has a much smaller SD than the others
because its values are all close to 7
 So if the SD is too high it shows that the data
is too widely spread and should raise
questions
Calculation
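 Work out the mean: e.g.
(600 + 470 + 170 + 430 + 300) ÷ 5 = 1970 ÷ 5 = 394
 Calculate the variance (the mean of the squared differences from the mean):
[(600-394)² + (470-394)² + (170-394)² + (430-394)² + (300-394)²] ÷ 5
= [206² + 76² + (-224)² + 36² + (-94)²] ÷ 5 = 108520 ÷ 5 = 21704
 SD = square root of 21704 ≈ 147
 If the data are roughly normally distributed, about 68% of the values
lie within 394 ± 147 (1 SD)
 About 95% of the values lie within 394 ± 294 (2 SD)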
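The worked calculation can be reproduced step by step in Python. The sketch follows the slide in dividing by the number of values, which gives the population SD (`statistics.pstdev`). `statistics.stdev` divides by n - 1 instead and returns the slightly larger sample SD.

```python
from statistics import mean, pstdev

values = [600, 470, 170, 430, 300]

m = mean(values)                                     # step 1: the mean
squared_diffs = [(x - m) ** 2 for x in values]       # step 2: squared differences
variance = sum(squared_diffs) / len(values)          # step 3: their average
sd = variance ** 0.5                                 # step 4: square root

print(f"mean = {m}, variance = {variance}, SD = {sd:.1f}")
print(f"within 1 SD: {m - sd:.0f} to {m + sd:.0f}")
print(f"within 2 SD: {m - 2 * sd:.0f} to {m + 2 * sd:.0f}")

# The three datasets from the earlier Standard Deviation slide.
for data in ([0, 0, 14, 14], [0, 6, 8, 14], [6, 6, 8, 8]):
    print(data, "mean =", mean(data), "SD =", pstdev(data))
```

The 68% and 95% figures apply only when the data are approximately normally distributed. Five values are too few to judge that.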
Key messages

 Purpose of analysis is to provide answers to programmatic
questions
 Descriptive analyses describe the
sample/target population
 Descriptive analyses do not define
causality – that is, they tell you what, not
why
Data Presentation
Summarizing data

 Tables
 Simplest way to summarize data
 Data are presented as absolute numbers or
percentages
 Charts and graphs
 Visual representation of data
 Data are presented as absolute numbers or
percentages
Basic guidance when summarizing
data
 Every table or graph should have a title or
heading
 The x- and y-axes of a graph should be labeled; include value labels
(such as a percentage sign) and a legend
 Always cite the source of your data and put the
date when the data were collected or published
 Provide the sample size or the number of people
to which the graph is referring (N)
 Include a footnote if the graphic isn’t self-explanatory
Tables: Frequency distribution

Set of categories with numerical counts

Year    Number of births
1900    61
1901    58
1902    75

What would you add to this table to provide more information?
Tables: Relative frequency

Relative frequency (%) = (number of values within an interval ÷ total number of values in the table) x 100

Year         # births (n)   Relative frequency (%)
1900–1909    35             26.5
1910–1919    46             34.8
1920–1929    51             38.6
Total        132            100.0
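The relative frequencies can be recomputed from the counts with a short Python sketch. The counts (35, 46, 51) are taken from the table; the dictionary layout is just one convenient way to hold them.

```python
# Births per decade, as listed in the table.
counts = {"1900–1909": 35, "1910–1919": 46, "1920–1929": 51}

total = sum(counts.values())
relative = {decade: n / total * 100 for decade, n in counts.items()}

print(f"{'Year':<12}{'n':>5}{'%':>8}")
for decade, n in counts.items():
    print(f"{decade:<12}{n:>5}{relative[decade]:>8.1f}")
print(f"{'Total':<12}{total:>5}{sum(relative.values()):>8.1f}")
```

Rounded percentages do not always add up to exactly 100, so the total is best computed from the unrounded values.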


Tables
Percentage of births by decade between 1900 and 1929
Year         Number of births (n)   Relative frequency (%)
1900–1909    35                     26.5
1910–1919    46                     34.8
1920–1929    51                     38.6
Total        132                    100.0
Source: U.S. Census data; 1900–1929.

To interpret this table, we should look at the relative
frequencies. What do they tell us?
Charts and graphs

 Charts and graphs are used to portray:


 Trends, relationships, and comparisons
 The most informative are simple and self-
explanatory
 Although they are easier to read than tables,
charts provide less detail. The loss of detail
may be replaced by a better understanding of
the data.
Use the right type of graphic

 Charts and graphs


 Bar chart: comparisons, categories of data
 Line graph: display trends over time
 Pie chart: show percentages or proportional share
Bar chart
Comparing categories

[Figure: bar chart with no title or axis labels. Y-axis ticks: 0 to 6. X-axis: Quarter 1, Quarter 2, Quarter 3, Quarter 4. Legend: Site 1, Site 2, Site 3.]

What would you add to this chart to provide more information?
Percentage of new enrollees tested for HIV at each site, by quarter

[Figure: bar chart. Y-axis: % of new enrollees tested for HIV (0 to 6). X-axis (Months): Quarter 1 to Quarter 4 (Q1 Jan–Mar, Q2 Apr–June, Q3 July–Sept, Q4 Oct–Dec). Legend: Site 1, Site 2, Site 3.]

Data Source: Program records, AIDS Relief, January 2009 – December 2009.
Source: Quarterly Country Summary: Nigeria, 2008
Has the program met its goal?

Percentage of new enrollees tested for HIV at each site, by quarter

[Figure: bar chart with a target line. Y-axis: % of new enrollees tested for HIV (0% to 60%). X-axis: Quarter 1 to Quarter 4. Legend: Site 1, Site 2, Site 3, Target.]

Data Source: Program records, AIDS Relief, January 2009 – December 2009.
Quarterly Country Summary: Nigeria, 2008
Stacked bar chart
Represent components of whole & compare wholes
Number of Months Female and Male Patients Have Been
Enrolled in HIV Care, by Age Group

[Figure: stacked horizontal bar chart. X-axis: Number of months patients have been enrolled in HIV care (0 to 15). Bar segment labels: Females 4, 10; Males 3, 6. Legend: 0-14 years, 15+ years.]

Data source: AIDSRelief program records January 2009 - 2011


Line graph
Displays trends over time
Number of Clinicians Working in Each Clinic During Years 1–4*

[Figure: line graph. Y-axis: Number of clinicians (0 to 6). X-axis: Year 1 to Year 4. Legend: Clinic 1, Clinic 2, Clinic 3.]
*Includes doctors and nurses

What can be added to this graph to make it clearer?
Line graph
Number of Clinicians Working in Each Clinic During Years 1–4*

[Figure: line graph. Y-axis: Number of clinicians (0 to 6). X-axis: Year 1 to Year 4 (Y1 1995, Y2 1996, Y3 1997, Y4 1998). Legend: Clinic 1, Clinic 2, Clinic 3.]
*Includes doctors and nurses
Zambia Service Provision Assessment, 2007.
Pie chart
Contribution to the total = 100%
Percentage of All Patients Enrolled by Quarter

[Figure: pie chart. Slice labels: 59%, 23%, 10%, 8%. Legend: 1st Qtr, 2nd Qtr, 3rd Qtr, 4th Qtr.]

N=150
PivotTable® and PivotChart®
 Used to summarize, analyze, explore, and
present summary data.
 A PivotChart report helps visualize the summary data in a PivotTable
report, making it easy to see comparisons, patterns, and trends.
 Both a PivotTable report and a PivotChart
report enable you to make informed decisions
about critical data in your enterprise
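PivotTable and PivotChart are spreadsheet features, but the summarizing idea behind them can be sketched in plain Python: group records by a row label and a column label, then total a value for each pair. The records, site names, and the `pivot_sum` helper below are all hypothetical, invented for illustration; this is not how the spreadsheet tool is implemented.

```python
from collections import defaultdict

# Hypothetical service records: (site, quarter, clients_tested). Illustrative only.
records = [
    ("Site 1", "Q1", 12), ("Site 1", "Q2", 15),
    ("Site 2", "Q1", 8),  ("Site 2", "Q2", 11),
    ("Site 1", "Q1", 5),  ("Site 3", "Q2", 9),
]

def pivot_sum(rows):
    """Total the value for each (row label, column label) pair."""
    table = defaultdict(lambda: defaultdict(int))
    for row_label, col_label, value in rows:
        table[row_label][col_label] += value
    return table

table = pivot_sum(records)
quarters = sorted({quarter for _, quarter, _ in records})

print(f"{'Site':<8}" + "".join(f"{q:>6}" for q in quarters))
for site in sorted(table):
    print(f"{site:<8}" + "".join(f"{table[site][q]:>6}" for q in quarters))
```

Site and quarter combinations with no records show as 0 in the summary.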
Interpreting Data

Answering the “WHY”


Interpreting data
 Adding meaning to information by making
connections and comparisons and exploring
causes and consequences
 Move from the ‘what’ is happening in our
departments to the ‘why’ it is happening

Relevance of finding → Reasons for finding → Consider other data → Conduct further research
Interpretation – relevance of finding
 Does the indicator meet the target?
 How far from the target is it?
 How does it compare (to other time periods,
other facilities)?
 Are there any extreme highs and lows in the
data?
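One simple way to look for extreme highs and lows is to flag values that lie more than 2 SD from the mean. This is a rough sketch under assumptions: the monthly counts are hypothetical, the function name `flag_extremes` is invented, and the 2 SD cut-off is a rule of thumb rather than a fixed standard.

```python
from statistics import mean, pstdev

def flag_extremes(values, k=2):
    """Return the values that lie more than k standard deviations from the mean."""
    m, sd = mean(values), pstdev(values)
    return [x for x in values if abs(x - m) > k * sd]

# Hypothetical number of clients seen per month (illustrative only).
monthly_clients = [110, 120, 115, 118, 112, 300, 117, 113]
print(flag_extremes(monthly_clients))
```

A flagged value is only a prompt to check the data and ask why it happened. It is not evidence of an error in itself.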
Interpretation – Possible causes?
• Supplement with expert opinion
• Others with knowledge of the program or target
population
For example, if your data show that you have not met your
targets, you may want to know whether the doctors know the necessity
of foot exams in diabetic patients.

Interpretation – Consider other data
Use routine service data to clarify questions
- Calculate doctor-to-client ratio, review availability of
equipment, etc.
Use other data sources because descriptive statistics do not
show causality

Interpretation – conduct further
research
 Data gap → develop a QI project
 Methodology depends on questions being
asked and resources available

Relevance of finding → Reasons for finding → Consider other data → Develop a QI project
Key messages

 Use the right graph for the right data


 Tables – can display a large amount of data
 Graphs/charts – visual, easier to detect patterns
 Label the components of your graphic
 Interpreting data adds meaning by making
connections and comparisons to program
 Service data are good at tracking progress &
identifying concerns – do not show causality
