
Data Analysis and Interpretation

Mega Hasanul Huda, Ns.Sp.Kep.An., MARS., PhD


1 Research and Statistics in Health Care
2 Organizing, Displaying, & Describing Data
3 Statistical Inference: probability and normal distribution
Research and Statistics in Health Care
‘ Evidence-based practice
has become the standard by
which clinical and public
health guidelines are
produced ‘
Andrews & Redmond, 2004; McNaughton et al,
2004; Polit & Beck, 2008; Stevens, 2001
Descriptive Study
Studies whose primary purpose is to describe and explore a situation or event

Explanatory Study
Studies that have the primary purpose to elucidate the relationship among variables

Prediction & Control Study
Studies that are conducted to determine which variables are predictive of other variables and to determine causality
STUDY PLAN
1. Statement of the problem and its significance
2. Theoretical or conceptual framework
3. Research questions to be answered by the study
4. List of hypotheses to be tested
5. Definition of key terms and variables
6. Description of the research design
7. Description of the sample and how it was obtained
8. Description of the planned statistical analysis
9. Statement of assumptions and limitations
10. Dissemination plan
Organizing, Displaying, & Describing Data
Variables & their measurement
VARIABLE
‘ Any characteristic that
can and does assume
different values for the
different people, objects,
or events being studied ‘
Plichta & Kelvin, 2013
Nominal
Codes representing categories or characteristics

Ordinal
Representing categories that can be placed in a meaningful numerical order

Interval
Measured with numbers that can be placed in meaningful numerical order and have equal intervals between values

Ratio
Measured with numbers that can be placed in meaningful numerical order, have equal intervals, and have a ‘true zero’
‘ It is usually best to gather data at the
highest level of measurement for research
variables since this permits the researcher to
perform more mathematical operation and
gain greater precision in measurement ‘
The prevalence of anemia among pregnant women in Indonesia within 5 years

No of respondent   HB level   No of respondent   Age (year)
1                  7          11                 9
2                  12         12                 11
3                  11         13                 8
4                  10         14                 6.5
5                  9          15                 11
6                  7          16                 6
7                  6          17                 7
8                  10.5       18                 11.5
9                  10         19                 8
10                 11         20                 8

Change it into nominal, ordinal, interval data
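One possible way to approach this exercise is sketched below for the HB levels of respondents 1 to 10. The cut-offs are assumptions made for illustration (anemia in pregnancy taken as Hb below 11 g/dL, with severity bands at 10 and 7 g/dL); the slides themselves do not state them, and the function names are hypothetical.

```python
# HB levels (g/dL) of respondents 1-10, copied from the table above.
hb_levels = [7, 12, 11, 10, 9, 7, 6, 10.5, 10, 11]


def anemia_nominal(hb):
    """Nominal recoding: two unordered categories (assumed cut-off: Hb < 11)."""
    return "anemic" if hb < 11 else "not anemic"


def anemia_ordinal(hb):
    """Ordinal recoding: ordered severity categories (assumed cut-offs)."""
    if hb < 7:
        return "severe"
    if hb < 10:
        return "moderate"
    if hb < 11:
        return "mild"
    return "none"


nominal = [anemia_nominal(hb) for hb in hb_levels]
ordinal = [anemia_ordinal(hb) for hb in hb_levels]

for hb, nom, ordn in zip(hb_levels, nominal, ordinal):
    print(hb, nom, ordn)
```

The raw HB values are kept as numbers; recoding them into categories moves the variable to a lower level of measurement and discards precision, which is the point made in the quotation above.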


USING VISUAL DISPLAY TO DESCRIBE DATA
Table
• A table helps readers glean information about central tendency, dispersion, and outliers
• Do not try to do too much in a table
• Use white space
• Order the data in an array
• Values are compared down columns
• No more than 2 decimals
• Sort the data into class intervals
• Not everything displayed in a table needs to be mentioned in the text
• If a finding is well explained in words, then a table is not necessary
• A table should be as self-explanatory as possible
• Title: state the variable, when, where, size of sample.

Chart
• Charts, the visual representations of frequency distributions, provide a global, bird’s-eye view of the data and help readers gain insight
• A chart can help readers see the data’s characteristics through its skewness
• Consists of an x-axis (class intervals) and a 𝑦-axis (raw and relative frequency) *
• The length of the 𝑦-axis should be roughly 2/3 or ¾ of the x-axis *
Histogram
• The bars of a histogram touch each other
• If there are too few bars, some information may be lost
• If there are too many bars, the chart will look cluttered
[Figure: example histograms; 𝑦-axis shows frequency from 0 to 12]

Frequency polygon
• Is a chart for interval or ratio variables
• Smoother than a histogram
• Total area is 100%
• Constructed from a histogram
• A dot is placed in the middle of the top of each interval bar
• The dots are connected in order with straight lines, then the histogram is erased, leaving a rough estimate of the shape of the data distribution
No of respondent   Age
1                  7
2                  11
3                  11
4                  23
5                  9
6                  37
7                  26
8                  40
9                  42
10                 15

Stem | Leaf
0    | 7 9
1    | 1 1 5
2    | 3 6
3    | 7
4    | 0 2
Stem & Leaf Display
• Known as stemplots
• Alternative way of graphing data
• Similar to a histogram
• Advantage: preserves the individual values of the variable
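The stem-and-leaf display for the ten ages can be reproduced with a few lines of code. This is a minimal sketch that assumes two-digit values at most (tens digit as stem, units digit as leaf); `stem_and_leaf` is a hypothetical helper name.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group values into {stem: [leaves]} with tens as stems, units as leaves."""
    plot = defaultdict(list)
    for value in sorted(values):
        plot[value // 10].append(value % 10)
    return dict(plot)


# Ages of the ten respondents from the slide.
ages = [7, 11, 11, 23, 9, 37, 26, 40, 42, 15]
plot = stem_and_leaf(ages)

for stem in sorted(plot):
    print(stem, "|", " ".join(str(leaf) for leaf in plot[stem]))
```

Because every leaf is an original digit, the raw values can be read back from the display, which is the advantage noted above.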
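Percentile rank, cumulative frequency, & ogives
• The percentile rank of a value is the percentage of observations that fall at or below that value
• The 50th percentile (50%), or 𝑃50, lies in the middle of the data
• An ogive is the cumulative frequency graph used to obtain percentiles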
[Figure: example bar chart and pie chart; bar chart classes 100-119, 120-139, 140-159, 160-179, 180-199]

Bar Chart
• Is a chart for nominal or ordinal data
• Drawn to represent the frequency or percentage in each category
• Each bar should be separated from the others
• The width and spacing are at the researcher’s discretion and should be equal

Pie Chart
• Is an alternative to the bar chart
• Represents the percentage of each category
• Suggestions:
✓ No more than 6 sectors
✓ Read clockwise, starting from 12 o’clock
✓ Use a low-key shading pattern so as not to distract from the meaning
DESCRIBING DATA WITH SIMPLE STATISTICS
Characteristics that can be described in descriptive statistics: Central Tendency, Dispersion, Skewness, Kurtosis

Central Tendency

Mode
The most frequently occurring value
Used to describe nominal data

Median
The value placed in the middle of the distribution
Appropriate for ordinal, interval, and ratio level variables

Mean
The arithmetic average of the distribution and a measure of central tendency
The most appropriate to describe ratio and interval level data
The respondents’ ages:
7, 9, 11, 11, 15, 23, 26, 37, 40, 42

State the mode, median and mean from the data above.
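The exercise can be checked with Python's standard `statistics` module. This is only a quick verification sketch; the variable names are illustrative.

```python
import statistics

# Ages of the ten respondents, as listed in the exercise.
ages = [7, 9, 11, 11, 15, 23, 26, 37, 40, 42]

mode = statistics.mode(ages)      # most frequent value
median = statistics.median(ages)  # average of the 5th and 6th ordered values
mean = statistics.mean(ages)      # sum of the values divided by n

print("mode:", mode)
print("median:", median)
print("mean:", mean)
```

With an even number of observations the median is the average of the two middle values, here 15 and 23.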
Dispersion

Range
• The difference between the highest and lowest values of the data

Interquartile range
• Is the middle 50% of the data, that is, the range between the 75th and 25th percentiles
• Used when the median is the measure of central tendency

Standard deviation and variance
• The standard deviation is the square root of the variance; it shows the average distance of the points from the mean

Coefficient of variation
• Is used when comparing the variation of two or more different variables that are measured in different units
Variance’s formula
SD’s formula
Coefficient of Variation’s formula
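The formula images did not survive on these slides. As a reminder, the usual sample versions are given below; this assumes the sample (n − 1) form rather than the population form, which the slides do not specify. Here $x_i$ are the observations, $\bar{x}$ is the sample mean and $n$ is the sample size.

```latex
\[
s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}
\qquad
s = \sqrt{s^2}
\qquad
CV = \frac{s}{\bar{x}} \times 100\%
\]
```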
The respondents’ weight:

No of respondent   Age
1                  37
2                  41
3                  41
4                  53
5                  69
6                  33
7                  26
8                  40
9                  42
10                 25

Case:
A pair of shoes of brand ‘A’ can be used for 11 years before breaking, with an SD of 1.6 years. On the other hand, shoes of brand ’R’ can be used 2 years longer than brand ’A’, with an SD of 2.5. Which shoes have better quality?
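The shoe case can be worked through with the coefficient of variation. The sketch below assumes that "better quality" is judged by relative consistency of lifetime (a lower CV), which is one reasonable reading of the question but not the only one, since brand R also lasts longer on average. The helper name is hypothetical.

```python
def coefficient_of_variation(mean, sd):
    """Return the coefficient of variation as a percentage."""
    return sd / mean * 100


# Brand A: mean lifetime 11 years, SD 1.6.
cv_a = coefficient_of_variation(11, 1.6)
# Brand R: lasts 2 years longer than brand A, SD 2.5.
cv_r = coefficient_of_variation(11 + 2, 2.5)

print(f"CV brand A: {cv_a:.1f}%")
print(f"CV brand R: {cv_r:.1f}%")
print("More consistent brand:", "A" if cv_a < cv_r else "R")
```

Under this reading brand A varies less relative to its mean, even though brand R has the longer average lifetime.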
PROBABILITY & THE NORMAL DISTRIBUTION

1 A priori probability
Also known as theoretical or classical probability, is the distribution of events that can be inferred without collecting data

2 A posteriori probability
Also known as empirical or relative frequency probability, is the distribution of events for which data must be collected by some process and the probability must be estimated from the data
Sample space
Is the set of all possible outcomes of a study

Probability distribution
Is the set of probabilities associated with each possible outcome in the sample space
MARGINAL
The number of times the event occurred divided by the total number of times that it could have occurred

CONDITIONAL
The probability that one event will occur given that another event has occurred

JOINT
The co-occurrence of two or more events
Washing hands before eating   Diarrhea: Yes   Diarrhea: No
Yes                           12              33
No                            40              10
Total                         52              43

Marginal probability
p(A) = (number of times A occurs) / N
p(washes hands) = 45 / 95 = 0.473
It means that 47.3% of the respondents wash their hands before eating.

Conditional probability
p(diarrhea | washes hands) = 12 / 45 = 0.267
p(diarrhea | does not wash hands) = 40 / 50 = 0.8
It means that 26.7% of the people who wash their hands before eating have diarrhea, compared to 80% of those who do not wash their hands.

Joint probability
p(diarrhea and does not wash hands) = 40 / 95 = 0.421
It means that 42.1% of the respondents both have diarrhea and do not wash their hands before eating.
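The three kinds of probability can be recomputed directly from the cell counts of the hand-washing table (12, 33, 40, 10). This is a minimal sketch; the dictionary keys and variable names are illustrative. Note that the row total for respondents who wash their hands is 12 + 33 = 45.

```python
# Cell counts from the cross-tabulation: (hand washing, diarrhea) -> count.
counts = {
    ("wash", "diarrhea"): 12,
    ("wash", "no diarrhea"): 33,
    ("no wash", "diarrhea"): 40,
    ("no wash", "no diarrhea"): 10,
}
n = sum(counts.values())  # total number of respondents

# Row totals for each hand-washing group.
n_wash = counts[("wash", "diarrhea")] + counts[("wash", "no diarrhea")]
n_no_wash = counts[("no wash", "diarrhea")] + counts[("no wash", "no diarrhea")]

# Marginal: probability of washing hands, ignoring diarrhea status.
p_wash = n_wash / n
# Conditional: probability of diarrhea within each hand-washing group.
p_diarrhea_given_wash = counts[("wash", "diarrhea")] / n_wash
p_diarrhea_given_no_wash = counts[("no wash", "diarrhea")] / n_no_wash
# Joint: probability of having diarrhea AND not washing hands.
p_diarrhea_and_no_wash = counts[("no wash", "diarrhea")] / n

print(round(p_wash, 3), round(p_diarrhea_given_wash, 3),
      round(p_diarrhea_given_no_wash, 3), round(p_diarrhea_and_no_wash, 3))
```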
Sensitivity, specificity, predictive value, and efficiency

Screening / Diagnosis   Condition present                    Condition absent
Test positive           True positive (TP)                   False positive (FP), Type I error
Test negative           False negative (FN), Type II error   True negative (TN)
Sensitivity
Sn = TP / (TP + FN) × 100

Specificity
Sp = TN / (TN + FP) × 100

Positive predictive value
PPV = TP / (TP + FP) × 100

Negative predictive value
NPV = TN / (TN + FN) × 100

Efficiency
EFF = (TP + TN) / (TP + TN + FP + FN) × 100
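The five formulas translate directly into code. The counts in the example call are hypothetical values chosen for illustration only; they are not data from the slides, and `screening_metrics` is an assumed helper name.

```python
def screening_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV, NPV and efficiency in percent."""
    return {
        "sensitivity": tp / (tp + fn) * 100,
        "specificity": tn / (tn + fp) * 100,
        "ppv": tp / (tp + fp) * 100,
        "npv": tn / (tn + fn) * 100,
        "efficiency": (tp + tn) / (tp + tn + fp + fn) * 100,
    }


# Hypothetical screening results: 100 people with the condition, 100 without.
metrics = screening_metrics(tp=80, fp=10, fn=20, tn=90)

for name, value in metrics.items():
    print(f"{name}: {value:.1f}%")
```

Sensitivity and specificity are computed down the columns of the 2 × 2 table (by true condition), while the predictive values are computed across the rows (by test result).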
Normal Distribution
• Also known as Gaussian distribution
• Has single peak and symmetrical shape
• 𝜒=𝜇
• Mean, median and mode are equal
• The total area under the curve and above the x-axis is equal to 1
Outline
• Statistical inference
• Independent t-test and Mann-Whitney U-test
• Cross tabulation
Statistical inference
• Parameter estimation
  - Point estimation – sample mean, median, variance & SD
  - Interval estimation – CI with lower and upper limits
• Hypothesis testing
Hypothesis
• Key to health services research.
• We can develop and test hypotheses from good theoretical or conceptual models and theoretical structures using representative samples and appropriate research designs.
• Hypotheses help researchers to explain the expected relationships between variables.
• A testable hypothesis identifies groups and variables being compared and expected relationships.
Hypothesis
• Null hypothesis - H0
• Alternative hypothesis - H𝑎 or H𝑟
  - Directional
  - Non-directional
Hypothesis testing
• Classical approach to assess the statistical significance of research findings.
• The value of a computed statistic is significant when it is different from what is expected by chance alone.
• Hypotheses are stated in their alternative forms. Inferential statistics test the H0 and a decision is made on the H0.
• The criteria to either reject or accept the H0 are based on the α-level and p-value.
Statistical significance
• The p-value of a statistical test represents the probability of obtaining results at least as extreme as those observed by chance alone, that is, if the null hypothesis were true.
• The p-value is computed from the data and is not known until the test is complete.
• The α-level is the specific level of the p-value that is defined as statistically significant.
• The common α-levels used are .10, .05 and .01.


Type of errors
Power of a study
• Ability to detect statistically significant differences (1 – β). The specific equation to calculate the power of a study depends upon the type of comparisons being made.
• The four characteristics used in power analysis include α-level, power (1 – β), sample size (n) and population effect size (γ).
• Effect size represents the magnitude of the association between variables.
• The strategies to increase study power include increasing the sample size, increasing the α-level and studying larger effect sizes.
POWER AND EFFECT SIZE
Steps in hypothesis testing
i. State the hypothesis (null and alternative hypothesis).
ii. Define the significance level (α-level).
iii. Data should meet necessary assumptions to calculate
the test statistic.
iv. Calculate the parameters being compared by the test
statistic (means or proportions)
v. Calculate the test statistic and obtain the p-value of
the calculated statistic.
vi. Determine the statistical significance and state the
conclusion clearly.
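The six steps can be followed in code for the simplest case, a two-sided one-sample z-test. This is a sketch under stated assumptions: the population standard deviation is treated as known, the numbers (population mean 100, SD 15, sample mean 104.5, n = 36) are hypothetical, and `one_sample_z_test` is an illustrative helper, not a procedure taken from the slides.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, pop_mean, pop_sd, n, alpha=0.05):
    """Two-sided z-test of H0: population mean equals pop_mean."""
    standard_error = pop_sd / sqrt(n)
    z = (sample_mean - pop_mean) / standard_error        # test statistic
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))         # two-sided p-value
    return z, p_value, p_value < alpha                   # True means reject H0


# Steps i-ii: H0: mu = 100, Ha: mu != 100, alpha = .05 (hypothetical values).
z, p_value, reject_h0 = one_sample_z_test(104.5, 100, 15, 36)

# Step vi: state the conclusion.
print(f"z = {z:.2f}, p = {p_value:.4f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

The decision rule in the last step is the comparison of the p-value with the α-level fixed in step ii.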
Z statistic table
probability table
Confidence interval (CI)
• The mean from the sample data does not represent the exact value of the population mean.
• The CI gives a range of values within which the population mean is likely to lie.
• Common CIs used are 95% or 99%.
• The reliability coefficient based on z-scores (z-table) and the standard error of the mean are used to compute the CI.
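A z-based confidence interval for a mean is the sample mean plus or minus the reliability coefficient times the standard error. The sketch below uses hypothetical summary values (mean 120, SD 15, n = 100) and assumes the sample is large enough for the z-table to be appropriate; for small samples a t-based coefficient is normally used instead.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(mean, sd, n, confidence=0.95):
    """Return (lower, upper) limits of a z-based CI for the mean."""
    # Reliability coefficient: about 1.96 for 95%, about 2.58 for 99%.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    standard_error = sd / sqrt(n)
    return mean - z * standard_error, mean + z * standard_error


# Hypothetical sample summary.
lower, upper = mean_confidence_interval(mean=120, sd=15, n=100)
print(f"95% CI: {lower:.2f} to {upper:.2f}")
```

Raising the confidence level to 99% widens the interval, while a larger sample size narrows it through the smaller standard error.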
Independent t-test and Mann-Whitney test
• Some research studies focus on testing differences between 2 groups.
• Grouping variable – independent variable or exposure or hypothesized cause.
• Characteristic of interest – dependent variable or outcome.
• The independent sample t-test and Mann-Whitney U-test are used to compare the distribution of variables for 2 different groups.
Independent sample t-test
• A parametric test.
• Assumptions for independent t-test.
i. The independent variable must be dichotomous
ii. The two groups are independent of each other
iii. The dependent variable is normally distributed
iv. Linearity
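Since the computation slides are images, here is a minimal sketch of the pooled-variance form of the independent t statistic. It assumes equal variances in the two groups; the scores are hypothetical, and the function name is illustrative. The standard library has no t distribution, so the result is compared with a tabled critical value rather than converted to a p-value.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """Return (t, df) for an independent t-test assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(group_a) - mean(group_b)) / standard_error
    return t, df


# Hypothetical scores for two independent groups.
group_a = [12, 15, 11, 14, 13]
group_b = [9, 10, 8, 12, 11]
t, df = pooled_t_statistic(group_a, group_b)

# Two-tailed critical value from a t table for df = 8, alpha = .05.
critical_t = 2.306
print(f"t = {t:.2f}, df = {df}, significant: {abs(t) > critical_t}")
```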
Steps in computing independent t-test
independent t-test example
Steps in computing independent t-test
(spss)
independent t-test - sample size and
power
MANN-WHITNEY U-test
• A non-parametric test
• Determines the relationship between 2 variables when one variable is dichotomous and the other variable is ordinal.
• Used when the independent t-test assumptions are not met.
i. Small sample
ii. Non-normally distributed data
iii. Ordinal data
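The U statistic itself can be obtained by comparing every observation in one group with every observation in the other. The sketch below uses this pair-counting definition with ties counted as one half; the ordinal scores are hypothetical and the function name is illustrative. It returns only the statistic, not a p-value.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y) by counting pairwise wins; ties count as 0.5."""
    u_x = 0.0
    for xi in x:
        for yi in y:
            if xi > yi:
                u_x += 1
            elif xi == yi:
                u_x += 0.5
    u_y = len(x) * len(y) - u_x  # the two U values sum to n_x * n_y
    return u_x, u_y


# Hypothetical ordinal scores (for example, ratings on a 1-5 scale).
group_a = [3, 4, 2, 5]
group_b = [1, 2, 1, 3]
u_a, u_b = mann_whitney_u(group_a, group_b)

# The smaller U is the one compared with the tabled critical value.
print("U_a =", u_a, "U_b =", u_b, "U =", min(u_a, u_b))
```

Only the ordering of the values matters here, which is why the test suits ordinal or non-normally distributed data.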
MANN-WHITNEY U-test example
Steps in computing MANN-WHITNEY U-test (SPSS)
Independent t-test and Mann-Whitney test study
CROSS-TABULATION TABLES
• Provide a graphical display of the relationship of 2 categorical variables to each other.
• These tables show the joint probability distribution of the 2 variables and are used when both variables are nominal (or ordinal with a very limited set of categories).
• Marginal, joint and conditional probabilities and unadjusted odds ratios can be obtained.
• The statistical significance of a contingency table can be assessed using the chi-square statistic, Fisher’s exact test or the McNemar test.
• When the measures of the 2 variables are independent of each other, the chi-square statistic, the chi-square statistic with Yates’ correction and Fisher’s exact test can be used; when the measures are not independent (paired), the McNemar test can be used.
Chi-square statistic and related statistics
Chi-square statistic
• A non-parametric test.
• Used when the following assumptions are met.
i. The data are frequency data.
ii. There is an adequate sample size.
iii. The measures are independent of each other.
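As an illustration of how the statistic is built, the sketch below computes Pearson's chi-square (without Yates' correction) by comparing observed counts with the counts expected under independence. It reuses the hand-washing and diarrhea counts (12, 33, 40, 10) from the probability section of these slides; the function name is hypothetical.

```python
def chi_square(observed):
    """Return (chi-square, df) for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (obs - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df


# Rows: washes hands yes / no. Columns: diarrhea yes / no.
observed = [[12, 33], [40, 10]]
chi2, df = chi_square(observed)

# Tabled critical value of chi-square for df = 1, alpha = .05.
critical_value = 3.841
print(f"chi-square = {chi2:.2f}, df = {df}, significant: {chi2 > critical_value}")
```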
Steps in computing Chi-square statistic
Chi-square statistic example
Cross tabulation table
Marginal probabilities
Conditional probabilities and unadjusted OR
Steps in computing Chi-square statistic
(spss)
Chi-square statistic study
McNemar test
• Tests the statistical significance of changes in 2 paired or non-independent measures of dichotomous variables.
• The observations can be from a pretest-posttest or matched control design.
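For paired dichotomous data only the discordant pairs (those that changed between the two measurements) enter the McNemar statistic. The sketch below uses the uncorrected form, (b − c)² / (b + c), which is one common version; the counts are hypothetical and the function name is illustrative.

```python
def mcnemar_statistic(b, c):
    """Uncorrected McNemar chi-square from the two discordant cell counts.

    b: pairs that changed from 'yes' (pretest) to 'no' (posttest)
    c: pairs that changed from 'no' (pretest) to 'yes' (posttest)
    """
    return (b - c) ** 2 / (b + c)


# Hypothetical pretest-posttest design: 15 pairs changed one way, 5 the other.
statistic = mcnemar_statistic(b=15, c=5)

# Tabled critical value of chi-square for df = 1, alpha = .05.
critical_value = 3.841
print(f"McNemar chi-square = {statistic:.2f}, "
      f"significant: {statistic > critical_value}")
```

Pairs that gave the same answer on both occasions do not affect the statistic, because they carry no information about change.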
Steps in computing McNemar test (SPSS)
McNemar test example
McNemar test study
References
• de Almeida Tavares, J. P., da Silva, A. L., Sá-Couto, P., Boltz, M., & Capezuti, E. (2017). Percepção dos enfermeiros sobre o cuidado a idosos hospitalizados-estudo comparativo entre as regiões Norte e Central de Portugal. Revista Latino-Americana de Enfermagem, 25, e2757.
• http://cfcc.edu/faculty/cmoore/0801-HypothesisTests.pdf
• Lee, C. Y., Hsu, H. C., & Lee, C. H. (2016). Effects of aging simulation program on nurses’ attitudes and willingness toward elder care. Taiwan Geriatric Gerontol, 11(2), 105-115.
• Plichta, S. B., Kelvin, E. A., & Munro, B. H. (2013). Munro’s statistical methods for health care research. Wolters Kluwer Health/Lippincott Williams & Wilkins.
