NURSING
Dr. R. C. Ram
Professor (Demography & Statistics)
Department of Community Medicine
Pt. JNM Medical College, Raipur
Means of data analysis
• Manual
• M. S. Excel programme
• Software packages:
i) Statistical Analysis Systems (SAS)
- a most comprehensive statistical system
Types of data
• Nominal
• Ordinal
• Numerical (quantitative)
Nominal/ categorical data
•Sex
•marital status
•caste
•religion
•residence etc.
Numerical (quantitative) data
•Weight
•age
•height
•blood pressure
•income etc.
(Studied by using measures of average,
variability, skewness, correlation, regression etc.)
Types of Analysis
By Tables
By Drawings
Tabular Presentation
Contingency tables r*c (for qualitative i.e.
nominal/ categorical data)
• Bar chart
• Circular diagram
• Map diagram
• Pictogram
Graphical display for numerical data:
• Histogram
• Frequency polygon
• Frequency curve
• Ogive
• Scatter plot
• Line graph
Measures of central tendency/ Averages
$$\bar{x} = \frac{\sum x}{n}$$
Ex. Diastolic BP in mmHg of 10 healthy persons
90, 70, 80, 84, 82, 72, 78, 84, 90, 80
$$\bar{x} = \frac{\sum x}{n} = \frac{810}{10} = 81 \text{ mmHg}$$
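The arithmetic for this example can be checked with a few lines of Python. This is only a sketch using the ten diastolic BP readings listed above; the variable names are illustrative.

```python
# Diastolic BP (mmHg) of the 10 healthy persons from the example
dbp = [90, 70, 80, 84, 82, 72, 78, 84, 90, 80]

total = sum(dbp)    # sum of all observations
n = len(dbp)        # number of observations
x_bar = total / n   # arithmetic mean = (sum of x) / n

print(total, n, x_bar)
```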
$$\text{S.D.} = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$$
For n < 30 (small sample)
$$\text{S.D.} = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
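As a sketch, both versions of the S.D. can be computed for the diastolic BP readings from the earlier example. The cross-check against Python's `statistics` module is an addition for illustration: `pstdev` divides by n and `stdev` divides by n - 1.

```python
import math
from statistics import pstdev, stdev

# Diastolic BP (mmHg) readings from the earlier example
dbp = [90, 70, 80, 84, 82, 72, 78, 84, 90, 80]

n = len(dbp)
x_bar = sum(dbp) / n
ss = sum((x - x_bar) ** 2 for x in dbp)  # sum of squared deviations

sd_n = math.sqrt(ss / n)          # divisor n
sd_n1 = math.sqrt(ss / (n - 1))   # divisor n - 1 (small sample, n < 30)

# Cross-check against the standard library implementations
assert math.isclose(sd_n, pstdev(dbp))
assert math.isclose(sd_n1, stdev(dbp))

print(round(sd_n, 2), round(sd_n1, 2))
```

Since n = 10 here, the n - 1 version is the one the small-sample rule calls for.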
Mean ± 1 S.D.: contains about 68% of observations (for normally distributed data)
$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \, \sum (y - \bar{y})^2}}$$
-1 ≤ r ≤ 1
Types                   Correlation coefficient
Negative and perfect    r = -1
Negative and partial    -1 < r < 0
No correlation          r = 0
Positive and partial    0 < r < 1
Positive and perfect    r = 1
However, the significance of r can only be
examined by using Student’s t test.
Using t-test,
$$t = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}}$$
with n - 2 degrees of freedom
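A minimal sketch of the correlation coefficient and its t statistic follows. The slides give no paired data, so the age and systolic BP values below are hypothetical, as are the function names.

```python
import math

def pearson_r(x, y):
    """Product-moment correlation coefficient r for paired lists x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_for_r(r, n):
    """t statistic for testing r, with n - 2 degrees of freedom.
    Not defined for r = +1 or r = -1 (division by zero)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Hypothetical paired data (not from the slides)
age = [30, 35, 40, 45, 50, 55]
sbp = [118, 121, 126, 124, 132, 135]

r = pearson_r(age, sbp)
t = t_for_r(r, len(age))
print(round(r, 3), round(t, 3))
```

The computed t is then compared with the tabulated t value at n - 2 degrees of freedom.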
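Rank correlation coefficient (Spearman):
$$r = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$$
where d is the difference between the two ranks of each pair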
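The rank-based formula r = 1 - 6Σd²/(n(n² - 1)) can be sketched as below. The helper names are illustrative, and the simple ranking used here assumes there are no tied values.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n. Assumes no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def rank_correlation(x, y):
    """Rank correlation: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(x)
    rx, ry = ranks(x), ranks(y)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))  # sum of squared rank differences
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Hypothetical scores given by two examiners to five students
examiner_a = [1, 2, 3, 4, 5]
examiner_b = [2, 1, 4, 3, 5]
print(rank_correlation(examiner_a, examiner_b))
```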
Odds Ratio (OR)
Relationship between two nominal characteristics
That is, the relationship between a risk factor and
the occurrence of a given outcome (say disease).
Provides a way to look at risk in case- control
studies.
Unlike cohort studies, case-control studies start from the
outcome: one group with the disease and another without
it are selected.
The OR is one of the two risk ratios, the other being the relative risk (RR)
OR = (Odds that a person with an adverse outcome was
at risk) / (Odds that a person without an adverse
outcome was at risk)
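The definition of the OR can be sketched for a 2 x 2 case-control table. The counts below are hypothetical and the function name is illustrative.

```python
def odds_ratio(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed):
    """OR = odds of exposure among cases / odds of exposure among controls."""
    odds_cases = cases_exposed / cases_unexposed
    odds_controls = controls_exposed / controls_unexposed
    return odds_cases / odds_controls

# Hypothetical counts: 40 of 100 cases and 20 of 100 controls were exposed
or_value = odds_ratio(40, 60, 20, 80)
print(round(or_value, 2))
```

An OR above 1 suggests the risk factor is more common among those with the outcome; an OR below 1 suggests the opposite.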
Odds Ratio (OR)
RR = 0.581
RR = 0.58, which is < 1, means that patients on aspirin
were 0.58 times as likely to have an MI as those in
the placebo group, i.e. their risk was about 42% lower.
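A relative risk is the ratio of two risks (proportions with the outcome). The counts in this sketch are not given in the slides; they are illustrative values, assumed only because they give an RR close to 0.58.

```python
def relative_risk(events_treated, n_treated, events_control, n_control):
    """RR = risk in the treated group / risk in the control group."""
    risk_treated = events_treated / n_treated
    risk_control = events_control / n_control
    return risk_treated / risk_control

# Assumed counts (illustrative): MI in 139 of 11037 on aspirin, 239 of 11034 on placebo
rr = relative_risk(139, 11037, 239, 11034)
print(round(rr, 3))
```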
Types of regression:
i) Simple/linear regression
ii) Multiple regression
iii) Logistic regression
Linear regression
Only one explanatory (independent)variable is
used to predict an outcome
Correlation and regression measure only a
straight line or linear relationship between two
variables
y = a + bx is the regression equation of y on x,
where x is the independent and y the dependent
variable, a is the intercept and b is the slope
Regression coefficient
If y: dependent/ response variable
x: independent/ predictor variable
Then the regression coefficient of y on x
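$$b_{yx} = \frac{y - \bar{y}}{x - \bar{x}}$$
for points on the regression line, which passes through $(\bar{x}, \bar{y})$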
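In practice the line y = a + bx is usually fitted by least squares. The sketch below uses the standard least-squares estimates, b = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)² and a = ȳ - b·x̄, which are an assumption here since the slides do not state the fitting method. The data are hypothetical.

```python
def fit_line(x, y):
    """Least-squares estimates (a, b) for the line y = a + b * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx      # regression coefficient of y on x (slope)
    a = my - b * mx    # intercept: the line passes through (mean x, mean y)
    return a, b

# Hypothetical data lying exactly on y = 1 + 2x
a, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(a, b)
```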
If $R_{1.23} = 1$,
then the correlation is perfect.
Multiple Regression
It is a generalization of simple regression
Two or more explanatory variables are used to predict
an outcome
All variables are numerical
Y = a + b1 x1 +b2 x2
Example (IS: insulin sensitivity):
Predicted IS = 2.291 – 0.068*BMI – 0.0045*Age
Logistic Regression
Also named as logistic model or logit model
Used for prediction of the prob. of occurrence of
an event , say HD
The outcome variable is binary/ dichotomous
The independent variables include both numerical
and nominal measures
Before applying logistic regression, apply the χ2-test
to determine whether an independent variable
adds significantly to the prediction
The logistic model for 3 predictors
logit(p) = ln(p / (1 - p)) = b0 + b1x1 + b2x2 + b3x3
p : prob. of the occurrence of the outcome, say HD
x1 : Sex ( male-1, female-0)
x2 : BP
x3 : Age
logit(p) = 2 + ln(1.2) (sex) + ln(1.01) (BP) + ln(1.04) (age)
so the odds ratios exp(b) are 1.2, 1.01 and 1.04
i.e. i) males have 20% higher odds of HD than females
ii) for every unit increase in BP, the odds of HD
increase by a factor of 1.01 (or by 1%)
iii) for every 1 year increase in age, the odds of HD
increase by a factor of 1.04 (or by 4%)
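Two points about the logit model are easy to check numerically: the logit can be inverted to give a probability, and a coefficient b corresponds to an odds ratio of exp(b). The coefficients in this sketch are hypothetical, chosen so that the odds ratios are 1.2, 1.01 and 1.04; they are not fitted values.

```python
import math

def probability_from_logit(logit):
    """Invert logit(p) = ln(p / (1 - p)) to recover p."""
    return 1.0 / (1.0 + math.exp(-logit))

def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1.0 - p)

# Hypothetical coefficients (assumptions for illustration only)
b0 = -8.0
b_sex, b_bp, b_age = math.log(1.2), math.log(1.01), math.log(1.04)

def logit_hd(sex, bp, age):
    """Linear predictor b0 + b1*sex + b2*BP + b3*age (sex: male = 1, female = 0)."""
    return b0 + b_sex * sex + b_bp * bp + b_age * age

p_male = probability_from_logit(logit_hd(1, 80, 50))
p_female = probability_from_logit(logit_hd(0, 80, 50))

# With BP and age held fixed, the male/female odds ratio equals exp(b_sex)
or_sex = odds(p_male) / odds(p_female)
print(round(p_male, 4), round(p_female, 4), round(or_sex, 3))
```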
II. Inferential Analysis
(Hypothesis Testing/ Tests of Significance)
• Parametric Tests
• Non- parametric Tests
Parametric Tests
Statistical tests that make assumptions
regarding the distribution of the observations.
Some parametric tests are:
Z- test
t- test
F- test
ANOVA
MANOVA
ANCOVA
Non- parametric Tests
Statistical tests that make no assumption
regarding the distribution of the observations.
Also called distribution free methods.
Z – test
(Large sample test)
Applications:
• Significance of difference between two
means
• Significance of difference between two
proportions
• Significance of difference between two
standard deviations
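As one illustration of these applications, a large-sample test for the difference between two proportions can be sketched as follows. The pooled-proportion form of the standard error is assumed, and the counts are hypothetical.

```python
import math
from statistics import NormalDist

def z_two_proportions(x1, n1, x2, n2):
    """Large-sample z statistic and two-sided p-value for H0: p1 = p2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common proportion under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical counts: 60 of 100 in group 1 and 40 of 100 in group 2
z, p_value = z_two_proportions(60, 100, 40, 100)
print(round(z, 3), round(p_value, 4))
```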
t – test (Student’s t – test)
(Small sample test: n< 30)
Applications:
• Significance of difference between two
means
i) Unpaired t – test (Independent samples)
ii) Paired t- test (Dependent samples)
• Significance of correlation coefficient
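The two forms of the t-test for means can be sketched as below. The unpaired version assumes equal variances in the two groups (pooled variance), and all data are hypothetical. The functions return the t statistic and its degrees of freedom, which are then looked up in a t table.

```python
import math
from statistics import mean, stdev, variance

def unpaired_t(x, y):
    """Independent-samples t statistic (pooled variance) and degrees of freedom."""
    n1, n2 = len(x), len(y)
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    t = (mean(x) - mean(y)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

def paired_t(before, after):
    """Paired t statistic on the differences (before - after) and degrees of freedom."""
    d = [b - a for b, a in zip(before, after)]
    n = len(d)
    t = mean(d) / (stdev(d) / math.sqrt(n))
    return t, n - 1

# Hypothetical data: two independent groups, then the same subjects measured twice
print(unpaired_t([1, 2, 3], [2, 3, 4]))
print(paired_t([10, 12, 14, 16], [9, 10, 13, 14]))
```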
F – test
Applications:
• Equality of two variances
• Equality of several means :
(Analysis of Variance- ANOVA)
Analysis of Variance (ANOVA)
A statistical procedure that determines
whether any difference exists among 3 or more
groups of subjects on one or more factors.
F- test is used in ANOVA.
χ2-test can be extended to 3 or more groups
when the outcome is categorical (counted).
When the outcome is numerical, means are
used, t-test can be used for comparison of 2
groups, and ANOVA can be used for
comparison among 3 or more groups.
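A one-way ANOVA compares the variation between group means with the variation within groups. This is a minimal sketch of the F statistic with hypothetical data for three groups; the function name is illustrative.

```python
from statistics import mean

def one_way_anova_f(groups):
    """F statistic with (between, within) degrees of freedom for one-way ANOVA."""
    k = len(groups)                           # number of groups
    all_values = [v for g in groups for v in g]
    n = len(all_values)                       # total number of observations
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between, df_within = k - 1, n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical measurements in three groups
f, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f, df_b, df_w)
```

The computed F is compared with the tabulated F value at (k - 1, n - k) degrees of freedom.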
Multivariate Analysis of Variance
(MANOVA)
An advanced statistical method that provides
a global test when there are multiple
dependent and independent variables, and the
independent variables are nominal.
It is a simple extension of univariate ANOVA
design.
If the results from MANOVA are statistically
significant, using multivariate statistic called
Wilks’ lambda, follow up ANOVAs may be
done to investigate the individual outcomes.
Problem:
A study was conducted to identify attitudinal
differences of nurses, nursing assistants and
residents (three groups) as barrier in effective
pain management . Information regarding
their beliefs about 12 components of chronic
pain management was collected.
Ans. The study involves 12 dependent (outcome) variables;
therefore, if ANOVA is used, 12 univariate
ANOVAs would be needed.
A single MANOVA is the right choice.
Analysis of Covariance (ANCOVA)
A special type of ANOVA or regression used to
control for the effect of a possible confounding
factor.
A confounding factor is a variable more likely
to be present in one group of subjects than the
other that is related to the outcome of interest,
and thus potentially confuses or confounds the
results.
Problem:
In a study, when BMI alone is used to predict
Insulin Sensitivity (IS) in hyperthyroid women, the
regression equation was
IS = 2.336 – 0.077*BMI
which means that for every unit increase in
BMI, IS is predicted to decrease by 0.077.
Age is a confounding factor that affects BMI
as well as IS. A way to control for the possible
confounding effect of age is to include that variable
in the regression equation .
The regression equation with age included is
IS = 2.291 – 0.0045*Age - 0.068* BMI
Using this equation, the women’s IS level is
predicted to decrease by 0.068 for every unit
increase in BMI.
Women's age    BMI    Predicted IS
30 years       25     0.456
60 years       25     0.321
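Predictions from the age-adjusted equation IS = 2.291 – 0.0045*Age – 0.068*BMI can be reproduced with a short function. This is a sketch; the function name and the ages evaluated are illustrative choices.

```python
def predicted_is(age, bmi):
    """Predicted insulin sensitivity from the age-adjusted regression equation."""
    return 2.291 - 0.0045 * age - 0.068 * bmi

# Same BMI, different ages: the difference is due to age alone
for age in (30, 60):
    print(age, 25, round(predicted_is(age, 25), 3))

# Same age, BMI one unit higher: predicted IS changes by the BMI coefficient
change_per_bmi_unit = predicted_is(60, 26) - predicted_is(60, 25)
print(round(change_per_bmi_unit, 3))
```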
Some Non-Parametric Tests
χ2 – test (Chi-square test)
Applications:
• Test of association / independence of
two attributes
• Test of goodness of fit
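A test of association for an r x c contingency table compares observed counts with the counts expected under independence. The sketch below computes the statistic Σ(O - E)²/E with (r - 1)(c - 1) degrees of freedom; the table is hypothetical.

```python
def chi_square(table):
    """Chi-square statistic and degrees of freedom for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # count expected under independence
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical 2 x 2 table: rows = exposed / not exposed, columns = diseased / healthy
chi2, df = chi_square([[10, 20], [20, 10]])
print(round(chi2, 3), df)
```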
Sign Median Test
A non-parametric test (NPT) used for testing a hypothesis about the median
in a single group.
Problem: Standard median energy consumption
for 2 year children is 1286 kcal. Such data for 94
children is recorded. Did children in the study
have median level of energy intake?
Soln. H0 : Median intake= 1286 kcal
H1 : Median intake≠ 1286 kcal
Use the test statistic
$$z = \frac{|x - n\pi| - 1/2}{\sqrt{n\pi(1 - \pi)}}$$
where π = 0.5 under H0
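The large-sample form of the sign test statistic, z = (|x - nπ| - 1/2) / √(nπ(1 - π)), can be sketched as follows. Here x is taken to be the number of children whose intake falls below the hypothesized median; the slides do not give this count, so the value 57 is purely hypothetical.

```python
import math

def sign_test_z(x, n, pi=0.5):
    """Normal approximation for the sign test, with continuity correction of 1/2."""
    return (abs(x - n * pi) - 0.5) / math.sqrt(n * pi * (1 - pi))

# n = 94 children as in the problem; x = 57 is a hypothetical count
z = sign_test_z(57, 94)
print(round(z, 2))
```

H0 is rejected at the 5% level if z exceeds 1.96.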
Mann-Whitney Test/ Wilcoxon Rank Sum
Test/ Mann-Whitney-Wilcoxon Test