
Workshop at BM Birla College of Nursing, Kolkata, West Bengal (MSc Nursing)
PPTs for the sessions by Dr. Indranil Saha (Professor & HOD, Community Medicine, IQ City Medical College, Durgapur)
Posted by Shubham Pandey, 14 Jul

Attachments (PowerPoint):
- Fundamental of Correlation & Regression - 4th Session (1).ppsx
- Fundamental of Correlation & Regression - 4th Session.ppsx
- Fundamental of Sample Size & Technique - 3rd Session.ppsx
- Fundamentals of study design - 2nd Session.ppsx

Page 1 of 30
Fundamentals of Correlation and Regression
Dr. Indranil Saha
Professor & HOD
Community Medicine
IQ City Medical College, Durgapur
Dr. Indranil Saha: B M Birla College of Nursing 16/07/2020

Page 2 of 30
Correlation
Correlation represents the relationship between two variables, such as weight and height, weight and cholesterol, or age and life expectancy.
It is also known as simple bivariate correlation (i.e., between two variables) or zero-order correlation.

Page 3 of 30
This relationship is displayed in a scatter plot (scatter diagram), which shows how closely the points lie in relation to a straight line.
There must be a logical relationship between the two variables.
A relationship between two variables can be established, but causation is not determined.
Page 4 of 30

Page 5 of 30
Positive correlation between SBP and weight

Page 6 of 30
Positive correlation between height and SBP

Page 7 of 30
Negative correlation between disease activity score (DAS) and HDL level

Page 8 of 30
No correlation between height and cholesterol

Page 9 of 30

Page 10 of 30
Assumptions for correlation
First, draw a scatter diagram to check for linearity and homoscedasticity (the variability in scores of one variable should be similar at all values of the other variable).
The scores should be evenly spread in a cigar-shaped cluster through which a straight line can be drawn.
The data set should be generated from a random sample.
If a curved line is found (suggesting a curvilinear relationship), the Pearson correlation coefficient cannot be calculated.

Page 11 of 30
r (Pearson’s)
Both variables are measured on an interval or ratio scale and are normally distributed.
ρ (rho) (Spearman’s)
One variable is on an ordinal scale and the other on an ordinal or higher scale.
Interpretation of the correlation coefficient:
When r is more than 0.7: high correlation.
When r is between 0.3 and 0.7: moderate correlation.
When r is less than 0.3: weak correlation.
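As a minimal sketch (not from the slides), both coefficients can be computed with SciPy; the paired weight and SBP values below are invented for illustration:

```python
import numpy as np
from scipy import stats

# hypothetical paired measurements: body weight (kg) and SBP (mmHg)
weight = np.array([52, 58, 61, 65, 70, 74, 80, 85])
sbp = np.array([110, 114, 118, 121, 126, 128, 135, 140])

# Pearson's r: both variables interval/ratio scale and normally distributed
r, p = stats.pearsonr(weight, sbp)

# Spearman's rho: at least one variable ordinal, or data non-normal
rho, p_rho = stats.spearmanr(weight, sbp)

print(f"r = {r:.2f} (P = {p:.4f}), rho = {rho:.2f}")
```

On this made-up, nearly linear data both coefficients fall in the "high correlation" band of the slide's cutoffs.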

Page 12 of 30
Evidence of normal distribution in a data set:
(i) Skewness and kurtosis
(ii) Kolmogorov-Smirnov test: a non-significant result (P value of more than 0.05) indicates that normality can be assumed
(iii) Shapiro-Wilk W test
(iv) Histogram
(v) Quantile-quantile (Q-Q) plot
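A hedged sketch of checks (ii) and (iii) in code, using SciPy on simulated SBP readings (the mean and SD are invented):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# simulated SBP readings drawn from a normal distribution
sbp = rng.normal(loc=120, scale=10, size=100)

# Shapiro-Wilk W test: a non-significant P (> 0.05) supports normality
w, p_sw = stats.shapiro(sbp)

# Kolmogorov-Smirnov test against a normal distribution with the sample's
# own mean and SD (an approximation; a Lilliefors correction is more exact)
d, p_ks = stats.kstest(sbp, "norm", args=(sbp.mean(), sbp.std()))

print(f"Shapiro-Wilk W = {w:.3f}, KS D = {d:.3f}")
```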

Page 13 of 30
Distribution of the blood pressure according to body weight in both sexes.
[Figure: scatter plots of blood pressure vs body weight by sex; mean values 120.34, 75.81, 116.63, 71.45; r1 = 0.77, r2 = 0.76, r3 = 0.76, r4 = 0.63; all P < 0.001]

Page 14 of 30
Distribution of the study subjects according to BDI score in different years of study.
Spearman’s correlation coefficient (between academic year and BDI score): ρ = -0.219, P = 0.003

Page 15 of 30
Correlation matrix

Page 16 of 30
Multiple correlation (R)
When there are two or more independent variables, the analysis of the relationship between the dependent and independent variables is known as multiple correlation (denoted by R).
Partial correlation
Partial correlation measures the relationship between two variables separately, in such a way that the effects of other related variables are eliminated.
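One way to sketch a partial correlation is to correlate the residuals left after regressing each variable on the control variable. All the data below are simulated, and the variable names are only illustrative:

```python
import numpy as np

def partial_corr(x, y, z):
    """Correlation between x and y with the linear effect of z removed."""
    zmat = np.column_stack([np.ones_like(z), z])
    # residuals of x and y after least-squares regression on z
    rx = x - zmat @ np.linalg.lstsq(zmat, x, rcond=None)[0]
    ry = y - zmat @ np.linalg.lstsq(zmat, y, rcond=None)[0]
    return np.corrcoef(rx, ry)[0, 1]

rng = np.random.default_rng(42)
age = rng.normal(50, 10, 200)             # the shared "third" variable
weight = age + rng.normal(0, 1, 200)      # driven mostly by age
sbp = age + rng.normal(0, 1, 200)         # also driven mostly by age

raw = np.corrcoef(weight, sbp)[0, 1]      # inflated by the shared variable
adj = partial_corr(weight, sbp, age)      # near zero once age is removed
print(f"raw r = {raw:.2f}, partial r = {adj:.2f}")
```

Because weight and SBP are related here only through age, the raw correlation is high while the partial correlation collapses toward zero.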

Page 17 of 30
Regression:
Correlation gives the degree and direction of the relationship between two variables, whereas regression analysis enables us to predict the values of one variable (dependent) on the basis of the other variable(s) (independent).
The regression coefficient measures the change in the dependent variable for one unit change in the independent variable.
Page 18 of 30
Dependent variable continuous
Example: SBP
Dependent variable discrete (dichotomous)
Example: hypertensive / normotensive

Page 19 of 30
Linear Regression
The dependent variable is quantitative (preferably continuous).
Linear regression is a parametric test and is based on a linear relationship between variables.
The correlation between the independent and dependent variables should be above 0.3.
The regression line is fitted by the method of least squares, and the resultant line is called the least-squares line.

Page 20 of 30
Assumptions for Linear Regression
The relationship between the two variables must be linear. In the scatter plot all the points may not fall exactly on the line; rather, they should be closely scattered around it.
r value above 0.3.
An extension of Pearson’s correlation coefficient.
The dependent variable must be quantitative (preferably continuous), on a ratio scale, and normally distributed.

Page 21 of 30
Assumptions for Linear Regression (continued)
Independent variables can be either qualitative or quantitative and may belong to any scale.
Large sample size.
Multicollinearity and singularity must be absent.
The data should be free of outliers.

Page 22 of 30
Simple Linear Regression
y (total Cholesterol level) = a + b (calorie intake)
One dependent variable like cholesterol level
One independent variable like calorie intake
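The least-squares fit for this slide's equation can be sketched with NumPy; the calorie and cholesterol numbers below are invented for illustration:

```python
import numpy as np

# hypothetical data: daily calorie intake vs total cholesterol (mg/dL)
calories = np.array([1800, 2000, 2200, 2500, 2800, 3000])
cholesterol = np.array([165, 172, 181, 190, 204, 210])

# fit y = a + b*x by the method of least squares
b, a = np.polyfit(calories, cholesterol, 1)

# predicted total cholesterol for a 2400 kcal intake
y_hat = a + b * 2400
print(f"cholesterol = {a:.1f} + {b:.4f} * calories; prediction at 2400 kcal: {y_hat:.1f}")
```

The positive slope b says predicted cholesterol rises by b mg/dL for each extra kilocalorie of intake, matching the slide's interpretation of a regression coefficient.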

Page 23 of 30
Multivariable linear regression
y = a + b·x1 + c·x2 + d·x3
y (total cholesterol level) = a + b (calorie intake) + c (physical activity) + d (BMI)
The regression coefficients (b, c, d) are also known as beta coefficients.
Page 24 of 30
Output:
The R-square value indicates the proportion of variance of the dependent variable that can be explained by the model.
The ANOVA table in the output indicates the statistical significance of the model.
The role of each individual independent variable in relation to the dependent variable is explained by its regression coefficient: unstandardized and standardized.
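A sketch of the multivariable model and its R-square on simulated data (all coefficients and values are invented; statistical software such as SPSS would also report the ANOVA table and standardized coefficients):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100

# simulated predictors
calories = rng.normal(2400, 300, n)
activity = rng.normal(30, 10, n)     # minutes of physical activity per day
bmi = rng.normal(25, 3, n)

# simulated outcome built from known coefficients plus noise
chol = 50 + 0.05 * calories - 0.8 * activity + 2.0 * bmi + rng.normal(0, 5, n)

# design matrix with an intercept column: y = a + b*x1 + c*x2 + d*x3
X = np.column_stack([np.ones(n), calories, activity, bmi])
beta, *_ = np.linalg.lstsq(X, chol, rcond=None)

# R-square: proportion of variance of y explained by the model
fitted = X @ beta
ss_res = np.sum((chol - fitted) ** 2)
ss_tot = np.sum((chol - chol.mean()) ** 2)
r2 = 1 - ss_res / ss_tot
print(f"coefficients = {np.round(beta, 3)}, R-square = {r2:.2f}")
```

Because the simulated signal is strong relative to the noise, the fitted coefficients recover the signs used to generate the data and the R-square is high.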

Page 25 of 30
Interpretation of a linear regression equation
Men aged 40 – 55 years:
DBP = 40 + 1.2 × age
Each additional year of age predicts a 1.2 mmHg rise in DBP; for example, for a 50-year-old man the predicted DBP = 40 + 1.2 × 50 = 100 mmHg.

Page 26 of 30
Logistic Regression
Used when the dependent variable is qualitative.
This qualitative dependent variable may be either dichotomous/binary (having two categories) or polychotomous (with more than two categories).
Logistic regression is an example of nonlinear regression.

Page 27 of 30
Assumptions for Logistic Regression
The dependent variable must be qualitative in nature, with dichotomous or polychotomous categories.
The independent variables can be either qualitative or quantitative and may belong to any scale.
Unlike linear regression, logistic regression can be performed directly, without a preliminary correlation analysis.

Page 28 of 30
Selection of independent variables is crucial, and one has to assess the fit of the model. Independent variables with a P value below 0.25 in the simple (univariable) regression model are typically entered into the multivariable model.
A large sample size is needed, as in linear regression.
Multicollinearity and outliers should be absent.

Page 29 of 30
Important output of a logistic regression model:
A significant Omnibus test and a non-significant Hosmer-Lemeshow test support a good fit of the model.
Cox & Snell R² and Nagelkerke R²: proportion of variation explained.
Classification table: percentage of outcomes correctly classified.
Adjusted odds ratio (AOR) / Exp(B): the role of each individual independent variable, greater or less than 1, with its 95% CI.
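As a hedged sketch of what Exp(B) means (simulated data; in practice SPSS or statsmodels would produce this output), the logistic coefficient can be estimated by maximizing the log-likelihood, and the odds ratio is its exponential:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
n = 300

# simulated data: age (years) predicting hypertension (1) vs normotension (0)
age = rng.uniform(30, 70, n)
p_true = 1 / (1 + np.exp(-(-6 + 0.12 * age)))   # invented true model
htn = rng.binomial(1, p_true)

X = np.column_stack([np.ones(n), age])

def neg_log_lik(beta):
    z = X @ beta
    # negative log-likelihood of the logistic model (numerically stable form)
    return -np.sum(htn * z - np.logaddexp(0, z))

res = minimize(neg_log_lik, x0=np.zeros(2))
b_age = res.x[1]
odds_ratio = np.exp(b_age)   # Exp(B): odds ratio per extra year of age
print(f"B = {b_age:.3f}, Exp(B) = {odds_ratio:.2f}")
```

An Exp(B) greater than 1 here means each extra year of age multiplies the odds of hypertension by that factor, which is how the AOR column of the output is read.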
Page 30 of 30
Thank you
