This document summarizes key points from a session by Dr. Saswata Ghosh on correlation analysis:
1) Correlation analysis can be used to analyze relationships between continuous variables like weight, height, and temperature, but does not prove causality.
2) The session analyzed a survey dataset of ever married women in West Bengal ages 15-49 to check for missing data and correlations between age and education.
3) Pearson's correlation showed a significant but low negative correlation between age and education at the 1% level. Partial correlation and regression can also be used to analyze relationships while controlling for other variables.
Session by: Dr. Saswata Ghosh. Asst. Professor, IDSK.
Correlation analysis can be applied to continuous variables (e.g. weight, height,
temperature), but it does not explain causality.
Data set (Saswata_New): Survey of people in West Bengal. (Sample: ever-married
women aged 15-49.)
First check the frequencies to ascertain that there is no missing data, i.e. no inconsistent or irrelevant values. (Note: in this case, the check was run on the variables age and education in single years.)
When values are missing, Percent does not match Valid Percent: missing values are excluded from Valid Percent, and Cumulative Percent is calculated from Valid Percent.
After that, run Pearson's correlation: Correlate -> Bivariate on the same two variables. It shows a negative (-ve) correlation that is significant at the 1% level, but the value is very low.
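
The two checks above can be sketched outside the menu interface. The following is a minimal illustration in Python using only the standard library; the function names and the toy values are assumptions made for this example and are not taken from the Saswata_New data set.

```python
import math
from collections import Counter


def frequency_table(values):
    """Build frequency rows; None marks a missing value.

    Each row is (value, count, percent, valid_percent, cumulative_percent).
    """
    total = len(values)
    valid = [v for v in values if v is not None]
    counts = Counter(valid)
    rows, cumulative = [], 0.0
    for value in sorted(counts):
        count = counts[value]
        percent = 100.0 * count / total             # missing cases stay in the base
        valid_percent = 100.0 * count / len(valid)  # missing cases are excluded
        cumulative += valid_percent                 # running sum of valid percent
        rows.append((value, count, percent, valid_percent, cumulative))
    return rows, total - len(valid)


def pearson_r(x, y):
    """Pearson's r over the pairs where both values are present."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical toy values, not taken from Saswata_New.
age = [20, 25, 25, 30, 35, 40, 45, None]
education = [10, 8, 9, 8, 5, 4, 2, 6]

rows, n_missing = frequency_table(age)
for value, count, pct, valid_pct, cum_pct in rows:
    print(f"{value:>4} {count:>3} {pct:6.1f} {valid_pct:6.1f} {cum_pct:6.1f}")
print("missing:", n_missing)
print("r =", round(pearson_r(age, education), 3))
```

With one missing age, the Percent column is computed over all eight cases and the Valid Percent column over the seven valid ones, so the two columns differ, which is the signal to look for in the frequency output.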
Partial correlation: Correlate -> Partial: enter the two variables of interest under Variables and
one controlling variable (which we keep constant for this analysis).
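
For a single controlling variable, the partial correlation can be computed from the three pairwise Pearson correlations with the standard first-order formula r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)). The sketch below assumes complete data (no missing values); the variable named control and all toy values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson's r for two equal-length lists with no missing values."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_r(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy, r_xz, r_yz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical toy values; "control" is an illustrative stand-in variable.
age = [20, 25, 25, 30, 35, 40, 45]
education = [10, 8, 9, 8, 5, 4, 2]
control = [1, 2, 1, 3, 2, 4, 3]

print("zero-order r :", round(pearson_r(age, education), 3))
print("partial r    :", round(partial_r(age, education, control), 3))
```

Comparing the zero-order and partial values shows how much of the association remains once the controlling variable is held constant.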
If the output does not mention a particular significance level, this indicates that the result
is valid at both levels of significance, i.e. 1% and 5%.
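
A result that is significant at the 1% level is necessarily significant at the 5% level too, and with a large sample even a very low correlation can pass both thresholds. The sketch below illustrates this with the usual test statistic t = r * sqrt(n - 2) / sqrt(1 - r^2). As an assumption made to keep the example within the standard library, the p-value uses a normal approximation to the t distribution, which is only reasonable for large n; the values of r and n are hypothetical.

```python
import math
from statistics import NormalDist


def r_significance(r, n):
    """Return (t, approximate two-sided p-value) for H0: no correlation.

    Uses a normal approximation to the t distribution (large-n assumption).
    """
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, p


# Hypothetical values: the same low correlation at two sample sizes.
for n in (50, 5000):
    t, p = r_significance(-0.10, n)
    print(f"n={n:>5}  t={t:7.3f}  p={p:.4g}  "
          f"sig. at 5%: {p < 0.05}  sig. at 1%: {p < 0.01}")
```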
Causality: regression. Correlation: degree of association. Correlation analysis is valid for linear relationships. For a large data set, the distribution is assumed to follow a normal distribution, in which case linearity more or less holds.
Otherwise we need to ascertain linearity first. (Ref: ANOVA methodology)
(Variable - mean) / SD = standardized value. Standardize to get a clearer inference, but in that case the result loses its original character. Recommendation: first use the unstandardized values for explanation; if difficulty arises, go for standardization.
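
The standardization formula above can be sketched as follows. This is a minimal example that assumes the sample standard deviation (n - 1 in the denominator); the toy values are hypothetical.

```python
from statistics import mean, stdev


def standardize(values):
    """Return z-scores: (value - mean) / SD, using the sample SD (n - 1)."""
    m, sd = mean(values), stdev(values)
    return [(v - m) / sd for v in values]


education = [10, 8, 9, 8, 5, 4, 2]  # hypothetical years of schooling
z_scores = standardize(education)
print([round(z, 2) for z in z_scores])
```

The standardized values have mean 0 and SD 1, so they are no longer expressed in years of schooling, which is the loss of original character noted above.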