something new to
experience!!!!
Always curious to learn!!!
We are here to learn those
interesting things!!!
What’s
now?
Today’s challenge is –
LOOKING FOR
RELATIONS AND
ASSOCIATIONS!!!
LOOKING FOR RELATIONS...
THINK
• Do you think a credit card has boosted your purchasing power?
a) Strongly Agree
b) Somewhat Agree
c) Neither Disagree nor Agree
d) Disagree
e) Strongly Disagree
• Do you feel you end up buying more when using a credit card?
a) Strongly Agree
b) Somewhat Agree
c) Neither Disagree nor Agree
d) Disagree
e) Strongly Disagree
What is the thing
you wish to ‘PUT
ON TEST’?
What to do …?
Research Project:
MEASURES OF
ASSOCIATION FOR
NOMINAL DATA...
MEASURES OF
ASSOCIATION
MEASURES OF ASSOCIATION FOR NOMINAL DATA
where $\phi^2 = \chi^2 / n$
Accounting                75     5    10    90
Finance                   25    20     5    50
Other                     20     5    35    60
Other (expected counts)   36     9    15    60
Count
                               ENTERTAINMENT
                               Go for        Watch movie    Dine with friends    Go out for    Any Other          Total
                               shopping in   in a theater   and family members   adventurous   (please specify)
                               malls etc                                         experience
MARITAL STATUS  MARRIED        80            40             10                   35            15                 180
                UNMARRIED      25            40             50                   25            5                  145
Total                          105           80             60                   60            20                 325
Symmetric Measures
                                   Value   Approx. Sig.
Nominal by Nominal   Phi           .426    .000
                     Cramer's V    .426    .000
N of Valid Cases                   325
a. Not assuming the null hypothesis.
b. Using the asymptotic standard error assuming the null hypothesis.

What can you say about the Association?
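The Phi and Cramer's V values in the output above can be checked by hand from the crosstab counts. The following is a minimal sketch, assuming the usual Pearson chi-square statistic; the function and variable names are illustrative and do not come from SPSS.

```python
import math

# Observed counts from the MARITAL STATUS x ENTERTAINMENT crosstab above
# (rows: MARRIED, UNMARRIED; columns: the five entertainment options).
observed = [
    [80, 40, 10, 35, 15],
    [25, 40, 50, 25, 5],
]

def chi_square(table):
    """Return (Pearson chi-square, n) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (obs - expected) ** 2 / expected
    return chi2, n

chi2, n = chi_square(observed)
phi = math.sqrt(chi2 / n)  # phi^2 = chi^2 / n

# Cramer's V divides by min(rows - 1, columns - 1); with two rows that is 1,
# which is why Phi and Cramer's V coincide in the output above.
k = min(len(observed), len(observed[0])) - 1
cramers_v = math.sqrt(chi2 / (n * k))

print(f"chi-square = {chi2:.3f}, phi = {phi:.3f}, Cramer's V = {cramers_v:.3f}")
```

Rounded to three decimals, both measures agree with the .426 reported in the Symmetric Measures table.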
SPSS Output …
GENDER * ENTERTAINMENT Crosstabulation
Count
                               ENTERTAINMENT
                               Go for        Watch movie    Dine with friends    Go out for    Any Other          Total
                               shopping in   in a theater   and family members   adventurous   (please specify)
                               malls etc                                         experience
GENDER          MALE           40            60             20                   75            25                 220
                FEMALE         45            25             10                   5             20                 105
Total                          85            85             30                   80            45                 325
Symmetric Measures
Count
                               ENTERTAINMENT
                               Go for        Watch movie    Dine with friends    Go out for    Any Other          Total
                               shopping in   in a theater   and family members   adventurous   (please specify)
                               malls etc                                         experience
OCCUPATION      STUDENTS       25            40             50                   15            5                  135
                HOMEMAKER      30            12             8                    2             5                  57
                SALARIED       25            20             10                   5             3                  63
                SELF-EMPLOYED  8             7              4                    1             3                  23
Total                          88            79             72                   23            16                 278
Symmetric Measures
MEASURES OF
ASSOCIATION FOR
ORDINAL DATA...
MEASURES OF ASSOCIATION…
ORDINAL DATA
• Measures of association between ordinal data are classified into two groups:
– Those based upon the concept of rank order correlation
– Those based upon the concepts of agreement/concordance or disagreement/discordance.
METHOD BASED UPON RANK ORDER CORRELATION
CONCEPT :
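• DISAGREEMENT/DISCORDANCE (D): the degree of disharmony/disagreement between two ranks. Two pairs (X1, Y1) and (X2, Y2) are said to be discordant if
– X1 > X2 and Y1 < Y2, OR
– X1 < X2 and Y1 > Y2
$\gamma = (C - D) / (C + D)$
where C is the number of concordant pairs and D is the number of discordant pairs.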
Can we estimate what is the DEGREE OF AGREEMENT between two tests?
Two tests were conducted to measure the LEADERSHIP TRAITS OF 10 EMPLOYEES. The data collected shows the number of traits possessed by an employee out of 20 traits.
Employees   Test #1   Test #2
1           10        11
2           12        15
3           13        14
4           14        14
5           15        14
6           10        13
7           8         9
8           9         9
9           12        10
10          15        15
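One way to put a number on the agreement between the two tests is to count concordant and discordant pairs of employees and form (C - D)/(C + D). The sketch below assumes that this concordance-based measure is wanted; the slides may equally intend a rank order correlation. Pairs tied on either test are counted as neither concordant nor discordant, and all names are illustrative.

```python
from itertools import combinations

# (Test #1, Test #2) scores for employees 1..10, copied from the table above.
scores = [(10, 11), (12, 15), (13, 14), (14, 14), (15, 14),
          (10, 13), (8, 9), (9, 9), (12, 10), (15, 15)]

def count_pairs(pairs):
    """Count concordant (C) and discordant (D) pairs; tied pairs are skipped."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(pairs, 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:      # both tests order the two employees the same way
            concordant += 1
        elif product < 0:    # the tests order the two employees oppositely
            discordant += 1
    return concordant, discordant

c, d = count_pairs(scores)
gamma = (c - d) / (c + d)
print(f"C = {c}, D = {d}, gamma = {gamma:.3f}")
```

A value near +1 means the two tests rank the employees in nearly the same order; a value near -1 means they rank them in nearly opposite order.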
Can we say, looking at the following table, that the higher the category of officers, the higher the degree of satisfaction?
Categories of Officers – I means Senior Post
Satisfaction   I      II     III    IV
High           40     60     52     48
Medium         103    87     82     88
Low            57     53     66     64
Revisiting the
Problem!!!
1. You prefer that your life partner’s family
should be financially sound.
2. Love marriage is better than ‘arranged
marriage’
Are respondents
consistent in
their responses?
SPSS Output …
Love marriage is better than 'arranged marriage'. * You prefer that your life partner's family should be financially sound. Crosstabulation
Count
                       You prefer that your life partner's family should be financially sound.
Love marriage is       STRONGLY   SOMEWHAT              SOMEWHAT   STRONGLY
better than            AGREE      AGREE      NEUTRAL    DISAGREE   DISAGREE   Total
'arranged marriage'.
STRONGLY AGREE         10         4          12         13         16         55
SOMEWHAT AGREE         15         6          22         8          25         76
NEUTRAL                20         12         23         8          15         78
SOMEWHAT DISAGREE      28         9          24         18         10         89
STRONGLY DISAGREE      22         14         25         17         5          83
Total                  95         45         106        64         71         381

Symmetric Measures
                              Value   Asymp. Std. Error(a)   Approx. T(b)   Approx. Sig.
Ordinal by Ordinal   Gamma    -.197   .049                   -3.991         .000
N of Valid Cases              381
a. Not assuming the null hypothesis.
b. Using the asymptotic standard error assuming the null hypothesis.

What can you CONCLUDE?
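The Gamma value in this output can be reproduced directly from the crosstab counts. The sketch below is one possible way to do it, assuming both variables are ordered from STRONGLY AGREE to STRONGLY DISAGREE; the function name is illustrative and is not an SPSS routine.

```python
# Counts from the crosstab above. Rows: "Love marriage is better than
# 'arranged marriage'"; columns: "life partner's family should be financially
# sound". Both run from STRONGLY AGREE to STRONGLY DISAGREE.
table = [
    [10, 4, 12, 13, 16],
    [15, 6, 22, 8, 25],
    [20, 12, 23, 8, 15],
    [28, 9, 24, 18, 10],
    [22, 14, 25, 17, 5],
]

def gamma_from_table(t):
    """Return (gamma, C, D) for a table whose rows and columns are ordered."""
    rows, cols = len(t), len(t[0])
    concordant = discordant = 0
    for i in range(rows):
        for j in range(cols):
            for k in range(i + 1, rows):      # only cells in lower rows
                for l in range(cols):
                    if l > j:                 # lower and to the right: concordant
                        concordant += t[i][j] * t[k][l]
                    elif l < j:               # lower and to the left: discordant
                        discordant += t[i][j] * t[k][l]
    return (concordant - discordant) / (concordant + discordant), concordant, discordant

gamma, c, d = gamma_from_table(table)
print(f"C = {c}, D = {d}, gamma = {gamma:.3f}")
```

Discordant pairs outnumber concordant ones, so gamma is negative: respondents who agree more with one statement tend to agree less with the other.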
THIRD,
MEASURES OF
ASSOCIATION FOR
INTERVAL & RATIO
SCALE DATA...
MEASURES OF ASSOCIATION FOR
INTERVAL AND RATIO SCALE DATA
The degree of correlation between two variables measured on interval and ratio scale can be measured through PEARSON'S CORRELATION COEFFICIENT, which is:
$$r_{xy} = \frac{N\sum XY - (\sum X)(\sum Y)}{\sqrt{\left[N\sum X^2 - (\sum X)^2\right]\left[N\sum Y^2 - (\sum Y)^2\right]}}$$
Value of r Possible Interpretation
0.90 - 1.00 Very Strong Association
0.70 - 0.90 Fairly Strong Association
0.40 - 0.70 Moderate Association
0.20 - 0.40 Weak Association
Less than 0.2 Negligible Association
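As an illustration of the raw-sums formula for r, the sketch below applies it to the Interest Rate and Futures Index data used in the regression example later in these slides (12 days, with day 12 excluded as an outlier). The function name is illustrative.

```python
import math

# Interest Rate (x) and Futures Index (y) for the 12 days kept in the
# regression example later in these slides (day 12 is left out as an outlier).
x = [7.43, 7.48, 8.00, 7.75, 7.60, 7.63, 7.68, 7.67, 7.59, 8.07, 8.03, 8.00]
y = [221, 222, 226, 225, 224, 223, 223, 226, 226, 235, 233, 241]

def pearson_r(xs, ys):
    """Pearson's r computed from raw sums of X, Y, XY, X^2 and Y^2."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(a * b for a, b in zip(xs, ys))
    sum_x2 = sum(a * a for a in xs)
    sum_y2 = sum(b * b for b in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

r = pearson_r(x, y)
print(f"r = {r:.4f}")
```

On the interpretation scale above, the result falls in the 0.70 - 0.90 band, that is, a fairly strong association.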
FOURTH,
MEASURES OF ASSOCIATION
FOR INTERVAL & RATIO
SCALE DATA AND NOMINAL
DATA...
Research Project:
What to do …?
“TV VIEWING HABITS AMONG
WOMEN IN NCR”
Directional Measures
                                                                                        Value
Nominal by Interval   Eta   STATUS OF WOMEN RESPONDENT Dependent                         .799
                            AVERAGE TV VIEWING HOUR PER DAY IN THE LAST WEEK Dependent   .617
$Y_i = \alpha + \beta X_i + e_i$
Dr. C. P. Gupta
DETERMINISTIC
vs.
STOCHASTIC MODEL
BASIC ASSUMPTIONS:
Zero Mean of the Disturbance: $E[e_i] = 0$ for all $i$;
Homoscedasticity: $\mathrm{Var}[e_i] = \sigma^2$, a constant for all $i$;
Non-autocorrelation: $\mathrm{Cov}[e_i, e_j] = 0$ if $i \neq j$;
Uncorrelatedness of regressor and disturbance: $\mathrm{Cov}[X_i, e_j] = 0$ for all $i$ and $j$;
Normality: $e_i \sim N(0, \sigma^2)$; and
Non-Stochastic Regressor: the value of $X_i$ is a known constant in the probability distribution of $Y_i$.
The parameters of the Classical Regression
Model are determined by LEAST SQUARES
METHOD.
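The estimate of $\beta$, say b, is:
$$b = \frac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_i (X_i - \bar{X})^2}$$
And the estimate of $\alpha$, say a, can be determined thus:
$$a = \bar{Y} - b\bar{X}$$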
Let’s do step-by-step Regression
Analysis …
Trying to establish a Relation between the
Interest Rates and Futures Index
Day Interest Rate Futures Index
1 7.43 221
2 7.48 222
3 8.00 226
4 7.75 225
5 7.60 224
6 7.63 223
7 7.68 223
8 7.67 226
9 7.59 226
10 8.07 235
11 8.03 233
12 7.25 325
13 8.00 241
Step No.#1: Do we have sufficient
evidence to fit a Linear Regression Model?
It is an OUTLIER!!!!!
Identify an outlier and remove it…
Day Interest Rate Futures Index
1 7.43 221
2 7.48 222
3 8.00 226
4 7.75 225
5 7.60 224
6 7.63 223
7 7.68 223
8 7.67 226
9 7.59 226
10 8.07 235
11 8.03 233
12 7.25 325 (the outlier)
13 8.00 241
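The slides spot the outlier by eye. As a hedged alternative, the sketch below flags it with the common 1.5 x IQR rule applied to the Futures Index values. This rule is a technique swapped in here for illustration and is not necessarily the method the slides use.

```python
import statistics

# Day numbers and Futures Index values copied from the table above.
days = list(range(1, 14))
futures_index = [221, 222, 226, 225, 224, 223, 223, 226, 226, 235, 233, 325, 241]

# Assumption: the 1.5 * IQR rule defines "outlier"; the slides do not name a rule.
q1, _, q3 = statistics.quantiles(futures_index, n=4)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outlier_days = [day for day, value in zip(days, futures_index)
                if value < lower_fence or value > upper_fence]
print(f"fences = ({lower_fence}, {upper_fence}), outlier days = {outlier_days}")
```

Under this rule only day 12 (Futures Index 325) falls outside the fences, which matches the observation dropped in the next table.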
Removing the outlier we get the final data
for Regression Analysis …
Day Interest Rate Futures Index
1 7.43 221
2 7.48 222
3 8.00 226
4 7.75 225
5 7.60 224
6 7.63 223
7 7.68 223
8 7.67 226
9 7.59 226
10 8.07 235
11 8.03 233
13 8.00 241
Using the Least Squares Method, we get …
Estimate of $\beta = \dfrac{\mathrm{Covariance}(x, y)}{\mathrm{Variance}(x)}$
Using a scientific calculator, one gets:
Covariance (Interest Rate and Futures Index) = 1.0180; and Variance of Interest Rate = 0.0462.
Therefore, the estimate of Beta is: 22.0307 (computed from the unrounded covariance and variance).
Using the Least Squares Method, we get …
Estimate of $\alpha = \bar{y} - \hat{\beta}\,\bar{x}$
[Figure: fitted line; x-axis: Interest Rate]
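The calculator steps above can be retraced in a few lines. This is a minimal sketch, assuming the covariance and variance are the population versions (divided by n), which is what reproduces the 1.0180 and 0.0462 quoted above; variable names are illustrative.

```python
# Interest Rate (x) and Futures Index (y) for the 12 days left after
# removing the outlier (day 12).
x = [7.43, 7.48, 8.00, 7.75, 7.60, 7.63, 7.68, 7.67, 7.59, 8.07, 8.03, 8.00]
y = [221, 222, 226, 225, 224, 223, 223, 226, 226, 235, 233, 241]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Assumption: population (divide-by-n) covariance and variance, as on a calculator.
cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
var_x = sum((a - mean_x) ** 2 for a in x) / n

beta_hat = cov_xy / var_x                 # slope estimate
alpha_hat = mean_y - beta_hat * mean_x    # intercept estimate

print(f"cov = {cov_xy:.4f}, var = {var_x:.4f}")
print(f"slope = {beta_hat:.4f}, intercept = {alpha_hat:.4f}")
```

The ratio is the same whether both quantities are divided by n or by n - 1, so the choice does not affect the slope.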
Will our story of Regression
Analysis end here?
NO!
We shall have a beginning of … a
NEW STORY!
Before we proceed further, we must ensure –
‘how best is our line of BEST FIT?’
[Figure: Futures Index (y-axis) against Interest Rate (x-axis)]
For that, we need a tool…
$R^2$ = R Square!!!!
The higher the value of $R^2$, the larger the share of variation explained, and hence the better the fit.
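To make "variation explained" concrete, the sketch below fits the line to the 12-day Interest Rate data and computes R Square as 1 - SSE/SST. It is a minimal illustration with illustrative variable names.

```python
# Interest Rate (x) and Futures Index (y), outlier (day 12) removed.
x = [7.43, 7.48, 8.00, 7.75, 7.60, 7.63, 7.68, 7.67, 7.59, 8.07, 8.03, 8.00]
y = [221, 222, 226, 225, 224, 223, 223, 226, 226, 235, 233, 241]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least squares slope and intercept.
sxx = sum((a - mean_x) ** 2 for a in x)
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

fitted = [intercept + slope * a for a in x]
sst = sum((b - mean_y) ** 2 for b in y)              # total variation
sse = sum((b - f) ** 2 for b, f in zip(y, fitted))   # unexplained variation
r_squared = 1 - sse / sst

print(f"SST = {sst:.4f}, SSE = {sse:.4f}, R Square = {r_squared:.4f}")
```

Roughly two thirds of the variation in the Futures Index is explained by the Interest Rate in this sample.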
It is good that R2 can tell us about the GOODNESS of FIT. But how can I believe that whatever is explained is statistically significant?!
For that, one can use ANOVA!!!!!!
But, why ANOVA in Regression?
ANALYSIS OF VARIANCE
ANOVA TABLE
Sources of Variation   Variation   Degrees of Freedom   Mean Square       F-Ratio
Regression             SSR         K                    SSR/K             Ratio of Mean Squares
Residuals              SSE         n - (K+1)            SSE/(n-(K+1))
Total                  SST         n - 1                SST/(n-1)
Summarizing…
Evaluating the FIT of the Regression!
ANOVA
df SS MS F Significance F
Regression 1 269.1232 269.1 19.82 0.0012
Residual 10 135.7934 13.58
Total 11 404.9167
Coefficients Standard Error t Stat P-value
Intercept 56.4740 38.3384 1.473 0.172
Interest Rate 22.0307 4.9487 4.452 0.001
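The ANOVA figures above follow from the sums of squares of the fitted line. The sketch below recomputes SSR, SSE, the F ratio and the slope's standard error and t statistic for the 12-day data, assuming one regressor (K = 1). P-values are omitted because they need the F and t distributions; names are illustrative.

```python
import math

# Interest Rate (x) and Futures Index (y), outlier (day 12) removed.
x = [7.43, 7.48, 8.00, 7.75, 7.60, 7.63, 7.68, 7.67, 7.59, 8.07, 8.03, 8.00]
y = [221, 222, 226, 225, 224, 223, 223, 226, 226, 235, 233, 241]

n, k = len(x), 1                      # k = number of regressors
mean_x = sum(x) / n
mean_y = sum(y) / n
sxx = sum((a - mean_x) ** 2 for a in x)
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

sst = sum((b - mean_y) ** 2 for b in y)
sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
ssr = sst - sse

msr = ssr / k                         # mean square, regression
mse = sse / (n - (k + 1))             # mean square, residuals
f_ratio = msr / mse

se_slope = math.sqrt(mse / sxx)       # standard error of the slope
t_slope = slope / se_slope

print(f"SSR = {ssr:.4f}, SSE = {sse:.4f}, F = {f_ratio:.2f}")
print(f"SE(slope) = {se_slope:.4f}, t = {t_slope:.3f}")
```

With a single regressor the F ratio equals the square of the slope's t statistic, which is why both carry the same significance level in the output.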
Once we get the Regression Line and
assuming that it is the BEST FITTED LINE,
Then WHAT?
Where to go?
One can use Regression Analysis for …
One, establishing a relation between the variables and estimating the values.
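Estimating a value means plugging a new X into the fitted line. The sketch below uses the Intercept and Interest Rate coefficients from the output shown earlier; the interest rate of 7.80 is a hypothetical input chosen for illustration and does not appear in the data.

```python
# Coefficients taken from the regression output for the Interest Rate example.
INTERCEPT = 56.4740
SLOPE = 22.0307

def predict_futures_index(interest_rate):
    """Estimated Futures Index for a given Interest Rate, from the fitted line."""
    return INTERCEPT + SLOPE * interest_rate

# Hypothetical input (an assumption): 7.80 lies inside the observed range 7.43 to 8.07.
estimate = predict_futures_index(7.80)
print(f"Estimated Futures Index at 7.80 = {estimate:.3f}")
```

Estimates are most trustworthy for X values inside the range of the data used to fit the line.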
ANOVA
df SS MS F Significance F
Regression 1 1384.805684 1384.81 3.89541 0.063970444
Residual 18 6398.944316 355.497
Total 19 7783.75
Coefficients Standard Error t Stat P-value
Intercept 98.6206 18.1639 5.42948 3.7E-05
Age(Years) -4.0081 2.0308 -1.9737 0.06397
EXCEL OUTPUT……
SUMMARY OUTPUT
Regression Statistics
Multiple R          0.7262
R Square            0.5274
Adjusted R Square   0.5012
Standard Error      2.0738
Observations        20

1. What is the Regression Line?
2. How well does the Regression Line fit the data?
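As a quick consistency check on the Regression Statistics, R Square should equal Multiple R squared, and Adjusted R Square follows from R Square, the number of observations and the number of regressors. The sketch assumes a single regressor (k = 1), which the output does not state explicitly.

```python
# Values read from the SUMMARY OUTPUT above.
multiple_r = 0.7262
observations = 20
k = 1  # assumption: one regressor; the output does not say so explicitly

r_square = multiple_r ** 2
adjusted_r_square = 1 - (1 - r_square) * (observations - 1) / (observations - k - 1)

print(f"R Square = {r_square:.4f}, Adjusted R Square = {adjusted_r_square:.4f}")
```

Both values agree with the reported 0.5274 and 0.5012 up to rounding of Multiple R, which supports the one-regressor assumption.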