R2 measures the fit of the model: how much of the variation in the dependent variable the model
explains. As a rule of thumb, any R2 above 0.5 is considered a good fit.
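As a rough illustration of what R2 measures, the sketch below computes it as 1 minus the ratio of unexplained variation to total variation. The function name and the data values are hypothetical and are not taken from the output discussed in these notes.

```python
def r_squared(actual, predicted):
    """Share of the variation in `actual` that the predictions explain."""
    mean_y = sum(actual) / len(actual)
    # Total variation of the dependent variable around its mean
    ss_total = sum((y - mean_y) ** 2 for y in actual)
    # Variation left unexplained by the model (squared residuals)
    ss_resid = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    return 1 - ss_resid / ss_total

# Hypothetical values, for illustration only
actual = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 14.0, 15.0]
r2 = r_squared(actual, predicted)
good_fit = r2 > 0.5  # the rule of thumb used in these notes
```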
Durbin Watson (DW) tests for autocorrelation in the data, that is, correlation between the error
terms. This is an undesirable property in regression because autocorrelation makes the standard
errors and significance tests of the regression estimates unreliable, which can lead to wrong conclusions.
Autocorrelation is checked with DW, and a value around 2 is preferred. A DW below 2 indicates
positive autocorrelation and a DW above 2 indicates negative autocorrelation. A DW of 2 indicates
there is no autocorrelation. DW always lies between 0 and 4.
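To see why DW falls below or above 2, here is a minimal sketch of the statistic: the sum of squared differences between neighbouring residuals divided by the sum of squared residuals. The two residual series are made up for illustration.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences over sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residuals: neighbours mostly share a sign (positive autocorrelation)
positive = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]
# Hypothetical residuals: the sign flips every step (negative autocorrelation)
negative = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

dw_pos = durbin_watson(positive)  # falls below 2
dw_neg = durbin_watson(negative)  # falls above 2
```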
ANOVA
The H0 for the ANOVA test is: The model is not a good fit
We do not have an F value table here, so we look at the significance (p) value instead: if it is
below the chosen significance level (commonly 0.05), we reject H0.
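Collinearity Statistics
To test whether there is multicollinearity in the data, that is, whether the independent variables
are strongly correlated with one another. This is bad for estimates and predictions because it
inflates the standard errors and makes the coefficients unstable. If the value of VIF is greater
than 10, the data has multicollinearity.
Heteroskedasticity (HSK) is a separate problem: the residuals are cone shaped, that is, their spread
widens or narrows across the data. It is checked on a residual plot, not with VIF.
Tolerance = 1/VIF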
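The VIF of a predictor is 1/(1 - R2), where R2 comes from regressing that predictor on the other predictors. With only two predictors, that R2 is simply their squared correlation, which keeps the sketch below short. The helper names and data are hypothetical.

```python
def pearson_r(x, y):
    """Pearson correlation between two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two predictors (assumption of this sketch)."""
    r2 = pearson_r(x1, x2) ** 2
    return 1 / (1 - r2)

# Hypothetical predictor values, for illustration only
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]

vif = vif_two_predictors(x1, x2)
tolerance = 1 / vif          # Tolerance = 1/VIF
problem = vif > 10           # the cut-off used in these notes
```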
Regression Equation
Y = a + b1X1 + b2X2 + b3X3 + e
Salary = 17.143 + 1.589 * (No of Years) + 3.643 * (Gender) - 0.89 * (Gender * No of Years) + e
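A small worked example may help in reading the fitted equation. The sketch below plugs values into it, assuming Gender is a 0/1 dummy variable (the notes do not state the coding) and leaving out the error term. The function name is hypothetical.

```python
def predicted_salary(years, gender):
    """Fitted salary from the equation above; gender is assumed to be coded 0 or 1."""
    return 17.143 + 1.589 * years + 3.643 * gender - 0.89 * (gender * years)

# Same experience, different gender code
salary_g0 = predicted_salary(years=10, gender=0)
salary_g1 = predicted_salary(years=10, gender=1)

# The interaction term changes the slope on years for the group coded 1
slope_g0 = 1.589
slope_g1 = 1.589 - 0.89
```

Under that coding, the group coded 1 starts 3.643 higher at zero years but gains less per additional year, because the interaction coefficient is negative.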
Significance
Constant
H0: The constant is not significant
P value test is
Constant is significant
Gender is significant
FACTOR ANALYSIS
KMO Test
This checks the adequacy of the data for factor analysis. If the value of KMO is above 0.5,
we can proceed with the factor analysis.
Bartlett's test of sphericity is reported alongside KMO:
H0: The correlation matrix is an identity matrix, so we cannot proceed with factor analysis
Ha: The correlation matrix is not an identity matrix, so we can proceed with factor analysis
P value test
Communalities
A communality is the extent to which an item correlates with all other items, that is, the share of
the item's variance explained by the extracted factors. Higher communalities are better. If the
communality of a particular variable is low (between 0.0 and 0.4), that variable may struggle to
load significantly on any factor.
No communalities are below 0.4, so the extraction results will be good.
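For uncorrelated (orthogonal) factors, a variable's communality can be computed as the sum of its squared loadings across the extracted factors. The sketch below applies that and flags anything under the 0.4 cut-off. The item names and loadings are hypothetical.

```python
def communality(loadings):
    """Sum of squared loadings across the extracted factors (orthogonal case)."""
    return sum(value ** 2 for value in loadings)

# Hypothetical loadings on two extracted factors, for illustration only
loadings_by_item = {
    "item_a": [0.8, 0.3],
    "item_b": [0.2, 0.1],
}

communalities = {
    item: communality(values) for item, values in loadings_by_item.items()
}
# Items below the 0.4 cut-off used in these notes
low_items = [item for item, h2 in communalities.items() if h2 < 0.4]
```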
Factor Extraction
Any factor with an eigenvalue greater than 1 is extracted. The total variance explained by the
extracted factors is shown in the last column, last row, and the variance explained by each
individual factor is in the second-to-last column, one row per factor.
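The eigenvalue rule can be sketched as follows. It assumes the analysis is run on a correlation matrix, so the eigenvalues sum to the number of variables and each factor's share of variance is its eigenvalue divided by that number. The eigenvalues are hypothetical.

```python
# Hypothetical eigenvalues for 4 variables (they sum to 4), for illustration only
eigenvalues = [2.0, 1.2, 0.5, 0.3]
n_variables = len(eigenvalues)

# Keep only the factors with an eigenvalue greater than 1
extracted = [ev for ev in eigenvalues if ev > 1]

# Percentage of variance explained by each extracted factor
pct_variance = [100 * ev / n_variables for ev in extracted]

# Cumulative percentage, the figure read from the last row of the table
total_pct = sum(pct_variance)
```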
Scree Plot
Wherever there is a tight bend (elbow) in the plot, you stop extracting. Here no tight bend can be
recognized, so we use the eigenvalue, shown on the y axis, and extract every factor with an
eigenvalue > 1. We are therefore extracting 6 factors here.
Communalities are reported as h2. Rotated component matrix values are given as p1, p2, etc.,
and f1, f2 stand for factor 1, factor 2.
A variable is assigned to a factor when its loading is greater than 0.5 in absolute value (take
the modulus).
2nd factor – engine capacity, fuel efficiency, running and maintenance cost, economical
3rd factor – brand name, after sales service, performance information available
4th factor – purpose of purchase, car image and positioning, advertising and marketing
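The grouping of variables into factors can be sketched as a threshold on the absolute loadings. The variable names are borrowed from the lists above, but the loading values are invented for illustration and are not the values from the actual rotated component matrix.

```python
THRESHOLD = 0.5  # cut-off on the absolute loading, as used in these notes

# Hypothetical rotated loadings on factors 1 to 3, for illustration only
rotated = {
    "engine capacity": [0.10, 0.72, 0.05],
    "fuel efficiency": [0.20, -0.65, 0.15],
    "brand name": [0.05, 0.12, 0.81],
}

def assign_to_factors(loadings_by_variable, threshold):
    """Group variables under each factor where |loading| exceeds the threshold."""
    groups = {}
    for variable, loadings in loadings_by_variable.items():
        for factor_number, loading in enumerate(loadings, start=1):
            if abs(loading) > threshold:  # modulus, so negative loadings count
                groups.setdefault(factor_number, []).append(variable)
    return groups

groups = assign_to_factors(rotated, THRESHOLD)
```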
Latent Constructs
The names you give to each of the extracted factors.
DISCRIMINANT ANALYSIS
Box's M Test
H0: The covariance matrices of the three types of flowers are the same
Ha: The covariance matrices of the three types of flowers are different
P test value
Wilks' Lambda
H0: The 4 variables (PL, PW, SL, SW) have no discriminating power
Ha: The 4 variables (PL, PW, SL, SW) have discriminating power
P test here
1st Function
D1 = 0.608PW + 0.887PL - 0.518SW - 0.359SL
2nd Function
D2 = 0.581PW - 0.431PL + 0.695SW + 0.093SL
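To show how the two functions are used, the sketch below computes both discriminant scores for one observation. It reads the first coefficient of D2 as 0.581, which is an assumption about a missing decimal point, and it assumes the inputs are on whatever scale the coefficients were estimated for (for example standardized values). The observation is hypothetical.

```python
def d1(pw, pl, sw, sl):
    """First discriminant function, coefficients as given in the notes."""
    return 0.608 * pw + 0.887 * pl - 0.518 * sw - 0.359 * sl

def d2(pw, pl, sw, sl):
    """Second discriminant function; 0.581 is an assumed reading of '0581'."""
    return 0.581 * pw - 0.431 * pl + 0.695 * sw + 0.093 * sl

# Hypothetical observation, for illustration only
observation = {"pw": 1.0, "pl": 1.0, "sw": 0.0, "sl": 0.0}

score_1 = d1(**observation)
score_2 = d2(**observation)
```

Each flower gets one score per function, and the groups are then separated on those scores rather than on the four original variables.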
1st Function
2nd Function
Function 1 explains more of the variance between the three types of flowers, so function 1 is the
better discriminant function.