Synopsis - QM2 SAS Outputs
The multiple regression model is

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p + \varepsilon$$

and the overall hypothesis tested by the ANOVA F-test is

$$H_0: \beta_1 = \beta_2 = \cdots = \beta_p = 0$$
Source               DF          Sum of Squares   Mean Square   F Value   Pr > F
Model (Regression)   p           A                D             F         <.0001
Error                n - p - 1   B                E
Total                n - 1       C
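Reading the placeholders above: each Mean Square is the corresponding Sum of Squares divided by its degrees of freedom, and the F value is the ratio of the two mean squares:

$$D = \frac{A}{p}, \qquad E = \frac{B}{n - p - 1}, \qquad F = \frac{D}{E}, \qquad C = A + B$$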
Interpretation of the p-value in ANOVA: if the p-value (i.e. Pr > F) is smaller than the level of significance (α), then we can reject H0. In other words, the model is significant, or there is a significant relationship between the dependent variable (response variable) and the set of independent variables (predictors / explanatory variables).
Note that the F-test is used to test the significance of the overall relationship (i.e. the model), while the t-test is used to test the individual influence of an independent variable on the dependent variable (i.e. to test the significance of the individual betas).
Interpretation of R-Square: a value close to 1 indicates a good fit of the model. Adjusted R-Square is preferred to R-Square because Adjusted R-Square penalizes the model for having too many independent variables. Even when R-Square is high, the significance of the model must still be tested with the F-test before concluding on the overall fitness of the model to the given data.
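As an illustration of these quantities, a minimal sketch (synthetic data; not part of the SAS output) fitting a multiple regression and reading off the overall F-test, its p-value, R-Square, Adjusted R-Square, and the individual t-tests:

```python
# Minimal sketch: fit y on x1, x2 and inspect the statistics discussed above.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 2.0 + 3.0 * x1 + 0.5 * x2 + rng.normal(size=n)   # hypothetical data

X = sm.add_constant(np.column_stack([x1, x2]))        # adds the intercept
fit = sm.OLS(y, X).fit()

print(fit.fvalue, fit.f_pvalue)        # overall F-test of H0: all betas = 0
print(fit.rsquared, fit.rsquared_adj)  # R-Square and Adjusted R-Square
print(fit.tvalues, fit.pvalues)        # t-tests for the individual betas
```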
The tolerance of the kth predictor is

$$\text{Tolerance}_k = 1 - R_k^2$$

where $R_k^2$ is the coefficient of determination for the regression of the kth predictor variable on the remaining (p - 1) explanatory variables (treating the kth explanatory variable as dependent on the remaining explanatory variables).
$$\text{VIF}_k = \frac{1}{\text{Tolerance}_k} = \frac{1}{1 - R_k^2}$$
Interpretation: if $R_k^2$ is large, then the tolerance is very small, which implies that VIF is very large. Hence, the kth variable is correlated with the other predictors when its VIF is large. As a rule of thumb, we say there is significant multicollinearity due to a variable if its VIF is larger than 10 (i.e. its tolerance is less than 0.10, which implies $R_k^2$ is larger than 0.90).
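A minimal sketch of these definitions (assumed, not part of the SAS output): tolerance and VIF computed directly by regressing each predictor on the others.

```python
# Tolerance and VIF for each column of a predictor matrix.
import numpy as np

def tolerance_and_vif(X):
    """X: (n, p) matrix of predictors (no intercept column)."""
    n, p = X.shape
    out = []
    for k in range(p):
        yk = X[:, k]
        others = np.delete(X, k, axis=1)
        Z = np.column_stack([np.ones(n), others])   # intercept + the rest
        beta, *_ = np.linalg.lstsq(Z, yk, rcond=None)
        resid = yk - Z @ beta
        r2 = 1.0 - resid.var() / yk.var()           # R_k^2
        tol = 1.0 - r2
        out.append((tol, 1.0 / tol))                # (tolerance, VIF)
    return out
```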
Example:
Parameter Estimates

Variable    DF   Parameter   Standard   t Value   Pr > |t|   Standardized   Tolerance   Variance
                 Estimate    Error                           Estimate                   Inflation
Intercept    1   -8.62347    5.90982     -1.46    0.1785     0              .           0
x2           1    0.09251    0.03912      2.36    0.0423     0.24063        0.98511     1.01511
x1           1    3.24751    0.36993      8.78    <.0001     0.89323        0.98511     1.01511
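As a quick sanity check on the table (using only values printed above): each t value is the estimate divided by its standard error, and VIF is the reciprocal of tolerance.

```python
# Relationships among the columns of the table above.
print(3.24751 / 0.36993)   # ~8.78, the t value for x1
print(0.09251 / 0.03912)   # ~2.36, the t value for x2
print(1 / 0.98511)         # ~1.01511, the VIF for both predictors
```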
Step 1: Test the discriminating ability of each independent variable using an F-test. Here we test the equality of the means of each independent variable across the two groups.
Example:
We note that all variables except Complaint Resolution and Warranty & Claims have means that are significantly different between the two groups. That is, there exists a significant difference in the means of Product Quality in group 1 and group 2, and similarly for the other significant variables. Hence, Product Quality, Advertising, Sales force Image and Competitive Pricing can be used to discriminate between the two groups, while Complaint Resolution and Warranty & Claims need to be dropped from the Discriminant model. Also note that each row in the above table is a summary of the ANOVA table for the respective variable.
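A hedged sketch of Step 1 (group values are hypothetical): a univariate F-test of the equality of group means for one candidate predictor.

```python
# One-way ANOVA F-test per variable: does its mean differ across groups?
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(1)
group1 = rng.normal(7.0, 1.0, 40)   # e.g. Product Quality in group 1
group2 = rng.normal(8.2, 1.0, 60)   # e.g. Product Quality in group 2

F, p = f_oneway(group1, group2)
print(F, p)   # small p => means differ => the variable discriminates
```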
This is very similar to the ANOVA of a regression model. A small Wilks' Lambda indicates the significance of the Discriminant model; equivalently, if the p-value of Wilks' Lambda is very small, the Discriminant model is significant.
Example:
In the above table we observe that Wilks' Lambda is significant. Therefore, the Discriminant model we developed is useful in discriminating between the groups.
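A minimal sketch of the statistic itself (assumed, not from the SAS output): Wilks' Lambda is the ratio of the determinant of the pooled within-group SSCP matrix to that of the total SSCP matrix.

```python
# Wilks' Lambda = det(W) / det(T); values near 0 favour the model.
import numpy as np

def wilks_lambda(X, groups):
    """X: (n, p) data matrix; groups: length-n array of group labels."""
    centered = X - X.mean(axis=0)
    T = centered.T @ centered                   # total SSCP
    W = np.zeros_like(T)
    for g in np.unique(groups):
        Xg = X[groups == g]
        Cg = Xg - Xg.mean(axis=0)
        W += Cg.T @ Cg                          # pooled within-group SSCP
    return np.linalg.det(W) / np.linalg.det(T)
```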
We classify an observation by supplying the values of the independent variables as inputs to the Discriminant functions. If we have two groups, then we have two Discriminant functions. By comparing the two values, we classify the new observation into the group whose Discriminant function yields the higher value.
Example:
We have a Discriminant function for each of the two groups, National Brand and Private Label. A new observation with values on Product Quality, Advertising, Sales force Image and Competitive Pricing is supplied to the two functions. If the Discriminant function for Private Label yields a higher classification score than the one for National Brand, then we classify the new observation into Private Label.
The performance of a Discriminant function is evaluated using the Hit Ratio, the proportion of correct classifications.
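A hedged sketch of this classification rule (the coefficient values below are hypothetical, not taken from the output): score a new observation with each group's linear classification function, pick the higher score, and compute the Hit Ratio on labelled data.

```python
# Two linear classification functions; higher score wins.
import numpy as np

scores = {
    # One row of coefficients per group: [intercept, b1, b2, b3, b4] (assumed)
    "National Brand": np.array([-30.1, 2.1, 1.4, 0.9, 1.7]),
    "Private Label":  np.array([-25.4, 1.2, 0.8, 2.6, 2.2]),
}

def classify(x):
    """x: the four predictor values for one observation."""
    z = np.concatenate([[1.0], x])
    return max(scores, key=lambda g: scores[g] @ z)

def hit_ratio(X, labels):
    hits = sum(classify(x) == lab for x, lab in zip(X, labels))
    return hits / len(labels)      # proportion of correct classifications
```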
Example:
The correlation matrix of the variables gives a rough idea of whether the variables are factorable; high correlations among the variables are expected. We use the Kaiser-Meyer-Olkin Measure of Sampling Adequacy (KMO) and Bartlett's test of sphericity to confirm the applicability of Factor Analysis, as they test the strength of the relationship among the variables. Interpretive adjectives for the KMO measure are: in the 0.90s marvellous, in the 0.80s meritorious, in the 0.70s middling, in the 0.60s mediocre, in the 0.50s miserable, and below 0.50 unacceptable. If both test statistics are significant, then we can proceed to Factor Analysis.
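A minimal sketch of Bartlett's test of sphericity (the standard chi-square statistic; assumed, not copied from the output). It tests H0: the correlation matrix is an identity, i.e. the variables are uncorrelated and hence not factorable.

```python
# Bartlett's test of sphericity on a data matrix.
import numpy as np
from scipy.stats import chi2

def bartlett_sphericity(X):
    """X: (n, p) data matrix; returns (statistic, p-value)."""
    n, p = X.shape
    R = np.corrcoef(X, rowvar=False)
    stat = -(n - 1 - (2 * p + 5) / 6.0) * np.log(np.linalg.det(R))
    df = p * (p - 1) / 2
    return stat, chi2.sf(stat, df)   # small p-value => proceed to FA
```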
Example:
Communality estimates for the 13 variables:
0.712 0.748 0.511 0.588 0.484 0.725 0.512 0.473 0.730 0.645 0.758 0.429 0.414
'Communalities' tell us how much of the variance in each of the original variables is explained by the extracted factors. Higher communalities are desirable. If the communality for a variable is less than 50%, it is a candidate for exclusion from the analysis, because the factor solution contains less than half of the variance in the original variable, and the explanatory power of that variable might be better represented by the individual variable. If we exclude a variable for a low communality (less than 0.50), we should re-run the factor analysis without that variable before proceeding.
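A small sketch of this screening rule applied to the communality estimates listed above (assuming the values are listed in variable order X1 to X13):

```python
# Flag variables whose communality falls below the 0.50 cutoff.
communalities = [0.712, 0.748, 0.511, 0.588, 0.484, 0.725, 0.512,
                 0.473, 0.730, 0.645, 0.758, 0.429, 0.414]

low = [f"X{i + 1}" for i, c in enumerate(communalities) if c < 0.50]
print(low)   # candidates for exclusion before re-running the analysis
```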
Example:
The above values, 3.246307, 2.103337, 1.644186, 1.185304 and 1.102832, are also the first five eigenvalues extracted by the principal component method. Note that mathematically the number of factors equals the number of variables (in this example we have 13 variables); however, we analyse only the few factors selected by the criterion of minimum eigenvalue larger than 1, or by the scree plot.
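A minimal sketch of this selection rule (assumed, not part of the SAS output): the eigenvalues of the correlation matrix, filtered by the Kaiser criterion (keep eigenvalues larger than 1).

```python
# Kaiser criterion: retain factors whose eigenvalue exceeds 1.
import numpy as np

def kaiser_factors(X):
    """X: (n, p) data matrix; returns kept eigenvalues and variance share."""
    R = np.corrcoef(X, rowvar=False)
    eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]   # descending order
    kept = eigvals[eigvals > 1.0]
    explained = kept.sum() / len(eigvals)    # proportion of total variance
    return kept, explained
```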
In the above table, the variance explained by the first factor is 24.97% $\left(= \frac{3.2463}{13} \times 100\right)$, and the total variance explained by the five factors is 71.4% $\left(= \frac{3.2463 + 2.1033 + 1.6442 + 1.1853 + 1.1028}{13} \times 100\right)$. This can also be obtained from the following table. Note that the cumulative value in row 5 of the table below is 0.714 or 71.4%, and the variance explained by the first factor is 25%, as the proportion in the first row is 0.25, which is close to the value we computed above, 24.97%.
From the table below on the initial factor solution, we note that the communality of variable X1 (i.e. the total variance captured by the five factors in variable X1) is 0.860643, or 86% (the sum of squares of the first row, i.e. the row for variable X1, in the table below). Further, if we add the communalities of all the variables we get 9.281 (i.e. 0.860643 + ...).
Also note that each entry in the table below is a factor loading, which is the correlation between a variable and a factor. For example, as the first four variables are highly correlated with Factor1 (0.888, 0.788, 0.774 and 0.770), they are affected strongly by Factor1. Also note that the correlation between any two factors is always zero, as we extract orthogonal factors (the other method being oblique rotation).
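The communality arithmetic above is easy to reproduce from the loadings; a sketch (the loadings matrix is whatever the factor output provides):

```python
# Communality of each variable = sum of its squared loadings across the
# extracted factors; orthogonality of the factors lets these add cleanly.
import numpy as np

def communalities(loadings):
    """loadings: (p, k) matrix, one row per variable, one column per factor."""
    return (loadings ** 2).sum(axis=1)   # e.g. ~0.860643 for X1 above

# Total communality = sum over all variables, which equals the sum of the
# retained eigenvalues (about 9.281 in the example discussed above).
```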
Note the cross-loadings of variables on different factors. For example, variable X5 has loadings on Factor1, Factor2 and Factor5, while variable X13 has high loadings on Factor4 and Factor5. This leads to a dilemma about the right group of variables under a factor, which can be resolved to a great extent by factor rotation.
The idea of rotation is to reduce the number of factors on which the variables under investigation have high loadings (i.e. cross-loadings). Rotation does not actually change anything, but it makes the interpretation of the analysis easier. In other words, rotation helps us to classify each variable under a single factor with much more ease.
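As an illustration, a sketch of varimax, one common orthogonal rotation (the document does not state which rotation the output used, so this is an assumption): it rotates the loading matrix to concentrate each variable's loading on as few factors as possible, without changing the communalities.

```python
# Varimax rotation of a (p, k) loading matrix via the standard SVD iteration.
import numpy as np

def varimax(L, n_iter=100, tol=1e-6):
    """L: (p, k) initial loadings; returns the rotated loadings."""
    p, k = L.shape
    R = np.eye(k)
    d = 0.0
    for _ in range(n_iter):
        Lr = L @ R
        u, s, vt = np.linalg.svd(
            L.T @ (Lr ** 3 - Lr @ np.diag((Lr ** 2).sum(axis=0)) / p))
        R = u @ vt                    # best orthogonal rotation this step
        d_new = s.sum()
        if d_new < d * (1 + tol):
            break                     # criterion stopped improving
        d = d_new
    return L @ R
```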
For a better idea of how the variables classify under each factor, compare the two tables: the Factor Pattern given above and the Rotated Factor Pattern given below.
Based on the initial factor pattern and rotated factor pattern, we list the variables for each
factor as follows:
Observe the loadings of variable X5 in the initial factor pattern and the rotated factor pattern to get a good idea of cross-loadings.
Also note that the total variance explained by the five factors remains the same at 71.4% (i.e. (3.164 + 1.798 + ... + 1.347) × 100 / 13). Also note the diminishing importance of the factors given in the table below.
Final communality estimates for the 13 variables:
0.860 0.708 0.790 0.649 0.759 0.708 0.704 0.604 0.502 0.780 0.720 0.661 0.836
Observe from the table on final communality estimates that the total communality remains at 9.281 (0.860 + 0.708 + ... + 0.836). Compare these communalities with those obtained from the loadings given in the earlier table on the Factor Pattern.
The correlation matrix gives a fair idea about the relationship among the variables and their contribution to clustering the population.
Example:
In the following table, we note that three canonical variables are identified, and the first two eigenvalues account for a total of 93.21% of the variation in the data (see the last column of the table below).
In the table below, the first column is the number of clusters (NCL), the second and third columns are the observations/clusters merged to form a new cluster, the fourth column represents the number of observations in the new cluster formed, SPRSQ stands for Semi-Partial R-Square, RSQ stands for R-Square, and the last column indicates whether there is a tie. A tie at the initial stages does not affect the clustering much, but a tie in the middle or at the end of the clustering stages impacts the clustering, as a tie indicates other competing observations that could join a cluster. In such cases we may consider permutations of the observations/clusters to identify the best way to cluster the given data.
Cluster History
Since the objective of cluster analysis is to form homogeneous groups, the Root Mean Squared Pooled Standard Deviation of a cluster should be as small as possible. SPRSQ (Semi-Partial R-Square) measures the loss of homogeneity due to combining two groups or clusters into a new group or cluster. The number of clusters is identified by reading the SPRSQ values: intuitively, SPRSQ jumps to a high value when two or more heterogeneous groups are combined. Therefore, we need to observe the jumps in the SPRSQ column. We notice jumps at numbers of clusters (NCL) Seven (from 0.028 to 0.0461), Three (from 0.0603 to 0.1309) and One (from 0.1593 to 0.2893). Therefore, we have two choices for the number of clusters: three clusters or seven clusters. Ideally we group the observations into 3 or 4 clusters, so we go with clustering the data into THREE clusters in this example. This can be observed in the Dendrogram given below.
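A small sketch of reading these jumps programmatically (only the SPRSQ values quoted above are used; the intermediate NCL values are assumed for illustration):

```python
# Flag merges where SPRSQ jumps sharply relative to the previous merge;
# the flagged NCL values are candidate choices for the number of clusters.
def sprsq_jumps(ncl, sprsq, ratio=1.5):
    """ncl: numbers of clusters; sprsq: SPRSQ at each merge (same order)."""
    return [n for n, prev, cur in zip(ncl[1:], sprsq, sprsq[1:])
            if cur > ratio * prev]

# Partial column, using the jumps quoted above (NCL sequence assumed):
print(sprsq_jumps([8, 7, 4, 3, 2, 1],
                  [0.028, 0.0461, 0.0603, 0.1309, 0.1593, 0.2893]))
# -> [7, 3, 1], matching the jumps noted in the text
```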
Reading from the left of the above Dendrogram, we list the observations in each cluster as
follows.
Constraints

Cell    Name                               Final   Shadow   Constraint   Allowable   Allowable
                                           Value   Price    R.H. Side    Increase    Decrease
$B$11   Fan Motors Quantity Used           200     31       200          80          40
$B$12   Cooling coils Quantity Used        320     32       320          80          120
$B$13   Manufacturing time Quantity Used   2080    0        2400         1E+30       320
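To read this sensitivity report: the shadow price is the change in the objective per unit increase in a constraint's right-hand side, valid only within the allowable increase/decrease. A hedged sketch of that arithmetic (the objective itself is not shown in this extract, so only the change is computed):

```python
# Change in the objective for an RHS change within the allowable range.
def objective_change(shadow_price, rhs_delta, allow_inc, allow_dec):
    if -allow_dec <= rhs_delta <= allow_inc:
        return shadow_price * rhs_delta
    raise ValueError("outside the allowable range; shadow price not valid")

# e.g. 40 more fan motors (within the allowable increase of 80):
print(objective_change(31, 40, 80, 40))   # objective changes by 31*40 = 1240
```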