
Synopsis of QM2 Interpretation of Output

I. Multiple Regression Model:


Model:

y = β₀ + β₁x₁ + β₂x₂ + ⋯ + βₚxₚ + ε

Note that there are p slope parameters (the βⱼ) and an intercept (β₀).

H₀: β₁ = β₂ = ⋯ = βₚ = 0

H₁: Not all βⱼ are zero

Source                DF          Sum of Squares   Mean Square   F Value   Pr > F
Model or Regression   p           A                D             F         <.0001
Error                 n − p − 1   B                E
Total                 n − 1       C

A = Sum of squares due to regression model = SSR

B = Sum of squares due to error = SSE

C = Total sum of squares = SST

C=A+B [Since, SST = SSR + SSE]

D=A/p [MSR = SSR /df due to regression]

E = B / (n-p-1) [MSE= SSE / df due to error]

F=D/E [F = MSR / MSE]

Root Mean Squared Error / Standard error of the estimate = √MSE

Interpretation of the p-value in ANOVA: If the p-value (i.e. Pr > F) is smaller than the level of significance (α), then we can reject H₀. In other words, the model is significant, i.e. there is a significant relationship between the dependent variable (response variable) and the set of independent variables (predictors / explanatory variables).
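
A minimal Python sketch of this bookkeeping, with simulated data (the variable names and numbers are illustrative, not the SAS output above):

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, p = 50, 2                                  # n observations, p predictors
X = rng.normal(size=(n, p))
y = 3.0 * X[:, 0] + 0.1 * X[:, 1] + rng.normal(size=n)

Xd = np.column_stack([np.ones(n), X])         # add the intercept column
beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
y_hat = Xd @ beta

SST = np.sum((y - y.mean()) ** 2)             # C: total sum of squares
SSE = np.sum((y - y_hat) ** 2)                # B: error sum of squares
SSR = SST - SSE                               # A: regression sum of squares
MSR = SSR / p                                 # D
MSE = SSE / (n - p - 1)                       # E
F = MSR / MSE
p_value = stats.f.sf(F, p, n - p - 1)         # Pr > F
root_mse = np.sqrt(MSE)                       # standard error of the estimate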

Influence of an Independent variable on Dependent variable:

Note that the F-test is used to test the significance of the overall relationship (i.e. the model), while the t-test is used to test the individual influence of an independent variable on the dependent variable (i.e. to test the significance of the individual betas).



To know the influence of a predictor on the response variable, we compute the standardized estimates of the BETAs. A larger standardized estimate of BETA indicates that the corresponding variable has a greater influence on the dependent variable.
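
As a rough illustration (only the two slopes below come from the example table later in this section; the standard deviations are hypothetical, picked so the results land near the table's standardized estimates), a standardized estimate is the slope rescaled by s_x / s_y:

import numpy as np

b = np.array([3.24751, 0.09251])   # unstandardized slopes of x1, x2
s_x = np.array([1.20, 11.40])      # hypothetical std. deviations of x1, x2
s_y = 4.36                         # hypothetical std. deviation of y
b_std = b * s_x / s_y              # approx. [0.894, 0.242], unit-free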

Goodness of Fit of the Model:

Coefficient of Determination of the model, R² = SSR / SST.

Interpretation of R²: A value close to 1 indicates goodness of fit of the model. Adjusted R-Square is preferred to R-Square, as Adjusted R² carries a penalty for having too many independent variables in the regression model. Even if we get a high R-Square, the significance of the model has to be tested using the F-test to conclude on the overall fitness of the model to the given data.

Adjusted R² = 1 − (1 − R²) × (n − 1) / (n − p − 1)
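
For instance (numbers chosen only for illustration), with R² = 0.90, n = 30 and p = 3: Adjusted R² = 1 − (1 − 0.90) × 29 / 26 ≈ 0.888, slightly below R², which is the penalty for carrying three predictors.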

Testing Multicollinearity in Multiple Regression:

We use tolerance and Variance Inflation Factor (VIF) to check multicollinearity.

Toleranceₖ = 1 − Rₖ²

Where Rₖ² is the coefficient of determination for the regression of the kth predictor variable (treating the kth explanatory variable as dependent on the remaining (p − 1) explanatory variables).

VIFₖ = 1 / Toleranceₖ = 1 / (1 − Rₖ²)

Interpretation: If Rₖ² is large, then the tolerance is very small, which implies that the VIF is very large. Hence, the kth variable is correlated with the other predictors if its VIF is large. As a rule of thumb, we say there is significant multicollinearity due to a variable if its VIF is larger than 10 (i.e. its tolerance is less than 0.10, which implies Rₖ² is larger than 0.90).
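
A minimal sketch of how tolerance and VIF could be computed by hand (Python/NumPy; the data are simulated, so the printed numbers are illustrative):

import numpy as np

def tolerance_vif(X):
    # For each column x_k of X, regress x_k on the remaining columns
    # (with intercept); return (tolerance, VIF) = (1 - R_k^2, 1/(1 - R_k^2)).
    n, p = X.shape
    out = []
    for k in range(p):
        xk = X[:, k]
        Xd = np.column_stack([np.ones(n), np.delete(X, k, axis=1)])
        beta, *_ = np.linalg.lstsq(Xd, xk, rcond=None)
        r2_k = 1 - np.var(xk - Xd @ beta) / np.var(xk)
        out.append((1 - r2_k, 1 / (1 - r2_k)))
    return out

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
X[:, 2] += 0.9 * X[:, 0]          # induce mild collinearity
print(tolerance_vif(X))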

Example:

Parameter Estimates

Variable    DF   Parameter   Standard   t Value   Pr > |t|   Standardized   Tolerance   Variance
                 Estimate    Error                           Estimate                   Inflation
Intercept   1    -8.62347    5.90982    -1.46     0.1785     0              .           0
x2          1    0.09251     0.03912    2.36      0.0423     0.24063        0.98511     1.01511
x1          1    3.24751     0.36993    8.78      <.0001     0.89323        0.98511     1.01511

X1 has a greater influence on the dependent variable, as the standardized estimate of x1 (0.89323) is larger than the standardized estimate of x2 (0.24063).



Also note that the model does not suffer from multicollinearity, as the Variance Inflation Factors (VIF) of both variables are quite small (less than 10).

Important Note: In regression, the dependent variable is metric, whereas the independent variables can be of any type (metric / categorical).

II. Discriminant Analysis


In Discriminant Analysis, we identify a linear function (analogous to a regression line) that separates the whole population into two groups. Therefore, we need to test the significance of the discriminant model (i.e. the function that discriminates the two groups), as is done in regression analysis.

Step1: Test the discrimination ability of each independent variable using an F-test. Here
we test the equality of means of independent variables in the two groups.

Example:

Table 1: Univariate Test Statistics (F Statistics, Num DF = 1, Den DF = 198)

Variable               Total     Pooled    Between   R-Square   RSq/(1−RSq)   F Value   Pr > F
                       Std Dev   Std Dev   Std Dev
Product Quality        1.383     1.1634    1.0614    0.296      0.4204        83.24     <.0001
Complaint Resolution   1.21      1.213     0.00623   0          0             0         0.9591
Advertising            1.1471    1.1209    0.3613    0.0499     0.0525        10.39     0.0015
Sales force Image      1.1286    1.047     0.6034    0.1436     0.1677        33.21     <.0001
Competitive Pricing    1.5813    1.2808    1.3144    0.3472     0.5319        105.32    <.0001
Warranty and Claims    0.8753    0.8774    0.0212    0.0003     0.0003        0.06      0.8094

We note that all variables except Complaint Resolution and Warranty & Claims have means that are significantly different between the two groups. That is, there exists a significant difference in the means of Product Quality in group 1 and group 2, and similarly there exist significant differences in the means of the other significant variables across the two groups. Hence, Product Quality, Advertising, Sales force Image and Competitive Pricing can be used to discriminate the two groups, while Complaint Resolution and Warranty & Claims need to be dropped from the discriminant model. Also note that each row in the above table is a summary of the ANOVA table for the respective variable.
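
Each row of Table 1 is, in effect, a one-way ANOVA comparing the variable's means in the two groups; a minimal sketch of one such test (Python/SciPy, with simulated data standing in for the survey responses):

import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(2)
g1 = rng.normal(8.0, 1.2, size=81)    # hypothetical Product Quality scores, group 1
g2 = rng.normal(6.5, 1.1, size=119)   # hypothetical scores, group 2
F, p = f_oneway(g1, g2)               # reproduces one row of the univariate table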



Step2: Compute Wilks' Lambda to know the model significance in discriminating the groups.

This is very similar to the ANOVA of a regression model. A small Wilks' Lambda indicates the significance of the discriminant model. Alternatively, if the p-value of Wilks' Lambda is very small, it indicates the significance of the discriminant model.

Example:

Table 2: Multivariate Statistics and Exact F Statistics


S=1 M=2 N=95.5
Statistic Value F Value Num DF Den DF Pr > F
Wilks' Lambda 0.48824801 33.72 6 193 <.0001

In the above table we observe that Wilks' Lambda is significant (Pr > F < .0001). Therefore, the discriminant model we developed is useful in discriminating the groups.
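
For the two-group case, the exact F in this table can be recovered from Λ itself: F = ((1 − Λ) / Λ) × (n − p − 1) / p, with Num DF = p and Den DF = n − p − 1. Here Λ = 0.48824801, n = 200 and p = 6 predictors, so F = (0.51175 / 0.48825) × (193 / 6) ≈ 33.72, matching the output.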

Step3: Classification using the Discriminant model.

We obtain classification scores by supplying the values of the independent variables as inputs to the discriminant functions. If we have two groups, then we get two discriminant functions. By comparing the two scores, we classify a new observation into the group whose discriminant function yields the higher value.

Example:

We have two Discriminant functions for each of the two groups, National Brand and Private
label. A new observation with values on Product Quality, Advertising, Sales force Image and
Competitive Pricing is supplied to the two functions. If the Discriminant function for Private
Label yields a higher classification score as compared to National brand, then classify the
new observation into Private Label.

Table 4: Linear Discriminant Function for Brand Shoppers


Variable National Brand Private Label
Constant -75.74192 -79.47109
Product Quality (P) 6.56466 5.54115
Advertising (A) 1.27597 1.30033
Sales force Image (S) 1.15046 2.10374
Competitive Pricing (C) 5.37316 6.37743

Discriminant function for National Brand = −75.74192 + 6.56466 P + 1.27597 A + 1.15046 S + 5.37316 C

Discriminant function for Private Label = −79.47109 + 5.54115 P + 1.30033 A + 2.10374 S + 6.37743 C



Step4: Performance of Discriminant function:

The performance of a discriminant function is evaluated using the Hit Ratio, the proportion of correct classifications.

Example:

Table 3: Number of Observations and Percent Classified into Brand Shoppers

From Brand Shoppers   National Brand   Private Label   Total
National Brand        x                4               81
Private Label         20               y               119
Total                 97               103             n = 200

x = (81 − 4) or (97 − 20) = 77 and y = (119 − 20) or (103 − 4) = 99.


Therefore, Hit Ratio = (x + y) / n = (77 + 99) / 200 = 0.88 = 88%, which seems a good prediction performance by the model.
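
A minimal Python sketch of Steps 3 and 4 together, using the Table 4 coefficients above (the new shopper's ratings are hypothetical):

import numpy as np

# Linear discriminant function coefficients from Table 4: [constant, P, A, S, C].
coef = {
    "National Brand": np.array([-75.74192, 6.56466, 1.27597, 1.15046, 5.37316]),
    "Private Label":  np.array([-79.47109, 5.54115, 1.30033, 2.10374, 6.37743]),
}

x_new = np.array([1.0, 7.5, 4.0, 5.2, 6.8])   # hypothetical [1, P, A, S, C]
scores = {g: c @ x_new for g, c in coef.items()}
predicted = max(scores, key=scores.get)        # group with the higher score

hit_ratio = (77 + 99) / 200                    # 0.88, from Table 3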

Important Note: In Discriminant Analysis, the dependent variable is always categorical, while the independent variables are continuous or binary.

III. Factor Analysis


Factor Analysis is used to identify latent variables (factors), which express the underlying relationships among variables. Note that for grouping variables we use Factor Analysis, while for grouping observations we use Cluster Analysis.

Step1: Check whether the variables are factorable.

The correlation matrix of the variables gives a rough idea of whether the variables are factorable; high correlations among the variables are expected. We use the Kaiser-Meyer-Olkin Measure of Sampling Adequacy (KMO) and Bartlett's test of sphericity to confirm further the applicability of Factor Analysis, as they test the strength of the relationships among the variables. Interpretive adjectives for the Kaiser-Meyer-Olkin Measure of Sampling Adequacy are: in the 0.90s, marvellous; in the 0.80s, meritorious; in the 0.70s, middling; in the 0.60s, mediocre; in the 0.50s, miserable; and below 0.50, unacceptable. If both test statistics are significant, then we can proceed to Factor Analysis.
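
KMO is usually read off the software output, but Bartlett's test is easy to compute directly from the sample correlation matrix; a minimal sketch (Python, using the standard chi-square approximation):

import numpy as np
from scipy.stats import chi2

def bartlett_sphericity(R, n):
    # Test H0: the p x p correlation matrix R is an identity matrix,
    # given a sample of size n. Returns (chi-square statistic, p-value).
    p = R.shape[0]
    stat = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
    df = p * (p - 1) / 2
    return stat, chi2.sf(stat, df)

# Usage: R = np.corrcoef(X, rowvar=False) for an (n x p) data matrix X.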

Example:

Kaiser's Measure of Sampling Adequacy: Overall MSA = 0.640

X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13

0.712 0.748 0.511 0.588 0.484 0.725 0.512 0.473 0.730 0.645 0.758 0.429 0.414



As the overall KMO = 0.64 is greater than 0.6, one can proceed with Factor Analysis. However, variables X5, X8, X12 and X13 have poor Kaiser's measures. If we are using 5 factors, then there should be at least 15 variables, as we need a minimum of 3 variables for each factor. If we have too few variables satisfying the Kaiser criterion, we need to increase the sample size or include more variables.

Step2: Interpretation of Communalities / variance explained by factors

'Communalities' tell us how much of the variance in each of the original variables is explained by the extracted factors. Higher communalities are desirable. If the communality for a variable is less than 50%, it is a candidate for exclusion from the analysis, because the factor solution then contains less than half of the variance in the original variable, and the explanatory power of that variable might be better represented by the individual variable. If we do exclude a variable for a low communality (less than 0.50), we should re-run the factor analysis without that variable before proceeding.

Example:

Variance Explained by Each Factor

Factor1 Factor2 Factor3 Factor4 Factor5

3.247 2.103 1.643 1.185 1.103

The above values, 3.246307, 2.103337, 1.644186, 1.185304 and 1.102832, are also the first five eigenvalues extracted by the principal component method. Note that mathematically the number of factors equals the number of variables (in this example we have 13 variables); however, we analyse only the few factors selected by the criterion that the eigenvalue be larger than 1, or by the scree plot.
In the above table, the variance explained by the first factor is 24.97% (= 3.246307 / 13 × 100) and the total variance explained by the five factors is 71.4% (= (3.247 + 2.103 + 1.643 + 1.185 + 1.103) / 13 × 100). This can also be obtained from the table below: the cumulative value in row 5 is 0.714, i.e. 71.4%, and the proportion in the first row is 0.250, i.e. the 25% explained by the first factor, which is close to the 24.97% we computed above.
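
The Proportion and Cumulative columns of the table below follow from the eigenvalues alone; a minimal sketch in Python:

import numpy as np

# Eigenvalues of the 13-variable correlation matrix (from the table below).
eig = np.array([3.247, 2.103, 1.643, 1.185, 1.103, 0.850, 0.711,
                0.562, 0.495, 0.407, 0.303, 0.220, 0.171])
proportion = eig / eig.sum()      # eig.sum() = 13, the number of variables
cumulative = proportion.cumsum()  # reaches 0.714 at the fifth factor
n_factors = int((eig > 1).sum())  # Kaiser criterion: 5 factors here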



Eigenvalues of the Correlation Matrix: Total = 13, Average = 1

     Eigenvalue   Difference   Proportion   Cumulative
 1   3.247        1.144        0.250        0.250
 2   2.103        0.460        0.162        0.411
 3   1.643        0.458        0.126        0.538
 4   1.185        0.0828       0.091        0.629
 5   1.103        0.252        0.085        0.714
 6   0.850        0.140        0.065        0.779
 7   0.711        0.149        0.055        0.834
 8   0.562        0.067        0.043        0.877
 9   0.495        0.088        0.038        0.915
10   0.407        0.104        0.031        0.947
11   0.303        0.082        0.023        0.970
12   0.220        0.049        0.017        0.987
13   0.171                     0.013        1.000

From the table below on the initial factor solution, we note that the communality of variable X1 (i.e. the total variance in variable X1 captured by the five factors) is 0.860643, or 86% (the sum of squares of the first row, i.e. the row for variable X1, in the table below). Further, if we add the communalities of all the variables, we get 9.281 (i.e. 0.860643 + …).

Also note that each entry in the table below is a factor loading, which is the correlation between a variable and a factor. For example, as the first four variables are highly correlated with Factor1 (0.888, 0.788, 0.774 and 0.770), they are affected strongly by Factor1. Also note that the correlation between any two factors is always zero, since we extract orthogonal factors (the other option being oblique rotation).
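
A minimal sketch of the communality arithmetic, using the first two rows of the Factor Pattern table below (the sums of squares come to about 0.861 and 0.707; the small gap from the reported 0.708 is loading round-off):

import numpy as np

L = np.array([
    [0.888, -0.105, -0.203, -0.037,  0.136],   # loadings of X1
    [0.788,  0.156,  0.140,  0.019, -0.204],   # loadings of X2
])
communality = (L ** 2).sum(axis=1)   # row sums of squared loadings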



Factor Pattern

      Factor1   Factor2   Factor3   Factor4   Factor5
X1     0.888    -0.105    -0.203    -0.037     0.136
X2     0.788     0.156     0.140     0.019    -0.204
X3     0.774     0.218     0.077     0.192    -0.139
X4     0.770    -0.136    -0.312    -0.035    -0.098
X5     0.581    -0.469     0.210    -0.056     0.418
X6    -0.101     0.837    -0.001    -0.281    -0.023
X7    -0.049     0.600     0.500     0.298    -0.064
X8     0.256     0.536    -0.528     0.336    -0.122
X9     0.402     0.418     0.392    -0.103     0.028
X10    0.141    -0.041     0.625     0.293     0.403
X11   -0.003     0.235    -0.571     0.014     0.472
X12    0.119     0.360     0.067    -0.753     0.348
X13   -0.183     0.270    -0.180     0.445     0.559

Note the cross-loadings of variables on different factors. For example, variable X5 has loadings on Factor1, Factor2 and Factor5, while variable X13 has high loadings on Factor4 and Factor5. This leads to a dilemma about the right group of variables under a factor, which can be resolved to a great extent by factor rotation.

Step3: Factor Rotation

The idea of rotation is to reduce the number of factors on which the variables under investigation have high loadings (i.e. cross-loadings). Rotation does not actually change anything, but it makes the interpretation of the analysis easier. In other words, rotation helps us to classify each variable under a factor with much more ease.
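
Rotation itself is a small computation. Below is a compact, textbook-style varimax sketch in Python (an illustration of the idea, not SAS's exact routine; PROC FACTOR with ROTATE=VARIMAX performs the equivalent optimization):

import numpy as np

def varimax(L, gamma=1.0, max_iter=50, tol=1e-6):
    # Rotate a loading matrix L (variables x factors) to maximize the
    # varimax criterion; returns the rotated loadings.
    p, k = L.shape
    R = np.eye(k)
    d = 0.0
    for _ in range(max_iter):
        Lam = L @ R
        u, s, vt = np.linalg.svd(
            L.T @ (Lam ** 3 - (gamma / p) * Lam @ np.diag(np.diag(Lam.T @ Lam))))
        R = u @ vt
        d_new = s.sum()
        if d_new < d * (1 + tol):   # stop when the criterion stops improving
            break
        d = d_new
    return L @ R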

For a better idea on classifying the variables under each factor, compare the two tables
Factor Pattern given above and Rotated Factor Pattern given below.



Rotated Factor Pattern

      Factor1   Factor2   Factor3   Factor4   Factor5
X1     0.875    -0.192     0.191     0.107     0.096
X2     0.803    -0.271    -0.022    -0.007    -0.012
X3     0.784     0.301     0.005    -0.021    -0.050
X4     0.781     0.243     0.025    -0.188     0.054
X5    -0.068     0.834    -0.057     0.025    -0.002
X6     0.343     0.510     0.096    -0.097     0.325
X7     0.456    -0.210     0.725    -0.014     0.056
X8     0.008     0.455     0.665     0.069    -0.081
X9     0.402     0.162    -0.578     0.474    -0.112
X10   -0.170     0.183     0.097     0.753    -0.102
X11    0.048    -0.231    -0.148     0.694     0.210
X12    0.034     0.004     0.050     0.021     0.912
X13   -0.069     0.467    -0.477     0.150     0.564

Based on the initial factor pattern and rotated factor pattern, we list the variables for each
factor as follows:

Based on Factor Pattern table      Based on Rotated Factor Pattern table

F1: X1, X2, X3, X4, X5             F1: X1, X2, X3, X4
F2: X6, X7, X8                     F2: X5, X6
F3: X10, X11                       F3: X7, X8, X9
F4: X12                            F4: X10, X11
F5: X13                            F5: X12, X13

Observe the loadings of variable X5 in the initial factor pattern and the rotated factor pattern to get a good idea of cross-loadings.

Also note that the total variance explained by the five factors remains the same at 71.4% (i.e. (3.164 + 1.798 + … + 1.347) × 100 / 13). Also note the diminishing importance of the factors given in the table below.



Variance Explained by Each Factor

Factor1 Factor2 Factor3 Factor4 Factor5

3.164 1.798 1.613 1.359 1.347

Final Communality Estimates: Total = 9.281

X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13

0.860 0.708 0.790 0.649 0.759 0.708 0.704 0.604 0.502 0.780 0.720 0.661 0.836

Observe from the table on final communality estimates that the total communality remains at 9.281 (0.860 + 0.708 + … + 0.836). Compare these with the communalities obtained from the loadings given in the earlier Factor Pattern table.

IV. CLUSTER ANALYSIS


Cluster Analysis is a very useful tool for segregating a population into homogeneous groups known as clusters. In Discriminant Analysis we use a classification function, the linear discriminant function, to classify the subjects of a population into homogeneous groups that are known beforehand. In Cluster Analysis such groups are unknown beforehand, and we attempt to identify them.

Step1: Obtain the correlation matrix and its eigenvalues

The correlation matrix gives a fair idea about the relationships among the variables and their contribution to clustering the population.

Example:

In the following table, we note that three canonical variables are identified and that the first two eigenvalues account for a total of 93.21% of the variation in the data (see the last column of the table below).



Step2: Determination of Number of Clusters

In the table below, the first column (NCL) is the number of clusters; the second and third columns are the observations/clusters merged to form a new cluster; the fourth column (FREQ) is the number of observations in the new cluster; SPRSQ stands for Semi-Partial R-Square; RSQ stands for R-Square; and the last column indicates whether there is a tie. A tie at the initial stages does not affect the clustering much, but a tie in the middle or at the end of the clustering stages impacts the clustering, as a tie indicates other competing observations could join a cluster. In such cases we may consider permutations of observations/clusters to identify the best way to cluster the given data.

Cluster History

NCL Clusters Joined FREQ SPRSQ RSQ Tie


19 OB2 OB3 2 0.0074 0.993 T
18 CL19 OB11 3 0.0091 0.984
17 OB7 OB8 2 0.0099 0.974 T
16 OB13 OB17 2 0.0099 0.964
15 OB14 OB15 2 0.0148 0.949
14 CL17 OB9 3 0.0165 0.932
13 OB19 OB20 2 0.0173 0.915
12 CL18 OB5 4 0.0193 0.896
11 OB16 OB18 2 0.0198 0.876
10 CL16 CL15 4 0.0247 0.851
9 CL12 OB4 5 0.0274 0.824
8 OB6 CL14 4 0.028 0.796
7 CL10 CL11 6 0.0461 0.75
6 CL7 CL13 8 0.0533 0.697
5 OB1 OB12 2 0.0568 0.64 T
4 CL9 CL8 9 0.0603 0.58
3 CL5 OB10 3 0.1309 0.449
2 CL4 CL6 17 0.1593 0.289
1 CL3 CL2 20 0.2893 0
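
An agglomeration history of this kind can be reproduced programmatically; a minimal SciPy sketch (the data matrix X is simulated here, so the merge order will not match the table above):

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster, dendrogram

rng = np.random.default_rng(3)
X = rng.normal(size=(20, 3))        # hypothetical 20 observations, 3 variables

Z = linkage(X, method="ward")       # merge history, analogous to the table above
labels = fcluster(Z, t=3, criterion="maxclust")   # cut the tree into 3 clusters
# dendrogram(Z) draws the tree from which cluster membership is read off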

We expect a large RSQ at the beginning of clustering, as we have greater homogeneity of observations in each cluster, and a zero RSQ at the final stage of clustering, as we have combined heterogeneous observations into a single cluster (observe the high RSQ (0.993) when we use 19 clusters and the zero RSQ when we have only 1 cluster).

Since the objective of cluster analysis is to form homogeneous groups, the Root Mean Squared Pooled Standard Deviation of a cluster should be as small as possible. SPRSQ (Semi-Partial R-Square) is a measure of the homogeneity of merged clusters: SPRSQ is the loss of homogeneity due to combining two groups or clusters to form a new group or cluster.



Thus, the SPRSQ value should be small to imply that we are merging two homogeneous
groups.

The number of clusters is identified by reading the values of SPRSQ. Intuitively, SPRSQ jumps to a high value if we are combining two or more heterogeneous groups. Therefore, we need to observe the jumps in the SPRSQ column. We notice jumps at NCL = 7 (from 0.028 to 0.0461), NCL = 3 (from 0.0603 to 0.1309) and NCL = 1 (from 0.1593 to 0.2893). Therefore, we have two choices for the number of clusters: three clusters or seven clusters. Ideally we group the observations into 3 or 4 clusters, so we go with clustering the data into THREE clusters in this example. This can be observed in the dendrogram given below.
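
Reading the jumps can also be automated; a minimal Python sketch using the SPRSQ column from the Cluster History above:

import numpy as np

# SPRSQ values in merge order, i.e. from NCL = 19 down to NCL = 1.
sprsq = np.array([0.0074, 0.0091, 0.0099, 0.0099, 0.0148, 0.0165, 0.0173,
                  0.0193, 0.0198, 0.0247, 0.0274, 0.0280, 0.0461, 0.0533,
                  0.0568, 0.0603, 0.1309, 0.1593, 0.2893])
ncl = np.arange(19, 0, -1)      # clusters remaining after each merge
jumps = np.diff(sprsq)          # rise in SPRSQ at each successive merge
# jumps[i] is the rise when going from ncl[i] to ncl[i + 1] clusters;
# clear rises appear on entering 7, 3 and 1 clusters, as read in the text.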

Reading from the left of the above Dendrogram, we list the observations in each cluster as
follows.

Cluster No. Observations


1 OB13,OB17, OB14, OB15, OB16, OB18, OB19, and OB20
2 OB2, OB3, OB11, OB5, OB4, OB6, OB7, OB8 and OB9
3 OB1, OB12 and OB10



V. LINEAR PROGRAMMING PROBLEM
Consider problem 12 on page 527: the problem is about deciding the quantities of the following models of air coolers to be manufactured so as to maximize profit.

E: Economy Model; S: Standard Model; D: Delux Model

Max 63E + 95S + 135D

s.t.
1E + 1S + 1D ≤ 200      (Fan Motors)
1E + 2S + 4D ≤ 320      (Cooling Coils)
8E + 12S + 14D ≤ 2400   (Manufacturing Time)
E, S, D ≥ 0
Adjustable Cells
Final Reduced Objective Allowable Allowable
Cell Name Value Cost Coefficient Increase Decrease
$B$8 No. of AirConditioners Economy Model 80 0 63 12 15.5
$C$8 No. of AirConditioners Standard Model 120 0 95 31 8
$D$8 No. of AirConditioners Delux Model 0 -24 135 24 1E+30

Constraints
Final Shadow Constraint Allowable Allowable
Cell Name Value Price R.H. Side Increase Decrease
$B$11 Fan Motors Quantity Used 200 31 200 80 40
$B$12 Cooling coils Quantity Used 320 32 320 80 120
$B$13 Manufacturing time Quantity Used 2080 0 2400 1E+30 320

Note that 1E+30 stands for INFINITY, i.e. No Limit
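
For readers outside Excel, the same optimum and shadow prices can be checked in a few lines of Python; a sketch using SciPy's linprog (the duals attribute assumes the HiGHS solver in recent SciPy versions):

import numpy as np
from scipy.optimize import linprog

# Maximize 63E + 95S + 135D by minimizing the negated objective.
c = [-63, -95, -135]
A_ub = [[1, 1, 1],      # Fan Motors
        [1, 2, 4],      # Cooling Coils
        [8, 12, 14]]    # Manufacturing Time
b_ub = [200, 320, 2400]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 3, method="highs")
print(res.x)                    # optimal plan: [80, 120, 0]
print(-res.fun)                 # maximum profit: 16440
print(-res.ineqlin.marginals)   # shadow prices: [31, 32, 0]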



A) Current Optimal Solution to the problem is in the column Final Value: #Economy models = 80, #Standard models = 120, and #Delux
models = 0.
B) Maximum Profit = Sumproduct(Final Value, Objective Coefficient) = $16440
C) Note that the decision variable $D$8 is not used in the optimal solution, because its Final Value = 0.
D) The current solution remains optimal as long as the objective coefficients stay in the ranges: 47.5 (= 63 − 15.5) ≤ coefficient of E ≤ 75 (= 63 + 12); 87 ≤ coefficient of S ≤ 126; coefficient of D ≤ 159. Note that there is no lower limit for the obj. coefficient of the Delux model, as #Delux models = 0 in the final solution. The allowable changes in a coefficient are valid provided all other coefficients remain fixed at their current values, and the presence of a 0 in any Allowable Increase or Decrease indicates that alternative optimal solutions exist.
E) The current solution will still be optimal for these combinations of obj. coefficients: (48, 95, 135), (70, 95, 135), (63, 100, 135) or (63, 95, 150). Note that each changes one obj. coefficient at a time while satisfying the ranges given above.
F) If we wish to make simultaneous changes in the obj. coefficients, we use the 100% rule. For example, consider the obj. coefficients (70, 100, 135). The % change in the obj. coefficient of E is 58.33% [= (70 − 63) × 100 / (allowable increase = 12)] and the % change in the obj. coefficient of S is 16.13% [= (100 − 95) × 100 / (allowable increase = 31)], implying a total change of 74.46% (= 58.33% + 16.13%), which is less than 100%, so the current solution stays optimal. Similarly, you can try simultaneous decreases/increases in obj. coefficients subject to a total change of less than 100%.
G) The Reduced Cost tells you by how much the profit margin of this variable would have to improve for it to be optimal to use that
variable. Here it is $24/unit.
H) Reduced Cost: if the objective coefficient of decision variable $D$8 improved to $159 [= 135 − (−24)], then this variable would be included in the optimal solution, i.e. its Final Value would be > 0. Also note that you do not have the choice of decreasing its profit per unit, as that will not impact its inclusion in the final solution. Note that a negative reduced cost means the coefficient has to increase, since subtracting a negative amounts to adding.
I) Shadow Price: measures by how much the optimal objective value (here, total profit) would change if the constraint's right-hand side changed by one unit. For example, if the number of Fan Motors is increased to 201, then the profit increases to $16471 (= 16440 + 31). Similarly, if the number of Cooling Coils available is decreased to 300, then the total profit would go down by 32 × 20 = $640.

Compiled by Prof. KVSSN Narasimha Murty
