Institute of Rural Management Anand: PGDM-RM41 - Term III - End Term Examination
ANS 1-
A) > getwd()
[1] "C:/Users/nidhi/Documents"
> setwd("C:/Users/nidhi/Downloads")
> library(cluster)
> library(factoextra)
> library(magrittr)
> food=read.csv("food-texture.csv",header = TRUE)
> food=food[,-1]
> foodscale=scale(food)
##K-means clustering
#PCA
apply(food,2,mean)
apply(food, 2, var)
pr.out=prcomp(food , scale=TRUE)
pr.out$center
pr.out$scale
pr.out$rotation
pr.var=(pr.out$sdev)^2
Institute of Rural Management Anand
PGDM-RM41 – Term III – End Term Examination
< Business Analytics >
< 7 APRIL 2021 >
< Nidhi Tulsyan, P41088 >
pve=pr.var/sum(pr.var)
pve
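The proportion of variance explained (PVE) computed above can be tried end to end on a built-in dataset. This is only a sketch: USArrests stands in for the food-texture data (an assumption, since the exam file is not available here), and the names pca, pr_var and cum_pve are illustrative.

```r
# Sketch: PVE per principal component on the built-in USArrests data,
# used purely as a stand-in for food-texture.csv.
pca <- prcomp(USArrests, scale. = TRUE)

pr_var <- pca$sdev^2            # variance of each principal component
pve <- pr_var / sum(pr_var)     # share of total variance per component
cum_pve <- cumsum(pve)          # running total, useful for choosing how many PCs to keep

print(round(pve, 3))
print(round(cum_pve, 3))
```

The cumulative values show how many components are needed to reach a chosen share of the variance.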
fviz_nbclust(food,kmeans,method="silhouette")
The first plot shows a kink at 2, so the optimal number of clusters is 2. The silhouette method
agrees: the average silhouette width is highest at 2, so the optimal number of clusters is again 2.
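The "kink" reasoning can be illustrated in base R by computing the total within-cluster sum of squares for several values of k. This is a minimal sketch on synthetic data (hypothetical, not the food-texture data), and taking the k that follows the largest drop is only a crude stand-in for reading the elbow off the plot.

```r
set.seed(1)
# Hypothetical data: two well-separated groups of 25 points in 2 dimensions.
x <- rbind(matrix(rnorm(50, mean = 0), ncol = 2),
           matrix(rnorm(50, mean = 6), ncol = 2))

ks <- 1:6
# Total within-cluster sum of squares for each candidate k.
wss <- sapply(ks, function(k) kmeans(x, centers = k, nstart = 20)$tot.withinss)

drops <- -diff(wss)               # improvement gained by adding one more cluster
best_k <- which.max(drops) + 1    # k reached by the largest single drop

print(round(wss, 1))
print(best_k)
```

Plotting wss against ks (for example with plot(ks, wss, type = "b")) gives the usual elbow picture.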
B- Hierarchical
> res.hc=hclust(dist(vish),method="complete")
C- There are 2 clusters: cluster 1 has size 16 and cluster 2 has size 34.
The mean of Oil is 16.51 in cluster 1 and 18.65 in cluster 2, so cluster 2 pastries contain more oil than
cluster 1 pastries. People are willing to pay the price for cluster 1.
Cluster 1 pastries are higher in density, fracture, and hardness, while cluster 2 pastries are hard but less
dense and crisper.
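Cluster sizes and per-cluster means like those quoted above can be obtained by cutting the tree and aggregating. The sketch below uses a made-up two-column data frame (toy, with hypothetical Oil and Density values), so its numbers are not the exam results.

```r
set.seed(2)
# Hypothetical stand-in for the pastry data: two variables, two groups.
toy <- data.frame(
  Oil     = c(rnorm(16, 16.5, 0.5), rnorm(34, 18.7, 0.5)),
  Density = c(rnorm(16, 3000, 50),  rnorm(34, 2800, 50))
)

# Complete-linkage hierarchical clustering on the scaled data.
hc <- hclust(dist(scale(toy)), method = "complete")
grp <- cutree(hc, k = 2)          # cluster label for every row

sizes <- table(grp)               # number of rows per cluster
means <- aggregate(toy, by = list(cluster = grp), FUN = mean)

print(sizes)
print(means)
```

Means are computed on the unscaled columns so they stay in the original units.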
D-
ANS 2-
A)- The scenario can be modelled as a Markov chain because a driver in a given zone has
only two options: stay in that zone or move to the next zone. The probability of a driver going
from one state to another depends only on the current state and not on previous states. The
state space is the North zone, the South zone, and the West zone, and the movement evolves over
time as a stochastic process; a process with this property is known as a Markov chain.
B)- The states are the North zone, the South zone, and the West zone, because the movement of the
driver is specified in terms of these zones.
C)-
install.packages("markovchain")
library(markovchain)
> tran_mat=matrix(c(0.3,0.3,0.4,0.4,0.4,0.2,0.5,0.3,0.2),nrow=3,byrow=TRUE)
> tran_mat
     [,1] [,2] [,3]
[1,]  0.3  0.3  0.4
[2,]  0.4  0.4  0.2
[3,]  0.5  0.3  0.2
> disp_trans=new("markovchain",transitionMatrix=tran_mat,states=c("North","South","West"),name="DriverMovement")
> disp_trans
DriverMovement
 A 3 - dimensional discrete Markov Chain defined by the following states:
 North, South, West
 The transition matrix (by rows) is defined as follows:
      North South West
North   0.3   0.3  0.4
South   0.4   0.4  0.2
West    0.5   0.3  0.2
D)-
> steadyStates(disp_trans)
In the steady state, the driver is in the North zone with a probability of about 39%, in the
South zone with a probability of about 33%, and in the West zone with a probability of about
28%.
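The steady-state figures can be cross-checked in base R without the markovchain package by solving pi %*% P = pi together with sum(pi) = 1. This is a sketch of one way to do it; the names P, A, b and steady are illustrative and not part of the original answer.

```r
# Transition matrix from the answer above (rows sum to 1).
P <- matrix(c(0.3, 0.3, 0.4,
              0.4, 0.4, 0.2,
              0.5, 0.3, 0.2),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("North", "South", "West"),
                            c("North", "South", "West")))

# pi %*% P = pi is equivalent to (t(P) - I) %*% pi = 0.
A <- t(P) - diag(3)
A[3, ] <- 1            # replace one redundant equation with sum(pi) = 1
b <- c(0, 0, 1)

steady <- solve(A, b)
names(steady) <- colnames(P)
print(round(steady, 4))
# North 0.3889, South 0.3333, West 0.2778
```

The exact values are 7/18, 1/3 and 5/18, which round to the 39%, 33% and 28% reported above.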
E)-
> current_state=c(0.2,0.45,0.35)
> current_state*disp_trans^2
The percentage of drivers expected in each of these zones after the next trip, that is, after
2 transitions, is:
North= 38.25%
South= 33.45%
West= 28.3%
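The same two-step result follows from plain matrix multiplication: the current distribution is multiplied by the transition matrix twice. A minimal base R sketch, with P and current as illustrative names:

```r
# Transition matrix and current distribution from the answer above.
P <- matrix(c(0.3, 0.3, 0.4,
              0.4, 0.4, 0.2,
              0.5, 0.3, 0.2),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("North", "South", "West"),
                            c("North", "South", "West")))
current <- c(North = 0.20, South = 0.45, West = 0.35)

# Distribution after two transitions: current %*% P^2.
after_two <- current %*% P %*% P
print(round(after_two, 4))
# North 0.3825, South 0.3345, West 0.2830
```

Note that %*% is matrix multiplication; P * P would multiply element by element and give a different, incorrect result.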
ANS3- A)
Residuals:
Coefficients:
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
INTERPRETATION- Price is the independent variable. Price has a significant influence on
Sales at the 95% confidence level, with a p-value less than 0.001. However, the R squared value
is 0.198, which means the model explains only 19.8% of the variation in Sales, and the adjusted
R squared is 0.196. Both values are very low. So the model is statistically significant, but it
is not a good model because its low R squared means it explains little of the variation.
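A predictor can be highly significant while the model still explains little of the variation. The sketch below shows this on synthetic data (the variables price and sales are simulated here and are not the Carseats columns), reading the p-value and R squared from summary().

```r
set.seed(3)
n <- 400
# Hypothetical data: a real but weak price effect buried in a lot of noise.
price <- rnorm(n, mean = 115, sd = 24)
sales <- 13 - 0.05 * price + rnorm(n, sd = 2.5)

fit <- summary(lm(sales ~ price))

p_value <- fit$coefficients["price", "Pr(>|t|)"]   # significance of the slope
r2      <- fit$r.squared                           # share of variation explained
adj_r2  <- fit$adj.r.squared

print(c(p_value = p_value, r2 = r2, adj_r2 = adj_r2))
```

The p-value answers whether the slope differs from zero; R squared answers how much of the variation the model accounts for, so the two can disagree.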
B)-
model2=lm(data=Carseats, Sales ~.)
summary(model2)
Call:
lm(formula = Sales ~ ., data = Carseats)

Residuals:
    Min      1Q  Median      3Q     Max
-2.8692 -0.6908  0.0211  0.6636  3.4115

Coefficients:
                  Estimate Std. Error t value             Pr(>|t|)
(Intercept)      5.6606231  0.6034487   9.380 < 0.0000000000000002 ***
CompPrice        0.0928153  0.0041477  22.378 < 0.0000000000000002 ***
Income           0.0158028  0.0018451   8.565 0.000000000000000258 ***
Advertising      0.1230951  0.0111237  11.066 < 0.0000000000000002 ***
Population       0.0002079  0.0003705   0.561                0.575
Price           -0.0953579  0.0026711 -35.700 < 0.0000000000000002 ***
ShelveLocGood    4.8501827  0.1531100  31.678 < 0.0000000000000002 ***
ShelveLocMedium  1.9567148  0.1261056  15.516 < 0.0000000000000002 ***
Age             -0.0460452  0.0031817 -14.472 < 0.0000000000000002 ***
Education       -0.0211018  0.0197205  -1.070                0.285
UrbanYes         0.1228864  0.1129761   1.088                0.277
USYes           -0.1840928  0.1498423  -1.229                0.220
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residuals:
    Min      1Q  Median      3Q     Max
-2.7728 -0.6954  0.0282  0.6732  3.3292

Coefficients:
                 Estimate Std. Error t value            Pr(>|t|)
(Intercept)      5.475226   0.505005   10.84 <0.0000000000000002 ***
CompPrice        0.092571   0.004123   22.45 <0.0000000000000002 ***
Income           0.015785   0.001838    8.59 <0.0000000000000002 ***
Advertising      0.115903   0.007724   15.01 <0.0000000000000002 ***
Price           -0.095319   0.002670  -35.70 <0.0000000000000002 ***
ShelveLocGood    4.835675   0.152499   31.71 <0.0000000000000002 ***
ShelveLocMedium  1.951993   0.125375   15.57 <0.0000000000000002 ***
Age             -0.046128   0.003177  -14.52 <0.0000000000000002 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
The previous model gave an R squared of 87.34%, which means it explains 87.34% of the
variation.
The new model gives an R squared of 87.2%, which is similar, and all of its variables are
significant with p-values below 0.001. So this model is better than the previous model.
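One way to back up a comparison like this is to fit both models and look at R squared, adjusted R squared, and a nested F-test side by side. The sketch below uses synthetic data (x1, x2 and noise are hypothetical predictors, not Carseats columns) in which one predictor is pure noise.

```r
set.seed(4)
n <- 200
# Hypothetical predictors: x1 and x2 matter, noise does not.
x1 <- rnorm(n)
x2 <- rnorm(n)
noise <- rnorm(n)
y <- 1 + 2 * x1 - 1.5 * x2 + rnorm(n)

full    <- lm(y ~ x1 + x2 + noise)
reduced <- lm(y ~ x1 + x2)

r2_full     <- summary(full)$r.squared
r2_reduced  <- summary(reduced)$r.squared
adj_full    <- summary(full)$adj.r.squared
adj_reduced <- summary(reduced)$adj.r.squared

print(c(r2_full = r2_full, r2_reduced = r2_reduced,
        adj_full = adj_full, adj_reduced = adj_reduced))
print(anova(reduced, full))   # F-test: do the dropped terms add anything?
```

Plain R squared can never go up when predictors are removed, so a near-identical value with fewer terms is what makes the smaller model preferable.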
C)-
We tested the correlation among the predictor variables to see whether any pair of predictors
is highly correlated, which in turn lets us judge whether there is synergy among the
variables.
That is, whether the product (variable1*variable2) influences the model or not.
cor(subset(Carseats, select=-c(ShelveLoc,Urban,US)))
HERE WE CAN OBSERVE THAT THE CORRELATIONS BETWEEN VARIABLES DO NOT
EVEN EXCEED + OR - 0.5, SO WE CAN CONCLUDE THAT THERE IS NO SUCH SYNERGY
BETWEEN ANY TWO VARIABLES.
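Synergy between two variables can also be checked directly by adding their product as an interaction term and looking at its coefficient. A minimal sketch on synthetic data follows; a and b are hypothetical predictors generated independently, so their correlation is close to zero even though the interaction is strong.

```r
set.seed(5)
n <- 300
# Hypothetical predictors, generated independently of each other.
a <- rnorm(n)
b <- rnorm(n)
y <- 1 + a + b + 2 * a * b + rnorm(n)   # true model includes an a*b effect

fit <- lm(y ~ a * b)                    # a * b expands to a + b + a:b

r_ab    <- cor(a, b)                                        # pairwise correlation
inter_p <- summary(fit)$coefficients["a:b", "Pr(>|t|)"]     # test of the interaction

print(c(correlation = r_ab, interaction_p = inter_p))
```

Here the correlation is small while the interaction term is clearly significant, so the coefficient test is a useful complement to the correlation matrix.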
THE RESIDUALS VS FITTED PLOT SHOWS THAT THE MODEL IS HOMOSCEDASTIC
AND THAT THERE ARE ONLY TWO OUTLIERS IN THE DATASET, OBSERVATIONS 208 AND 358.
THERE IS NO SIGNIFICANT OUTLIER BEYOND THE RED DOTTED LINE OF THIS
LEVERAGE PLOT THAT COULD INFLUENCE THE MODEL AND THE PREDICTED VALUES.