
1. Cereal Data Factor Analysis

• Reading the data:


Cereal <- read.csv(file.choose())

• Exploring the data:


We summarize the data to check whether it matches our expectations.
summary(Cereal)

We notice values of ‘6’, which are not expected because the highest point on the rating scale is
‘5’. So we replace every ‘6’ with ‘5’.
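
A minimal sketch of this recoding step in base R. The `ratings` data frame, its column names and its values are hypothetical stand-ins for the survey data; the same indexing pattern should carry over to the rating columns of the real data.

```r
# Toy ratings data (hypothetical), standing in for the cereal survey
ratings <- data.frame(
  Brand   = c("A", "B", "C"),
  Filling = c(5, 6, 3),
  Natural = c(6, 2, 4)
)

# Columns holding the 1-5 ratings (everything except the brand name)
rating_cols <- setdiff(names(ratings), "Brand")

# Cap out-of-range values: every 6 becomes 5
ratings[rating_cols][ratings[rating_cols] == 6] <- 5

summary(ratings[rating_cols])
```

Restricting the replacement to the rating columns keeps the identifier column untouched.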

We then look at the summary of the data again to confirm the change:
summary(Cereal)


• Factor Analysis:

We first determine how many factors the factor analysis requires. We
use the eigenvalues of the correlation matrix and Parallel Analysis for this.

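library(nFactors)  # provides nScree() and plotnScree()

# Rating columns only (column 1 identifies the cereal)
C2 <- as.matrix(Cereal[, 2:26])

# Eigenvalues of the correlation matrix
ev <- eigen(cor(C2))
ev
nS <- nScree(x = ev$values)
plotnScree(nS)
The scree analysis indicates that the number of factors should be 5.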
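
To illustrate how eigenvalues translate into a factor count, here is a small base-R sketch of one common rule, the Kaiser criterion (retain eigenvalues above 1). The correlation matrix `R` is made up for the example; `nScree()` reports this criterion alongside several others, so the sketch covers only part of what it does.

```r
# Hypothetical 4-variable correlation matrix: two pairs of items,
# correlated 0.6 within a pair and 0 across pairs
R <- diag(4)
R[1, 2] <- R[2, 1] <- 0.6
R[3, 4] <- R[4, 3] <- 0.6

# eigen() returns the eigenvalues in decreasing order
ev_toy <- eigen(R)$values
print(round(ev_toy, 2))

# Kaiser criterion: keep as many factors as there are eigenvalues above 1
n_kaiser <- sum(ev_toy > 1)
print(n_kaiser)
```

Each correlated pair contributes one eigenvalue above 1, so the rule recovers the two underlying groups in this toy case.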

Let us now try Parallel Analysis:

library(psych)  # provides fa.parallel(), fa(), fa.diagram() and alpha()
numFactors <- fa.parallel(Cereal[, -1], fm = 'minres', fa = 'fa')

Parallel Analysis indicates that the number of factors = 4 and the number of
components = NA (components are not evaluated when fa = 'fa').
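
The idea behind Parallel Analysis can be sketched in base R: compare the eigenvalues of the observed correlation matrix with those of random data of the same size, and keep the leading ones that are larger. This is a simplified, principal-components version on simulated data (all names and numbers below are assumptions); `fa.parallel()` with `fa = 'fa'` works on factor eigenvalues and will not give identical numbers.

```r
set.seed(1)  # reproducible toy example

# Hypothetical data: 500 respondents, 6 items driven by 2 latent factors
n <- 500
f1 <- rnorm(n)
f2 <- rnorm(n)
noise <- function() rnorm(n, sd = 0.6)
toy <- cbind(0.8 * f1 + noise(), 0.8 * f1 + noise(), 0.8 * f1 + noise(),
             0.8 * f2 + noise(), 0.8 * f2 + noise(), 0.8 * f2 + noise())

# Eigenvalues observed in the toy data
obs_ev <- eigen(cor(toy))$values

# Eigenvalues of pure-noise data of the same shape, averaged over 50 draws
rand_ev <- rowMeans(replicate(50, {
  eigen(cor(matrix(rnorm(n * ncol(toy)), nrow = n)))$values
}))

# Retain the leading eigenvalues that exceed their random counterparts
n_retain <- sum(cumprod(obs_ev > rand_ev))
print(n_retain)
```

Using `cumprod()` stops the count at the first eigenvalue that fails the comparison, so only the leading run is retained.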
Let’s first consider a 4-factor model and check its goodness of fit.

#Goodness of fit:
#Tucker Lewis Index of factoring reliability = 0.89. >0.90 is acceptable, so the 4-factor model falls just short.
So let us choose nfactors=5 and see the results:
fit2 <- fa(Cereal[, -1], nfactors = 5, fm = "ml", rotate = "oblimin")
fit2

#Tucker Lewis Index of factoring reliability = 0.914. >0.90 is acceptable.


#RMSEA index = 0.059. <0.06 is excellent
#The root mean square of the residuals (RMSR) is 0.03. <0.06 is excellent.
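
For readers unfamiliar with these indices, the sketch below computes the Tucker Lewis Index and RMSEA from their textbook definitions. The helper names `tli()` and `rmsea()` and all input numbers are made up for illustration; `fa()` reports these indices itself, and its exact conventions may differ slightly from the formulas used here.

```r
# Tucker-Lewis Index from the chi-square and degrees of freedom of the
# null (no-factor) model and of the fitted factor model
tli <- function(chisq_null, df_null, chisq_model, df_model) {
  ratio_null  <- chisq_null / df_null
  ratio_model <- chisq_model / df_model
  (ratio_null - ratio_model) / (ratio_null - 1)
}

# RMSEA from the model chi-square, its degrees of freedom and sample size n
rmsea <- function(chisq_model, df_model, n) {
  sqrt(max(chisq_model - df_model, 0) / (df_model * (n - 1)))
}

# Made-up inputs, purely for illustration (not taken from the cereal fit)
tli_value   <- tli(chisq_null = 1000, df_null = 100,
                   chisq_model = 150, df_model = 50)
rmsea_value <- rmsea(chisq_model = 150, df_model = 50, n = 201)
print(c(TLI = tli_value, RMSEA = rmsea_value))
```

A TLI near 1 and an RMSEA near 0 both point to a model whose chi-square is small relative to its degrees of freedom.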

Therefore, although Parallel Analysis suggested 4 factors, the eigenvalue analysis and the fit
indices of the 5-factor model both support 5 factors, so we retain 5.
fa.diagram(fit2)

print(fit2$loadings,cutoff = 0.4)
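Next we group the items by the factor they load on and check each factor's reliability with Cronbach's alpha. As the alpha values are > 0.7, the factors are reliable.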

# Column indices of the items loading on each factor (cutoff 0.4)
factor1 <- c(3,4,14,19,23,26)
factor2 <- c(5,7,16,20,22)
factor3 <- c(11,13,15)
factor4 <- c(10,12,17,18,21,24,25)
factor5 <- c(2,8,9)

# Cronbach's alpha per factor; check.keys = TRUE reverses negatively keyed items
factor1alpha <- psych::alpha(Cereal[,factor1], check.keys = TRUE)
factor2alpha <- psych::alpha(Cereal[,factor2], check.keys = TRUE)
factor3alpha <- psych::alpha(Cereal[,factor3], check.keys = TRUE)
factor4alpha <- psych::alpha(Cereal[,factor4], check.keys = TRUE)
factor5alpha <- psych::alpha(Cereal[,factor5], check.keys = TRUE)
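
To show what the alpha statistic measures, here is a base-R sketch of the standard Cronbach's alpha formula, k/(k-1) * (1 - sum of item variances / variance of the total score). The function `cronbach_alpha()` and the toy ratings are hypothetical; the sketch does not reproduce the key reversal that `check.keys = TRUE` performs.

```r
# Cronbach's alpha for a data frame or matrix whose columns are items
cronbach_alpha <- function(items) {
  items <- as.matrix(items)
  k <- ncol(items)                      # number of items
  item_vars <- apply(items, 2, var)     # variance of each item
  total_var <- var(rowSums(items))      # variance of the summed scale
  (k / (k - 1)) * (1 - sum(item_vars) / total_var)
}

# Hypothetical ratings from 5 respondents on 3 related items
toy_items <- data.frame(
  item1 = c(1, 2, 3, 4, 5),
  item2 = c(2, 2, 3, 5, 5),
  item3 = c(1, 3, 3, 4, 4)
)
toy_alpha <- cronbach_alpha(toy_items)
print(round(toy_alpha, 3))
```

Items that move together make the variance of the total score large relative to the individual item variances, which pushes alpha towards 1.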

The factors can therefore be interpreted as:

# Factor1 = Health
# Natural, Fibre, Health, Regular, Quality, Nutritious

# Factor2 = Taste
# Sweet, Salt, Calories, Sugar, Process

# Factor3 = Family
# Kids, Economical, Family

# Factor4 = Texture
# Fun, Soggy, Plain, Crisp, Fruit, Treat, Boring

# Factor5 = Excitement
# Filling, Satisfying, Energy

# Raw alpha of each factor
factor1alpha$total$raw_alpha
factor2alpha$total$raw_alpha
factor3alpha$total$raw_alpha
factor4alpha$total$raw_alpha
factor5alpha$total$raw_alpha

Average Factor Scores grouped by the cereal

# Factor score per respondent = mean of the items loading on that factor
Cereal$factor1Score <- rowMeans(Cereal[,factor1])
Cereal$factor2Score <- rowMeans(Cereal[,factor2])
Cereal$factor3Score <- rowMeans(Cereal[,factor3])
Cereal$factor4Score <- rowMeans(Cereal[,factor4])
Cereal$factor5Score <- rowMeans(Cereal[,factor5])
colnames(Cereal)[27:31] <- c("Health", "Taste", "Family", "Texture", "Excitement")

# Average each factor score by cereal (column 1)
aggregateCereal <- aggregate(Cereal[,27:31], list(Cereal[,1]), mean)
format(aggregateCereal, digits = 2)
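
The scoring and grouping steps can be seen end to end on a tiny example. The data frame, the column names and the item-to-factor assignment below are all hypothetical; only the pattern (row means per factor, then `aggregate()` by cereal) mirrors the code above.

```r
# Hypothetical ratings: two cereals, two respondents each, three items
toy <- data.frame(
  Cereal  = c("A", "A", "B", "B"),
  Natural = c(5, 4, 2, 1),
  Fibre   = c(4, 4, 1, 2),
  Sweet   = c(1, 2, 5, 4)
)

# Assumed item-to-factor assignment for the toy data
health_items <- c("Natural", "Fibre")
taste_items  <- c("Sweet")

# Factor score per respondent = mean of the items on that factor
toy$Health <- rowMeans(toy[, health_items, drop = FALSE])
toy$Taste  <- rowMeans(toy[, taste_items, drop = FALSE])

# Average factor score per cereal
by_cereal <- aggregate(toy[, c("Health", "Taste")],
                       by = list(Cereal = toy$Cereal), FUN = mean)
print(by_cereal)
```

Selecting columns by name rather than by position keeps the code valid if columns are added or reordered.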
