
Post Graduate Diploma in Management

(2019-2021)

Data Analytics Tools and Techniques

Answer 3

Submitted To:
Professor Ramprrasadh Goarty

Submitted by:

Name               Roll No.   SAP ID
Abhishek Pandey    A001       80203190120
Shital Ganeriwal   A042       80203190059
Surbhi Mundra      A048       80203190113

Business Requirement
A set of observations on human behaviours is given. The aim is to identify the top factors
underlying these behaviours.
Steps followed in R

• Load the data.
• Remove the first seven columns (demographics).
• Check whether any NA values are present.
• Scale the data.
• Check the correlations.
• Obtain the scree plot.
• Build several factor models.
• Choose the best model.
• Check which variable is best explained by the chosen model.
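The preprocessing steps above (dropping the demographic columns, checking for NA, scaling, and computing correlations) can be illustrated with a minimal sketch. The report's analysis was run in R; the Python below is only an illustration on a tiny made-up data set. The rows, the number of demographic columns, and all function names are hypothetical and are not taken from the report.

```python
import math

# Hypothetical toy rows: 2 demographic columns followed by 3 item scores.
# (The report drops 7 demographic columns; 2 are used here to keep it short.)
N_DEMOGRAPHIC = 2
rows = [
    [25, 1, 4, 2, 5],
    [31, 0, 3, 3, 4],
    [22, 1, 5, 1, 5],
    [40, 0, 2, 4, 2],
]


def drop_leading_columns(data, n):
    """Remove the first n columns from every row."""
    return [row[n:] for row in data]


def count_missing(data):
    """Count cells that are None (the stand-in for NA in this sketch)."""
    return sum(cell is None for row in data for cell in row)


def scale_columns(data):
    """Standardise each column to mean 0 and sample standard deviation 1."""
    scaled_cols = []
    for col in zip(*data):
        mean = sum(col) / len(col)
        sd = math.sqrt(sum((x - mean) ** 2 for x in col) / (len(col) - 1))
        scaled_cols.append([(x - mean) / sd for x in col])
    return [list(row) for row in zip(*scaled_cols)]


def correlation_matrix(data):
    """Pearson correlation between every pair of columns."""
    cols = list(zip(*data))
    n = len(cols[0])

    def corr(a, b):
        mean_a, mean_b = sum(a) / n, sum(b) / n
        cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
        var_a = sum((x - mean_a) ** 2 for x in a)
        var_b = sum((y - mean_b) ** 2 for y in b)
        return cov / math.sqrt(var_a * var_b)

    return [[corr(a, b) for b in cols] for a in cols]


items = drop_leading_columns(rows, N_DEMOGRAPHIC)
missing = count_missing(items)
scaled = scale_columns(items)
corr_matrix = correlation_matrix(scaled)

print("missing cells:", missing)
for row in corr_matrix:
    print([round(value, 2) for value in row])
```

Scaling does not change the Pearson correlations, so the correlation step gives the same matrix on raw or scaled data; scaling mainly puts all items on a comparable footing before the factor models are fitted.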

RESULTS in R (Output and Inferences)

➢ Correlation

➢ Scree plot

Inference
The curve becomes less steep after Component 3, so we start with a three-factor model.
The ideal model could be a six-factor model, since the curve is almost flat beyond that point,
but this can be confirmed only after running the models.

➢ Three-Factor Model

                   MR1     MR2     MR3
SS loadings       5.94    4.76    3.79
Proportion Var    0.12    0.10    0.08
Cumulative Var    0.12    0.21    0.29

Inference
• The cumulative variance of this model is 29%, i.e. the three-factor model explains 29% of
the variance in the original data.
• In the three-factor model, many features (O1-O9) do not load on any of the factors.
• Many features show cross-loading (highlighted in orange).
• Hence this model is not optimal.
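The "Proportion Var" and "Cumulative Var" rows follow directly from the SS loadings: each SS loading is divided by the number of variables, and the results are accumulated. The sketch below reproduces the three-factor figures in Python (the report itself used R). It assumes 50 variables, matching the 50 features listed in the final loadings table; the variable names are illustrative.

```python
from itertools import accumulate

# Assumption: 50 observed variables (10 items each for E, N, A, C and O).
N_VARIABLES = 50

# SS loadings reported for the three-factor model.
ss_loadings = {"MR1": 5.94, "MR2": 4.76, "MR3": 3.79}

# Share of total variance explained by each factor.
proportion_var = {factor: ss / N_VARIABLES for factor, ss in ss_loadings.items()}

# Running total of those shares, in the order the factors are listed.
cumulative_var = dict(zip(proportion_var, accumulate(proportion_var.values())))

for factor in ss_loadings:
    print(factor, round(proportion_var[factor], 2), round(cumulative_var[factor], 2))
```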

➢ Four-Factor Model

                   MR1     MR2     MR3     MR4
SS loadings       5.36    5.03    3.85    3.20
Proportion Var    0.11    0.10    0.08    0.06
Cumulative Var    0.11    0.21    0.28    0.35

Inference
• The cumulative variance increases to 35%, i.e. the four-factor model explains 35% of the
variance in the original data.
• In the four-factor model too, many features do not load on any of the factors.
• Cross-loading is still present for two features.
• Hence this model is not optimal.

➢ Five-Factor Model

                    MR1      MR2      MR3      MR5      MR4
SS loadings       4.986    4.611    3.766    3.272    3.173
Proportion Var    0.100    0.092    0.075    0.065    0.063
Cumulative Var    0.100    0.192    0.267    0.333    0.396

Inference
• The cumulative variance increases to 40%, i.e. the five-factor model explains 40% of the
variance in the original data.
• In the five-factor model, only 3 features (N4, A10, O9) do not load on any of the factors.
• There is no cross-loading.
• Hence this model can be accepted.
• We check the six-factor model next to see whether it gives a better cumulative variance.

➢ Six-Factor Model

                   MR1     MR2     MR3     MR5     MR4     MR6
SS loadings       5.01    4.60    3.77    3.29    3.20    1.00
Proportion Var    0.10    0.09    0.08    0.07    0.06    0.02
Cumulative Var    0.10    0.19    0.27    0.33    0.40    0.42

Inference
• In the six-factor model, 4 features (N4, A3, C3, O9) do not load on any of the factors.
• There is no cross-loading.
• The cumulative variance increases to 42%, i.e. the six-factor model explains 42% of the
variance in the original data.
• No features load on MR6, which means it is redundant. A gain of 2 percentage points in
cumulative variance does not justify adding a redundant factor, so the five-factor model
is chosen as the final model.
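The trade-off behind this choice can be checked by comparing the cumulative variance of the four candidate models and the gain from each extra factor. This is a small Python sketch using the SS loadings reported above (the report itself used R); it assumes 50 variables, matching the 50 features in the final loadings table, and the names are illustrative.

```python
# Assumption: 50 observed variables (10 items each for E, N, A, C and O).
N_VARIABLES = 50

# SS loadings as reported for each candidate model, keyed by number of factors.
ss_loadings = {
    3: [5.94, 4.76, 3.79],
    4: [5.36, 5.03, 3.85, 3.20],
    5: [4.986, 4.611, 3.766, 3.272, 3.173],
    6: [5.01, 4.60, 3.77, 3.29, 3.20, 1.00],
}


def cumulative_variance(ss, n_variables=N_VARIABLES):
    """Total share of variance explained by all factors of one model."""
    return sum(ss) / n_variables


cumulative = {k: cumulative_variance(ss) for k, ss in ss_loadings.items()}

# Gain in cumulative variance from adding one more factor.
gains = {k: cumulative[k] - cumulative[k - 1] for k in (4, 5, 6)}

for k in sorted(cumulative):
    gain = gains.get(k)
    gain_text = "" if gain is None else f" (gain {gain:.3f})"
    print(f"{k}-factor model: {cumulative[k]:.3f}{gain_text}")
```

The gain shrinks with each added factor, and the step from five to six factors adds only about 0.02, which is consistent with stopping at five.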

Final Model: Five-Factor Model with Cumulative Variance of 40%
The communality (h2) of every feature under the five-factor model is given in the table below.

Feature    MR1    MR2    MR3    MR5    MR4     h2     u2    com
A4        0.04   0.06   0.78   0.04   0.02   0.62   0.38    1.0
E5        0.73  -0.08   0.22   0.10   0.08   0.60   0.40    1.3
E3        0.65  -0.26   0.26   0.13  -0.01   0.57   0.43    1.7
E7        0.73  -0.10   0.16   0.05   0.03   0.57   0.43    1.2
N6       -0.06   0.74   0.03  -0.09  -0.09   0.57   0.43    1.1
N8       -0.02   0.73  -0.09  -0.16  -0.02   0.57   0.43    1.1
N9       -0.05   0.71  -0.18  -0.05  -0.03   0.54   0.46    1.1
E4       -0.70   0.15  -0.06  -0.02  -0.01   0.52   0.48    1.1
N7       -0.01   0.70  -0.08  -0.16  -0.01   0.52   0.48    1.1
A9        0.12   0.11   0.69   0.07   0.07   0.51   0.49    1.2
A7       -0.31   0.10  -0.63  -0.01  -0.05   0.50   0.50    1.5
N1       -0.11   0.69   0.06  -0.02  -0.07   0.49   0.51    1.1
E2       -0.68   0.01  -0.12   0.03  -0.04   0.48   0.52    1.1
O10       0.19  -0.02   0.03   0.05   0.66   0.48   0.52    1.2
N10      -0.25   0.61  -0.04  -0.17   0.05   0.47   0.53    1.5
E1        0.67  -0.06   0.06   0.01   0.04   0.46   0.54    1.0
E10      -0.64   0.19  -0.06  -0.02  -0.02   0.45   0.55    1.2
A5       -0.14   0.02  -0.66   0.00  -0.03   0.45   0.55    1.1
C4       -0.06   0.36  -0.05  -0.56   0.01   0.45   0.55    1.8
N3       -0.14   0.62   0.16   0.04  -0.01   0.43   0.57    1.3
O5        0.21  -0.06  -0.01   0.17   0.58   0.42   0.58    1.5
A2        0.34  -0.06   0.53   0.00   0.09   0.41   0.59    1.8
C5        0.09  -0.09   0.07   0.62  -0.08   0.41   0.59    1.1
E6       -0.56   0.09  -0.16  -0.03  -0.22   0.40   0.60    1.6
E9        0.62  -0.04  -0.01  -0.01   0.12   0.40   0.60    1.1
C9        0.06   0.02   0.09   0.63  -0.04   0.40   0.60    1.1
C1        0.04  -0.10   0.01   0.60   0.12   0.39   0.61    1.2
A6        0.00   0.15   0.59   0.04  -0.07   0.37   0.63    1.2
C6        0.00   0.17   0.00  -0.58   0.06   0.37   0.63    1.2
A8        0.13  -0.02   0.58   0.09   0.05   0.36   0.64    1.2
O1        0.04  -0.04  -0.03   0.05   0.59   0.36   0.64    1.0
O2       -0.02   0.21  -0.03   0.00  -0.56   0.36   0.64    1.3
E8       -0.56   0.04   0.03   0.06  -0.03   0.33   0.67    1.0
O8        0.00   0.08  -0.11  -0.05   0.55   0.33   0.67    1.1
N5       -0.05   0.54  -0.02  -0.11  -0.13   0.32   0.68    1.2
A10       0.34  -0.12   0.38   0.15   0.09   0.31   0.69    2.7
C2        0.05   0.11   0.04  -0.52   0.13   0.30   0.70    1.2
C7       -0.04   0.07   0.03   0.54   0.04   0.30   0.70    1.1
C8       -0.06   0.22  -0.16  -0.47  -0.04   0.30   0.70    1.7
O3        0.04   0.11   0.06  -0.08   0.53   0.30   0.70    1.2
O7        0.08  -0.13   0.00   0.20   0.49   0.30   0.70    1.5
C10       0.04  -0.01   0.06   0.47   0.23   0.28   0.72    1.5
A3        0.09   0.25  -0.40  -0.19   0.08   0.27   0.73    2.4
N2        0.11  -0.50   0.02  -0.03   0.07   0.26   0.74    1.2
O4        0.01   0.12  -0.11   0.06  -0.48   0.26   0.74    1.3
O6       -0.10   0.05  -0.08   0.03  -0.49   0.26   0.74    1.2
C3       -0.03   0.01   0.09   0.41   0.26   0.24   0.76    1.8
A1       -0.02   0.07  -0.43  -0.02  -0.09   0.20   0.80    1.1
O9       -0.14   0.17   0.18   0.05   0.34   0.20   0.80    2.5
N4        0.14  -0.32  -0.03   0.10  -0.05   0.14   0.86    1.7

Communality (h2)
Communality indicates how much of a feature's variance is explained by the underlying factors.
Here, A4 is the best-explained feature (62%) under the five-factor model, and N4 is the
least-explained (14%).
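The h2, u2 and com columns can be recomputed from a feature's loadings. The Python sketch below (the report itself used R) assumes the factors are uncorrelated, so that communality is simply the sum of squared loadings; the reported h2 values agree with that assumption to two decimals. Complexity is computed as Hofmann's index, (sum of squared loadings)^2 divided by the sum of fourth powers, which matches the reported "com" values for the rows checked here. The function names are illustrative.

```python
# Loadings (MR1, MR2, MR3, MR5, MR4) copied from the five-factor table.
loadings = {
    "A4": [0.04, 0.06, 0.78, 0.04, 0.02],
    "E5": [0.73, -0.08, 0.22, 0.10, 0.08],
    "N4": [0.14, -0.32, -0.03, 0.10, -0.05],
}


def communality(row):
    """h2: variance explained by the factors (assumes uncorrelated factors)."""
    return sum(a * a for a in row)


def uniqueness(row):
    """u2: variance left unexplained, i.e. 1 - h2."""
    return 1.0 - communality(row)


def complexity(row):
    """Hofmann's index: close to 1 when a single factor dominates the row."""
    squares = [a * a for a in row]
    return sum(squares) ** 2 / sum(s * s for s in squares)


for feature, row in loadings.items():
    print(
        feature,
        round(communality(row), 2),
        round(uniqueness(row), 2),
        round(complexity(row), 1),
    )
```

Because the inputs are loadings already rounded to two decimals, small differences from R's own output are possible for other rows.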

Features loaded in different factors

Factors   Features loaded
MR1       E1 - E10
MR2       N1 - N3 & N5 - N10
MR3       A1 - A9
MR5       C1 - C10
MR4       O1 - O8 & O10
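The grouping above can be reproduced by assigning each feature to every factor on which its absolute loading reaches a cutoff. The report does not state the cutoff; the Python sketch below assumes 0.4, which is consistent with N4, A10 and O9 being left out while A3 (loading -0.40) is included. Only a few rows of the table are used, and the names are illustrative.

```python
FACTORS = ["MR1", "MR2", "MR3", "MR5", "MR4"]
CUTOFF = 0.4  # assumption: not stated in the report, inferred from its groupings

# A few rows copied from the five-factor loadings table.
loadings = {
    "E5": [0.73, -0.08, 0.22, 0.10, 0.08],
    "N6": [-0.06, 0.74, 0.03, -0.09, -0.09],
    "A3": [0.09, 0.25, -0.40, -0.19, 0.08],
    "C4": [-0.06, 0.36, -0.05, -0.56, 0.01],
    "O9": [-0.14, 0.17, 0.18, 0.05, 0.34],
    "A10": [0.34, -0.12, 0.38, 0.15, 0.09],
    "N4": [0.14, -0.32, -0.03, 0.10, -0.05],
}


def loaded_factors(row, cutoff=CUTOFF):
    """Factors whose absolute loading is at least the cutoff."""
    return [factor for factor, a in zip(FACTORS, row) if abs(a) >= cutoff]


assignment = {feature: loaded_factors(row) for feature, row in loadings.items()}

# Features that load on no factor, and features that load on more than one.
unloaded = [feature for feature, factors in assignment.items() if not factors]
cross_loaded = [feature for feature, factors in assignment.items() if len(factors) > 1]

print("assignment:", assignment)
print("not loaded:", unloaded)
print("cross-loaded:", cross_loaded)
```

The same two lists (features with no factor, features with several) are the criteria used in the inferences for each candidate model.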

Business Interpretation
Aim: To classify individuals based on their characteristics or features.
Here, the characteristics can be grouped into five factors. Each factor is given a meaningful
name according to the characteristics that load on it.
Names of Different Factors
• MR1 can be named Personality because E1-E10 depict a person's personality: whether
he/she is an introvert or an extrovert, whether he/she is sociable or not, etc.
• MR2 can be named Mood Sensitivity because N1-N3 and N5-N10 describe how a person
reacts when disturbed or under stress, how his/her mood swings, whether he/she is
irritated or relaxed, etc.
• MR3 can be named Emotional Intelligence because A1-A9 cover characteristics that show
how a person relates to others: whether he/she is interested in others, whether he/she
sympathizes with or insults people, whether he/she is concerned about others, etc.
• MR5 can be named Attitude and Beliefs because C1-C10 reveal an individual's attitude
towards life, for example whether he/she is organized or messy, and whether he/she
follows a schedule or shirks duties.
• MR4 can be named Cognitive Ability because O1-O8 and O10 reflect mental ability:
understanding things, creativity, proficiency with words, etc.

Five Factors determined by the five-factor model

• Personality
• Mood Sensitivity
• Emotional Intelligence
• Attitude and Beliefs
• Cognitive Ability

Output of five-factor model

                         MR1     MR2     MR3     MR5     MR4
SS loadings             4.99    4.61    3.77    3.27    3.17
Proportion Var          0.10    0.09    0.08    0.07    0.06
Cumulative Var          0.10    0.19    0.27    0.33    0.40
Proportion Explained    0.25    0.23    0.19    0.17    0.16
Cumulative Proportion   0.25    0.48    0.67    0.84    1.00
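The "Proportion Explained" row differs from "Proportion Var": it divides each SS loading by the total of the SS loadings of the retained factors rather than by the number of variables, so it always accumulates to 1. A short Python sketch with the figures from the table (the report itself used R; the variable names are illustrative):

```python
from itertools import accumulate

# SS loadings reported for the final five-factor model.
ss_loadings = {"MR1": 4.99, "MR2": 4.61, "MR3": 3.77, "MR5": 3.27, "MR4": 3.17}

# Variance captured by all retained factors together.
total_ss = sum(ss_loadings.values())

# Each factor's share of the variance that the model explains.
proportion_explained = {f: ss / total_ss for f, ss in ss_loadings.items()}

# Running total of those shares; the last value is 1 by construction.
cumulative_proportion = dict(
    zip(proportion_explained, accumulate(proportion_explained.values()))
)

for factor in ss_loadings:
    print(
        factor,
        round(proportion_explained[factor], 2),
        round(cumulative_proportion[factor], 2),
    )
```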

Inference

• The behavioural pattern of a human being can be described by the five factors
identified by this model.
• Personality is given top priority because it has the largest sum of squared loadings.
All the factors have SS loadings greater than 1, which means they are relevant.
• The largest proportion of the variance is explained by MR1 (Personality).
• Hypothesis testing indicates that 5 factors are sufficient.
• Hence, this five-factor model can be used by the business to assess human
behaviour. Many recruitment processes can use this model to help decide
whether an individual is fit for a job.
