You are on page 1of 4

Business Report

Project: Advanced Statistics
Student’s Name –Tanaya Lokhande

Contents
1 ANOVA test
1.1 STATE THE NULL AND THE ALTERNATE HYPOTHESIS FOR CONDUCTING ONE
WAY ANOVA FOR BOTH EDUCATION AND OCCUPATION
INDIVIDUALLY . .......................................................................................................4
 
1.2 PERFORM ONE WAY ANOVA FOR EDUCATION WITH RESPECT TO THE VARIABL ‘SALARY ’. STATE
WHETHERTHE NULL HYPOTHESIS IS ACCEPTED OR REJECTED BASED ON THE ANOVA RESULTS
..................................................................... 5
 
1.3 PERFORM ONE WAY ANOVA FOR VARIABLE OCCUPATION WITH RESPECT TO THE VARIABLE ‘SALARY ’.
STATE WHETHER THE NULL HYPOTHESIS IS ACCEPTED OR REJECTED BASED ON
THE ANOVA RESULTS. ............................... 5
 
1.4 IF THE NULL HYPOTHESIS IS REJECTED IN EITHER (1.2) OR IN(1.3),FIND OUT WHICH CLASS MEANS
ARESIGNIFICANTLY DIFFERENT
.
 
I
NTERPRET THE RESULT
. ............................................................................................... 6
 
1.5
 
 W 
HAT IS THE INTERACTION BETWEEN THE TWO TREATMENTS
?
 
 A 
NALYZE THE EFFECTS OF ONE VARIABLE ONTHE OTHER
(E
DUCATION AND
O
CCUPATION
)
 WITH THE HELP OF AN INTERACTION PLOT
. ....................................... 7
 
1.6
 
P
ERFORM A TWO
-
 WAY
 ANOVA 
 BASED ON THE
E
DUCATION AND
O
CCUPATION
(
 ALONG WITH THEIRINTERACTION
E
DUCATION
*O
CCUPATION
)
 WITH THE VARIABLE
‘S
 ALARY 
’.
 
S
TATE THE NULL AND ALTERNATIVEHYPOTHESES AND STATE YOUR RESULTS
.
 
H
OW WILL YOU INTERPRET THIS RESULT
?
 
 A 
NSWER 
: ................................ 9
 
1.7
 
E
 XPLAIN THE BUSINESS IMPLICATIONS OF PERFORMING
 ANOVA 
 FOR THIS PARTICULAR CASE STUDY 
 ......... 10
 
2
 
EDA AND PCA ............................................................................................. 11
 
2.1
 
P
ERFORM
E
 XPLORATORY
D
 ATA
 A 
NALYSIS
[
BOTH UNIVARIATE AND MULTIVARIATE ANALYSIS TO BEPERFORMED
]. ......................................................................................................................................................... 11
 
2.2
 
I
S SCALING NECESSARY FOR
PCA 
 IN THIS CASE
?
 
G
IVE JUSTIFICATION AND PERFORM SCALING
. .................... 18
 
2.3
 
C
OMMENT ON THE COMPARISON BETWEEN THE COVARIANCE AND THE CORRELATION MATRICES FROM
THISDATA 
.[
ON SCALED DATA 
] ....................................................................................................................................... 20
 
2.4
 
C
HECK THE DATASET FOR OUTLIERS BEFORE AND AFTER SCALING
.
 
 W 
HAT INSIGHT DO YOU DERIVE HERE
? ..21
 
2.5
 
P
ERFORM
PCA 
 AND EXPORT THE DATA OF THE
P
RINCIPAL
C
OMPONENT
(
EIGENVECTORS
)
 INTO A DATAFRAME WITH THE ORIGINAL FEATURES
 ................................................................................................................... 23
 
2.6
 
 W 
RITE DOWN THE EXPLICIT FORM OF THE FIRST
PC
 
(
IN TERMS OF THE EIGENVECTORS
.
 
U
SE VALUES WITHTWO PLACES OF DECIMALS ONLY 
).
 
 W 
RITE THE LINEAR EQUATION OF
PC
 IN TERMS OF EIGENVECTORS ANDCORRESPONDING FEATURES
. .................................................................................................................................. 27
 
2.7
 
C
ONSIDER THE CUMULATIVE VALUES OF THE EIGENVALUES
.
 
H
OW DOES IT HELP YOU TO DECIDE ON THEOPTIMUM NUMBER OF PRINCIPAL COMPONENTS
?
 
 W 
HAT DO THE EIGENVECTORS INDICATE
? ................................. 28
 
2.8
 
E
 XPLAIN THE BUSINESS IMPLICATION OF USING THE
P
RINCIPAL
C
OMPONENT
 A 
NALYSIS FOR THIS CASESTUDY 
.
 
H
OW MAY
PC
S HELP IN THE FURTHER ANALYSIS
?
 
[H
INT
:
 
 W 
RITE
I
NTERPRETATIONS OF THE
P
RINCIPAL
C
OMPONENTS
O
BTAINED
] .................................................................................................................................... 29

You might also like