Yash Kuldeep Mehta Roll No: 248657
Practical No 3
Aim : Analysis of Variance.
1. Two random samples were drawn from two normal populations, and their values are as follows:
A B
66 64
67 66
75 74
76 78
82 82
84 85
88 87
90 92
92 93
95
97
Null Hypothesis: The variances are equal. H0: σ²(A) = σ²(B)
Alternative Hypothesis: The variances are not equal. H1: σ²(A) ≠ σ²(B)
Code:
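# Read the CSV chosen interactively; the first row holds the column names A and B.
pops <- read.csv(file.choose(), header = TRUE)
# F-test for equality of the two population variances.
var.test(pops$A, pops$B)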
Output:
Conclusion:
We fail to reject the null hypothesis. Therefore, there is no significant evidence that the variances of the two populations differ.
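The F statistic reported by var.test() is the ratio of the two sample variances, and the two-sided p-value is twice the smaller tail probability of the F distribution. A minimal sketch of that calculation follows. It types the data in directly instead of reading a CSV, and it assumes that the two unpaired trailing values (95 and 97) belong to sample B; the table as printed does not make this certain.

```r
# Assumption: the unpaired trailing values 95 and 97 belong to sample B.
A <- c(66, 67, 75, 76, 82, 84, 88, 90, 92)
B <- c(64, 66, 74, 78, 82, 85, 87, 92, 93, 95, 97)

# F statistic: ratio of the two sample variances.
f_stat <- var(A) / var(B)
df1 <- length(A) - 1  # numerator degrees of freedom
df2 <- length(B) - 1  # denominator degrees of freedom

# Two-sided p-value: twice the smaller of the two tail probabilities.
p_lower <- pf(f_stat, df1, df2)
p_manual <- 2 * min(p_lower, 1 - p_lower)

# The built-in test should agree with the manual calculation.
res <- var.test(A, B)
print(c(F = f_stat, df1 = df1, df2 = df2, p = p_manual))
print(res)
```

The null hypothesis is rejected at the 5% level only when the p-value falls below 0.05.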
Mulund College of Commerce Data Science
2. Perform F-test for the given data.
time_g1 time_g2
85 83
95 85
105 96
85 94
90 102
97 100
104 94
95 95
88 88
90 92
94 95
95 94
95
90
Null Hypothesis: The variances are equal. H0: σ²(time_g1) = σ²(time_g2)
Alternative Hypothesis: The variances are not equal. H1: σ²(time_g1) ≠ σ²(time_g2)
Code:
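# Read the CSV containing the columns time_g1 and time_g2.
pops <- read.csv(file.choose(), header = TRUE)
# F-test for equality of the two variances.
var.test(pops$time_g1, pops$time_g2)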
Output:
Conclusion:
We fail to reject the null hypothesis. Therefore, there is no significant evidence that the two variances differ.
3. A researcher studying education in the United Kingdom and Germany wanted to
compare how many years, on average, women in each country spent in school.
The researcher obtained a random sample from each country.
Test whether the average number of years spent in school by women in the two
countries is equal or not.
United Kingdom Germany
12.8 10.8
12.6 10.9
13.1 11.2
13.2 11.3
13.6 11.4
12.1 10.6
13.5 10.7
14 10.9
14.2 10.8
12.2 10.9
Null Hypothesis: The variances are equal. H0: σ²(United Kingdom) = σ²(Germany)
Alternative Hypothesis: The variances are not equal. H1: σ²(United Kingdom) ≠ σ²(Germany)
Code:
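# Read the CSV containing the columns UK and Germany.
pops <- read.csv(file.choose(), header = TRUE)
# F-test for equality of the two variances.
var.test(pops$UK, pops$Germany)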
Output:
Conclusion:
We reject the null hypothesis. Therefore, the two variances differ significantly.
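The question itself asks about the average number of years, which the F-test on variances does not address directly. Since the variances are found to differ, one possible follow-up is Welch's t-test, which does not assume equal variances. The sketch below is an illustration only and is not part of the original solution; the data are typed in from the table, and the vector names uk and germany are chosen here for convenience.

```r
# Years of schooling, typed in from the table above.
uk <- c(12.8, 12.6, 13.1, 13.2, 13.6, 12.1, 13.5, 14, 14.2, 12.2)
germany <- c(10.8, 10.9, 11.2, 11.3, 11.4, 10.6, 10.7, 10.9, 10.8, 10.9)

# Welch two-sample t-test: var.equal = FALSE drops the equal-variance assumption.
# H0: the two population means are equal; H1: they are not.
welch <- t.test(uk, germany, var.equal = FALSE)
print(welch)
```

Here var.equal = FALSE is also the default of t.test(); it is written out to make the choice explicit.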
4. A large company is assessing the difference in the ‘satisfaction index’ of employees in the finance, marketing and
client-servicing (CS) departments. Perform an analysis of variance (ANOVA) for the same.
FINANCE MARKETING CS
75 63 72
56 53 69
72 74 77
59 77 71
62 69 59
66 57 70
67 70 67
71 68 73
59 51 74
62 64 60
66 55 62
58 65
76
Null Hypothesis: The mean satisfaction index (SI) is equal across the departments.
H0: µ(Finance) = µ(Marketing) = µ(CS)
Alternative Hypothesis: The mean satisfaction index (SI) differs for at least one department.
H1: At least one of µ(Finance), µ(Marketing), µ(CS) differs from the others.
Code:
Output:
Conclusion:
We fail to reject the null hypothesis. Therefore, there is no significant evidence that the mean satisfaction index differs across departments.
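A one-way ANOVA for a layout like this one can be sketched as follows. This is a hedged illustration, not the code used for the result above. It assumes the data are typed in directly, and it uses only the eleven complete rows of the table, because the columns of the trailing partial rows cannot be determined from the table as printed. The names satisfaction, long, SI and Dept are chosen here for the example.

```r
# Assumption: only the eleven complete rows of the table are used.
satisfaction <- list(
  FINANCE   = c(75, 56, 72, 59, 62, 66, 67, 71, 59, 62, 66),
  MARKETING = c(63, 53, 74, 77, 69, 57, 70, 68, 51, 64, 55),
  CS        = c(72, 69, 77, 71, 59, 70, 67, 73, 74, 60, 62)
)

# Reshape to long format: one row per employee, with a department label.
long <- stack(satisfaction)
names(long) <- c("SI", "Dept")

# One-way ANOVA: does the mean SI differ between departments?
fit <- aov(SI ~ Dept, data = long)
print(summary(fit))

# Extract the p-value of the Dept effect from the ANOVA table.
p_value <- summary(fit)[[1]][["Pr(>F)"]][1]
print(p_value)
```

stack() also accepts vectors of unequal length, so the full columns can be substituted once their values are known.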
5. A large company is assessing the difference in the ‘satisfaction index’ of employees in the finance, marketing and
client-servicing departments. Experience level is also considered as a factor in the study
(lt5: less than 5, gt5: greater than 5). Perform an analysis of variance (ANOVA) for the same.
Satindex Dept Experience
75 FINANCE lt5
56 FINANCE lt5
72 FINANCE lt5
66 FINANCE gt5
58 FINANCE gt5
58 MARKETING lt5
63 MARKETING lt5
53 MARKETING lt5
64 MARKETING gt5
55 MARKETING gt5
72 CS lt5
69 CS lt5
60 CS gt5
62 CS gt5
65 CS gt5
• Load the data and visualize
• Model building
• Model testing
• Inference/Prediction
Null Hypothesis: The mean satisfaction index is the same across all departments. The mean satisfaction
index is the same across both experience levels. There is no interaction between department and experience level.
Alternative Hypothesis: The mean satisfaction index is not the same across all departments. The mean
satisfaction index is not the same across both experience levels. Department and experience level interact.
Code:
data1 <- read.csv(file.choose(), header = TRUE)
summary(data1)
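# Two-way ANOVA with interaction; Dept * Exp expands to Dept + Exp + Dept:Exp.
anv <- aov(SI ~ Dept * Exp, data = data1)
summary(anv)
# One box per Dept x Exp combination; the three colours are recycled across the boxes.
boxplot(SI ~ Dept * Exp, data = data1,
        main = "Satisfaction Index by Department and Experience Level",
        xlab = "Department and Experience Level",
        ylab = "Satisfaction Index",
        col = c("lightblue", "lightgreen", "lightcoral"),
        border = "black")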
Output:
Conclusion:
We fail to reject the null hypothesis. Therefore, there is no significant evidence that the satisfaction index differs
across departments or between the two experience levels.
6. A study stated that interest in politics is influenced by level of education and gender. Test
the hypothesis (ANOVA) for the given sample.
Gender Education Political_Int
F School 58
M College 60
M University 65
F PHD 50
F School 75
F School 70
M College 65
M School 78
M College 73
M University 65
F PHD 60
M PHD 70
F School 60
M College 65
Null Hypothesis: There is no significant difference in mean political interest between gender
groups or across education levels.
Alternative Hypothesis: There is a significant difference in mean political interest between
gender groups or across education levels.
Code:
data1 <- read.csv(file.choose(), header = TRUE)
summary(data1)
# Two-way ANOVA with interaction; Gender * Education expands to Gender + Education + Gender:Education.
anv <- aov(Polint ~ Gender * Education, data = data1)
summary(anv)
# One box per Gender x Education combination.
boxplot(Polint ~ Gender * Education, data = data1,
        main = "Boxplot of Political Interest by Gender and Education",
        xlab = "Gender and Education Level",
        ylab = "Political Interest",
        col = c("lightblue", "lightgreen", "lightpink", "lightyellow"),
        border = "black")
Output:
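With a sample this small, it can help to check how many observations fall into each Gender and Education combination before reading the interaction term. The sketch below types the sample in directly and assumes the column names Gender, Education and Polint used in the code above. It is an illustrative check, not part of the original solution.

```r
# Sample typed in from the table (14 respondents).
data1 <- data.frame(
  Gender    = c("F", "M", "M", "F", "F", "F", "M",
                "M", "M", "M", "F", "M", "F", "M"),
  Education = c("School", "College", "University", "PHD", "School", "School", "College",
                "School", "College", "University", "PHD", "PHD", "School", "College"),
  Polint    = c(58, 60, 65, 50, 75, 70, 65,
                78, 73, 65, 60, 70, 60, 65)
)

# Cross-tabulate the two factors to see the size of every cell.
cell_counts <- table(data1$Gender, data1$Education)
print(cell_counts)

# Fit the same two-way model; empty cells reduce the degrees of freedom
# available for the interaction term.
fit <- aov(Polint ~ Gender * Education, data = data1)
print(summary(fit))
```

In this sample no female respondent has College or University education, so the interaction cannot be estimated for those combinations and the results should be read with caution.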