
Homework

Student Name
2023-01-25
Question No 1: R Question
Question No 1 (A): The “pchisq” function returns the p-value when the test statistic and the
degrees of freedom are given. Setting lower.tail = FALSE gives the upper-tail probability.
pchisq(24.67,5,ncp=0,lower.tail =FALSE)

## [1] 0.0001613338
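As a quick cross-check, the same upper-tail probability can be obtained as the complement of the lower-tail CDF. This is only a minimal sketch; the names `stat`, `dof`, `p_upper` and `p_compl` are introduced here for illustration.

```r
# Test statistic and degrees of freedom from Question No 1 (A)
stat <- 24.67
dof <- 5

# Upper-tail probability, as computed above
p_upper <- pchisq(stat, dof, lower.tail = FALSE)

# Same quantity written as the complement of the lower-tail CDF
p_compl <- 1 - pchisq(stat, dof)

print(p_upper)
print(all.equal(p_upper, p_compl))
```

For very small p-values the lower.tail = FALSE form is usually preferable, since the complement can lose precision.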

Question No 1 (B)
Null and Alternative Hypothesis
H0: Grades and goals are independent
H1: Grades and goals are dependent
A = matrix(
c(63, 31, 25, 88, 55, 33, 96, 55, 32),
nrow = 3,
ncol = 3,
byrow = TRUE
)
rownames(A) = c("4th", "5th", "6th")
colnames(A) = c("Grades", "Popular", "Sports")
cat("Cross tabulation of Grades and Goals:\n")

## Cross tabulation of Grades and Goals:

print(A)

##     Grades Popular Sports
## 4th     63      31     25
## 5th     88      55     33
## 6th     96      55     32

library(gmodels)
CrossTable(A, digits = 2, expected = TRUE, prop.r = FALSE, prop.c = TRUE,
prop.chisq = TRUE, sresid = TRUE, format = c("SPSS"), dnn = c("Goals", "Grades"))
##
## Cell Contents
## |-------------------------|
## | Count |
## | Expected Values |
## | Chi-square contribution |
## | Column Percent |
## | Total Percent |
## | Std Residual |
## |-------------------------|
##
## Total Observations in Table: 478
##
## | Grades
## Goals | Grades | Popular | Sports | Row Total |
## -------------|-----------|-----------|-----------|-----------|
## 4th | 63 | 31 | 25 | 119 |
## | 61.49 | 35.10 | 22.41 | |
## | 0.04 | 0.48 | 0.30 | |
## | 25.51% | 21.99% | 27.78% | |
## | 13.18% | 6.49% | 5.23% | |
## | 0.19 | -0.69 | 0.55 | |
## -------------|-----------|-----------|-----------|-----------|
## 5th | 88 | 55 | 33 | 176 |
## | 90.95 | 51.92 | 33.14 | |
## | 0.10 | 0.18 | 0.00 | |
## | 35.63% | 39.01% | 36.67% | |
## | 18.41% | 11.51% | 6.90% | |
## | -0.31 | 0.43 | -0.02 | |
## -------------|-----------|-----------|-----------|-----------|
## 6th | 96 | 55 | 32 | 183 |
## | 94.56 | 53.98 | 34.46 | |
## | 0.02 | 0.02 | 0.18 | |
## | 38.87% | 39.01% | 35.56% | |
## | 20.08% | 11.51% | 6.69% | |
## | 0.15 | 0.14 | -0.42 | |
## -------------|-----------|-----------|-----------|-----------|
## Column Total | 247 | 141 | 90 | 478 |
## | 51.67% | 29.50% | 18.83% | |
## -------------|-----------|-----------|-----------|-----------|
##
##
## Statistics for All Table Factors
##
##
## Pearson's Chi-squared test
## ------------------------------------------------------------
## Chi^2 = 1.312105 d.f. = 4 p = 0.8593185
##
##
##
## Minimum expected frequency: 22.40586
barplot(A,beside = T,legend=TRUE,col=c(2,4,6))

Fig 1: Multiple Bar chart of Grades and Goals
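The expected values and the Pearson statistic reported by CrossTable above can be reproduced by hand, which makes the test easier to follow. The sketch below assumes the same 3 x 3 table; the names `expected`, `chi_sq`, `dof` and `p_value` are illustrative and not part of the original solution.

```r
# Observed counts from Question No 1 (B)
A <- matrix(c(63, 31, 25, 88, 55, 33, 96, 55, 32),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("4th", "5th", "6th"),
                            c("Grades", "Popular", "Sports")))

# Expected count under independence:
# (row total * column total) / grand total
expected <- outer(rowSums(A), colSums(A)) / sum(A)

# Pearson chi-square statistic, degrees of freedom, upper-tail p-value
chi_sq <- sum((A - expected)^2 / expected)
dof <- (nrow(A) - 1) * (ncol(A) - 1)
p_value <- pchisq(chi_sq, dof, lower.tail = FALSE)

print(round(expected, 2))
print(c(chi_sq = chi_sq, dof = dof, p_value = p_value))

# Built-in equivalent from the stats package
print(chisq.test(A))
```

With a p-value of about 0.86, these data give no evidence against H0 at the usual significance levels.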


Question No 1(c): Part-I
Population generation of 0’s and 1’s
set.seed(100)
# Population of 10000 zeros and 15000 ones
x1 <- rep(0, 10000)
x2 <- rep(1, 15000)
population <- c(x1, x2)
# Draw a sample of size 100 with replacement
Sample_data <- sample(population, 100, replace = TRUE)
table(Sample_data)

## Sample_data
## 0 1
## 36 64

80% confidence interval


prop.test(36,100,alternative = "two.sided",conf.level = 0.8)

##
## 1-sample proportions test with continuity correction
##
## data: 36 out of 100, null probability 0.5
## X-squared = 7.29, df = 1, p-value = 0.006934
## alternative hypothesis: true p is not equal to 0.5
## 80 percent confidence interval:
## 0.2964714 0.4284175
## sample estimates:
## p
## 0.36
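We are 80% confident that the true proportion of 0’s lies in the range
0.2965 ≤ p ≤ 0.4284
The population generated above has a true proportion of 0’s of 10000/25000 = 0.4, which lies inside this interval.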
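For comparison, the textbook normal-approximation (Wald) interval can be computed directly. Note that prop.test reports a score (Wilson) interval with continuity correction, so its limits differ slightly from the ones below. This is a sketch only; `p_hat`, `se` and `wald` are illustrative names.

```r
# Sample proportion of 0's and sample size from the output above
p_hat <- 36 / 100
n <- 100

# Critical value for an 80% two-sided interval (10% in each tail)
z <- qnorm(0.90)

# Standard error of the sample proportion
se <- sqrt(p_hat * (1 - p_hat) / n)

# Wald interval: estimate plus or minus z times the standard error
wald <- c(lower = p_hat - z * se, upper = p_hat + z * se)
print(round(wald, 4))
```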
Question No 1 C: Part-II Sample size
We know that the sample size can be obtained as

$$n = \left(\frac{z_{\alpha/2}}{e}\right)^{2} p(1-p)$$

$$n = \left(\frac{1.282}{0.02}\right)^{2} (0.36 \times 0.64) \approx 946.67$$

Rounding up, n = 947.
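The same sample-size calculation can be sketched in R. The values 0.36, 0.02 and 1.282 are taken from the working above; the variable names are illustrative only.

```r
# Inputs from the working above
p_hat <- 0.36   # sample proportion of 0's
e <- 0.02       # margin of error
z <- 1.282      # rounded critical value for 80% confidence

# n = (z / e)^2 * p * (1 - p), rounded up to the next whole number
n_raw <- (z / e)^2 * p_hat * (1 - p_hat)
n <- ceiling(n_raw)
print(n_raw)
print(n)

# Same calculation with the unrounded critical value from qnorm()
n_exact <- ceiling((qnorm(0.90) / e)^2 * p_hat * (1 - p_hat))
print(n_exact)
```

The result is always rounded up, because rounding down would give a margin of error slightly larger than the target.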
Question No 6.32 (True False)
A) True: The mean of a chi-square distribution equals its degrees of freedom, so the mean increases as the degrees of freedom increase.
B) False: In this case we fail to reject the null hypothesis because the critical value is greater than
the calculated test statistic, which therefore falls in the non-rejection region.
C) False: The chi-square test is a one-tailed test, and we shade only the upper tail of the distribution.
D) True: As the degrees of freedom increase, the distribution becomes more symmetric and approaches a
normal shape.
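The claim in (A) can be checked numerically by integrating x times the chi-square density. This is a rough sketch; the helper `chisq_mean` and the chosen degrees of freedom are hypothetical and used only for illustration.

```r
# Numerical mean of a chi-square distribution with `dof` degrees of freedom
chisq_mean <- function(dof) {
  integrate(function(x) x * dchisq(x, dof), lower = 0, upper = Inf)$value
}

# A few illustrative degrees of freedom
dofs <- c(2, 5, 10, 30)
means <- sapply(dofs, chisq_mean)

# Each numerical mean should match its degrees of freedom
print(data.frame(dof = dofs, mean = round(means, 4)))
```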
Question No 6.36
B = matrix(
c(264, 299,351, 38, 55, 77, 16, 15, 22),
nrow = 3,
ncol = 3,

byrow = TRUE
)
rownames(B) = c("should", "Should No", "No Answer")
colnames(B) = c("Republican", "Democratic", "Independent")
cat("Cross tabulation of Answer and Party Affiliation:\n")

## Cross tabulation of Answer and Party Affiliation:

print(B)

##           Republican Democratic Independent
## should           264        299         351
## Should No         38         55          77
## No Answer         16         15          22

library(gmodels)
CrossTable(B, digits = 2, expected = TRUE, prop.r = FALSE, prop.c = TRUE,
prop.chisq = TRUE, sresid = TRUE, format = c("SPSS"), dnn = c("Answer", "Party Affiliation"))
##
## Cell Contents
## |-------------------------|
## | Count |
## | Expected Values |
## | Chi-square contribution |
## | Column Percent |
## | Total Percent |
## | Std Residual |
## |-------------------------|
##
## Total Observations in Table: 1137
##
## | Party Affiliation
## Answer | Republican | Democratic | Independent | Row Total |
## -------------|-------------|-------------|-------------|-------------|
## should | 264 | 299 | 351 | 914 |
## | 255.63 | 296.63 | 361.74 | |
## | 0.27 | 0.02 | 0.32 | |
## | 83.02% | 81.03% | 78.00% | |
## | 23.22% | 26.30% | 30.87% | |
## | 0.52 | 0.14 | -0.56 | |
## -------------|-------------|-------------|-------------|-------------|
## Should No | 38 | 55 | 77 | 170 |
## | 47.55 | 55.17 | 67.28 | |
## | 1.92 | 0.00 | 1.40 | |
## | 11.95% | 14.91% | 17.11% | |
## | 3.34% | 4.84% | 6.77% | |
## | -1.38 | -0.02 | 1.18 | |
## -------------|-------------|-------------|-------------|-------------|
## No Answer | 16 | 15 | 22 | 53 |
## | 14.82 | 17.20 | 20.98 | |
## | 0.09 | 0.28 | 0.05 | |
## | 5.03% | 4.07% | 4.89% | |
## | 1.41% | 1.32% | 1.93% | |
## | 0.31 | -0.53 | 0.22 | |
## -------------|-------------|-------------|-------------|-------------|
## Column Total | 318 | 369 | 450 | 1137 |
## | 27.97% | 32.45% | 39.58% | |
## -------------|-------------|-------------|-------------|-------------|
##
##
## Statistics for All Table Factors
##
##
## Pearson's Chi-squared test
## ------------------------------------------------------------
## Chi^2 = 4.357566 d.f. = 4 p = 0.3597721
##
##
##
## Minimum expected frequency: 14.82322
barplot(B,beside = T,legend=TRUE,col=c(2,4,6))

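a) Under independence, about 255.63 (≈ 256) Republicans are expected to support the use of full body scans.
b) Under independence, about 296.63 (≈ 297) Democrats are expected to support the use of full body scans.
c) Under independence, about 20.98 (≈ 21) Independents are expected not to answer.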
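The expected counts quoted in these answers come from the rule (row total × column total) / grand total, and they match the expected values in the CrossTable output above. A minimal sketch, assuming the same table of counts; `expected` is an illustrative name.

```r
# Observed counts from Question No 6.36
B <- matrix(c(264, 299, 351, 38, 55, 77, 16, 15, 22),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("should", "Should No", "No Answer"),
                            c("Republican", "Democratic", "Independent")))

# Expected count under independence for every cell
expected <- outer(rowSums(B), colSums(B)) / sum(B)
print(round(expected, 2))

# Cells used in the answers: row 1 = "should", row 3 = "No Answer"
print(round(c(republican_should = expected[1, 1],
              democratic_should = expected[1, 2],
              independent_no_answer = expected[3, 3]), 2))
```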
Question No 6.38
a) Null and alternative hypothesis
H0: Lymphatic filariasis disease and the treatment are independent
H1: Lymphatic filariasis disease and the treatment are associated.
b) Result
With the p-value approach, we have sufficient evidence to reject the null hypothesis at the 5% level of
significance because the p-value < 0.05. We therefore conclude that recovery from the disease is associated
with the treatment; the two variables are dependent.
Question No 6.46 (Diabetes and Unemployment)
# Named tab rather than c to avoid masking the base function c()
tab = matrix(
c(717, 147, 47057, 5708),
nrow = 2,
ncol = 2,
byrow = TRUE
)
rownames(tab) = c("Diabetes", "Not Diabetes")
colnames(tab) = c("Employed", "Unemployed")
cat("Cross tabulation of Employment and Disease:\n")

## Cross tabulation of Employment and Disease:

Question No 6.46 (Cross tabulation, two-way table)
print(tab)

##              Employed Unemployed
## Diabetes          717        147
## Not Diabetes    47057       5708

Question No 6.46 (B): Difference in the proportion of diabetes between the employed and the unemployed.
H0: P1 − P2 = 0
H1: P1 − P2 ≠ 0
prop.test(x=c(717,147),n=c(47774,5855),alternative = "two.sided")

##
## 2-sample test for equality of proportions with continuity correction
##
## data: c(717, 147) out of c(47774, 5855)
## X-squared = 32.923, df = 1, p-value = 9.59e-09
## alternative hypothesis: two.sided
## 95 percent confidence interval:
## -0.014347471 -0.005849695
## sample estimates:
## prop 1 prop 2
## 0.01500816 0.02510675

Question No 6.46 (C): With the p-value approach, we have sufficient evidence to reject the null
hypothesis in favour of the alternative (p-value = 9.59e-09 < 0.05). The difference in the proportion of
diabetes between the employed and unemployed groups is statistically significant.
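For a 2 x 2 table, the two-sample proportion test and the chi-square test of independence (both with continuity correction) give the same statistic. The sketch below illustrates this with the counts from the cross tabulation; the object names `tab`, `ct` and `pt` are assumptions made for the example.

```r
# Two-way table from Question No 6.46
tab <- matrix(c(717, 147, 47057, 5708),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("Diabetes", "Not Diabetes"),
                              c("Employed", "Unemployed")))

# Chi-square test of independence (Yates correction is the default for 2 x 2)
ct <- chisq.test(tab)

# Two-sample test of proportions: diabetes counts out of the column totals
pt <- prop.test(x = tab["Diabetes", ], n = colSums(tab))

print(ct$statistic)
print(pt$statistic)
print(all.equal(unname(ct$statistic), unname(pt$statistic)))
```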
