
Assignment 4: Who busts the Mythbusters?

Raymond Guo
2020-02-12

Exercise 1
group is the explanatory variable and yawn is the response variable. "yes" is the value of the
response variable that is classified as a success.

Exercise 2
The difference in the proportion of people who yawned between the treatment and control groups (treatment minus control).
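To make this statistic concrete, here is a minimal base R sketch that computes a difference in proportions by hand. The toy data frame is made up for illustration; it is not the assignment's yawn data set.

```r
# Hypothetical toy data (NOT the assignment's yawn data set)
toy <- data.frame(
  group = c(rep("Treatment", 6), rep("Control", 4)),
  yawn  = c("yes", "yes", "no", "no", "no", "no",
            "yes", "no", "no", "no")
)

# Proportion of successes ("yes") within each group
prop_yes <- tapply(toy$yawn == "yes", toy$group, mean)

# Difference in proportions, Treatment minus Control
diff_in_props <- unname(prop_yes["Treatment"] - prop_yes["Control"])
diff_in_props
```

Here the treatment proportion is 2/6 and the control proportion is 1/4, so the statistic is their difference.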

Exercise 3

specify(yawn ~ group, success = "yes")

Exercise 4

hypothesize(null = "independence")

Exercise 5

generate(reps = 10000, type = "permute")

Exercise 6

calculate(stat = "diff in props", order = combine("Treatment", "Control"))

Exercise 7

yawn_null <- yawn %>%
  specify(formula = yawn ~ group, success = "yes") %>%
  hypothesize(null = "independence") %>%
  generate(reps = 10000, type = "permute") %>%
  calculate(stat = "diff in props", order = combine("Treatment", "Control"))
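As a rough illustration of what the permutation step does, the sketch below shuffles the group labels by hand in base R and recomputes the difference in proportions each time. The data, the seed, and the helper name diff_in_props are all made up for this example; this is not how the infer package is implemented internally.

```r
set.seed(101)  # arbitrary seed so the sketch is reproducible

# Hypothetical toy data (NOT the assignment's yawn data set)
group  <- c(rep("Treatment", 6), rep("Control", 4))
yawned <- c(TRUE, TRUE, FALSE, FALSE, FALSE, FALSE,
            TRUE, FALSE, FALSE, FALSE)

# Hypothetical helper: proportion yawning in Treatment minus Control
diff_in_props <- function(yawned, group) {
  mean(yawned[group == "Treatment"]) - mean(yawned[group == "Control"])
}

# Each replicate shuffles the group labels, which breaks any link
# between group and yawning, then recomputes the statistic
null_stats <- replicate(1000, diff_in_props(yawned, sample(group)))

length(null_stats)
mean(null_stats)
```

Because the shuffling makes every assignment of labels equally likely, the simulated statistics should average out close to zero.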

Exercise 8
i.
ggplot(data = yawn_null) +
  geom_histogram(
    mapping = aes(x = stat), binwidth = 0.05
  )

[Figure: histogram of the null distribution; x-axis: stat (about −0.50 to 0.25), y-axis: count (0 to 2500)]

ii.
ggplot(data = yawn_null) +
  geom_density(
    mapping = aes(x = stat),
    adjust = 0.05 / bw.nrd0(yawn_null$stat)
  )

[Figure: density plot of the null distribution; x-axis: stat (about −0.50 to 0.25), y-axis: density]

iii. Each distribution is centered at stat = 0.00. This makes sense: because the group labels are
randomized, one simulation can give person A minus person B while another gives the same two
people in reverse, person B minus person A. These opposite differences balance out, creating
an equilibrium around zero. The graph is not perfectly symmetric, but it is very close.

Exercise 9

ggplot(data = yawn_null) +
  geom_histogram(aes(x = stat), binwidth = 0.05) +
  geom_density(aes(x = stat), adjust = 0.05 / bw.nrd0(yawn_null$stat))

[Figure: histogram with a density layer added; x-axis: stat (about −0.50 to 0.25), y-axis: count (0 to 2500)]

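Only the histogram is visible in this plot. The density curve is still drawn, but its values are on the density scale, which is far smaller than the histogram's counts (up to about 2500), so the curve lies flat along the bottom of the plot.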
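The gap between the two scales can be checked with a small base R sketch. The vector x below is a made-up stand-in for yawn_null$stat (normal draws with an arbitrary spread); only the bin width of 0.05 comes from the code above.

```r
set.seed(101)  # arbitrary seed so the sketch is reproducible

# Hypothetical stand-in for yawn_null$stat (NOT the real null distribution)
x <- rnorm(10000, mean = 0, sd = 0.13)
binwidth <- 0.05

# Bin edges that cover the whole range of x in steps of the bin width
breaks <- seq(min(x) - binwidth, max(x) + binwidth, by = binwidth)
h <- hist(x, breaks = breaks, plot = FALSE)

max(h$counts)   # tallest bar on the count scale
max(h$density)  # the same bar on the density scale

# The two scales differ by a constant factor: count = density * n * binwidth
all.equal(as.numeric(h$counts), h$density * length(x) * binwidth)
```

With 10000 values and a bin width of 0.05, every count is 500 times the matching density, which is why a density curve nearly vanishes when drawn against a count axis.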

Exercise 10

ggplot(data = yawn_null) +
  geom_histogram(aes(x = stat, y = ..density..), binwidth = 0.05) +
  geom_density(aes(x = stat),
               adjust = 0.05 / bw.nrd0(yawn_null$stat))

[Figure: histogram on the density scale with the density curve overlaid; x-axis: stat (about −0.50 to 0.25), y-axis: density]
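The adjust argument used in the density layers multiplies the default bandwidth, so dividing 0.05 by bw.nrd0() should give a smoothing bandwidth equal to the histogram's bin width. The base R sketch below checks this with stats::density(); the vector x is a made-up stand-in for yawn_null$stat.

```r
set.seed(101)  # arbitrary seed so the sketch is reproducible

# Hypothetical stand-in for yawn_null$stat (NOT the real null distribution)
x <- rnorm(10000, sd = 0.13)

# Default rule-of-thumb bandwidth that density() would pick for x
default_bw <- bw.nrd0(x)

# adjust rescales the default bandwidth: bw used = adjust * default_bw
d <- density(x, adjust = 0.05 / default_bw)

d$bw  # bandwidth actually used by the estimate
```

The reported bandwidth matches the bin width of 0.05, so the curve and the histogram are smoothed at the same resolution.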

Exercise 11

yawn_obs_stat <- yawn %>%
  specify(formula = yawn ~ group, success = "yes") %>%
  calculate(stat = "diff in props", order = combine("Treatment", "Control"))

yawn_null %>%
get_p_value(obs_stat = yawn_obs_stat, direction = "right")

p_value
0.5143

yawn_null %>%
visualize() +
shade_p_value(obs_stat = yawn_obs_stat, direction = "right")

[Figure: "Simulation-Based Null Distribution"; x-axis: stat (about −0.50 to 0.25), y-axis: count (0 to 2500)]

The p-value is about 0.51 (0.5143 in the output above), which is greater than the significance level. This is not enough
evidence to reject the null hypothesis, so we fail to reject it.
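For intuition, a right-tailed p-value of this kind is the share of simulated statistics that are at least as large as the observed one. The sketch below computes it by hand in base R. Both the null statistics and the observed value are invented numbers, and this is only my understanding of the idea behind get_p_value(), not its actual code.

```r
# Hypothetical simulated null statistics (NOT the values in yawn_null)
null_stats <- c(-0.30, -0.10, -0.05, 0.00, 0.00,
                 0.05,  0.10,  0.10, 0.20, 0.35)

# Hypothetical observed statistic (NOT the value in yawn_obs_stat)
obs_stat <- 0.10

# Right-tailed p-value: proportion of null statistics >= the observed one
p_value <- mean(null_stats >= obs_stat)
p_value
```

Four of the ten invented values are at least 0.10, so this toy p-value is 0.4. A large p-value like this means the observed statistic is typical of what random shuffling alone produces.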
