
Exam 1 Part 3 - McKinney

Joseph L. McKinney
6/18/2019

mydata <- read.csv("baseball.csv")


#head(mydata)

Part 1.1 Find the mean

FA_mean <- mean(mydata$cost)


FA_mean

## [1] 218.2917

Part 1.2 Find the median

FA_median <- median(mydata$cost)


FA_median

## [1] 220.5

Part 1.3 Find the standard deviation

FA_sd <- sd(mydata$cost)


FA_sd

## [1] 32.8825

Part 1.4 Calculate skewness and kurtosis

# hint: You can find skewness & Kurtosis using "e1071" R library. Although you need to know what is skew
# install.packages("e1071") # if you need to installation, you must uncomment this (delete the pound # s
library(e1071)

skewness(mydata$cost)

## [1] -0.1094914

kurtosis(mydata$cost)

## [1] -0.7752946
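To show what these two numbers measure, here is a minimal base R sketch of moment-based skewness and excess kurtosis. The function names and the `costs` vector are hypothetical (they are not from the exam data), and the formulas follow what I understand to be the default `type = 3` definitions in e1071: the third and fourth central moments divided by powers of the sample standard deviation.

```r
# Sketch only: moment-based skewness and excess kurtosis in base R.
# Assumption: this mirrors e1071's default (type = 3) definitions.

moment_skewness <- function(x) {
  x <- x[!is.na(x)]
  n <- length(x)
  m3 <- sum((x - mean(x))^3) / n   # third central moment
  m3 / sd(x)^3                     # scaled by the sample sd cubed
}

moment_excess_kurtosis <- function(x) {
  x <- x[!is.na(x)]
  n <- length(x)
  m4 <- sum((x - mean(x))^4) / n   # fourth central moment
  m4 / sd(x)^4 - 3                 # subtract 3 so a normal distribution gives about 0
}

# Hypothetical values, used only to exercise the functions
costs <- c(180, 200, 215, 221, 230, 260)
moment_skewness(costs)
moment_excess_kurtosis(costs)
```

A negative skewness, as found above, means the left tail is slightly longer than the right. A negative excess kurtosis means the tails are lighter than those of a normal distribution.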

Part 2) Roll a fair die 1000 times by simulation and estimate the mean and standard deviation

# Hint: you can use sample() function in R to create 1000 instances of random dice numbers.
set.seed(123)

n = 1000

X1 = sample(1:6, n, replace = TRUE)
#Now you have 1000 instances of rolling a die

sumY <- cumsum(X1)

cumMean <- numeric(n)

for (i in 1:n) {
  cumMean[i] <- sumY[i] / i
}
#plot(cumMean, type = "l")

Estimate_mean <- cumMean[n]


Estimate_mean

## [1] 3.457

Estimate the standard deviation of the outcome using the standard deviation of the sample

sd_x = sd(X1)
sd_x

## [1] 1.71204
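As a check on the simulation, the exact mean and standard deviation of one roll of a fair die can be computed directly from its probability distribution. This is a small sketch; the variable names are my own.

```r
# Exact mean and standard deviation of a single roll of a fair die
faces <- 1:6
prob  <- rep(1 / 6, 6)                         # each face is equally likely

theo_mean <- sum(faces * prob)                 # expected value E[X]
theo_var  <- sum((faces - theo_mean)^2 * prob) # population variance, equal to 35/12
theo_sd   <- sqrt(theo_var)

theo_mean
theo_sd
```

The exact values are 3.5 and sqrt(35/12), about 1.708, so the simulated estimates of 3.457 and 1.71204 are both close to them.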

Part 3) Flip a biased coin 1,000 times by simulation in R (using x = rbinom(1000, 1, p=0.6)). X is the
number of heads we get.

x4 = rbinom(1000, 1, p=0.6)

a) Plot the cumulative mean of X.

n = 1000
sumX <- cumsum(x4)
cumxMean <- numeric(n)

for (i in 1:n) {
  cumxMean[i] <- sumX[i] / i
}

plot(cumxMean, type = "l")

[Figure: line plot of cumxMean (y-axis, about 0.5 to 1.0) against Index (x-axis, 0 to 1000)]
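The loop used for the cumulative mean can also be written in vectorized form, and a horizontal reference line at the true probability makes the convergence easier to see. This is a sketch under my own assumptions: the seed and the names `flips` and `running_mean` are hypothetical, so the curve will not match the plot above exactly.

```r
# Sketch: vectorized cumulative mean of simulated biased coin flips
set.seed(1)                                   # hypothetical seed, for reproducibility only
flips <- rbinom(1000, 1, prob = 0.6)          # 1 = heads, 0 = tails

# Running proportion of heads after each flip (same result as the for loop)
running_mean <- cumsum(flips) / seq_along(flips)

plot(running_mean, type = "l",
     xlab = "Index", ylab = "Cumulative mean")
abline(h = 0.6, lty = 2)                      # true probability of heads
```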

b) Change the simulation size to 100 and repeat the process.

x4 = rbinom(100, 1, p=0.6)

n = 100
sumX <- cumsum(x4)
cumxMean <- numeric(n)

for (i in 1:n) {
  cumxMean[i] <- sumX[i] / i
}

plot(cumxMean, type = "l")

[Figure: line plot of cumxMean (y-axis, about 0.0 to 0.6) against Index (x-axis, 0 to 100)]

c) What do you conclude after performing parts a and b?

With the smaller sample size (n = 100), the cumulative mean fluctuates a lot and does not settle clearly.
I conclude that a larger sample size allows for better convergence.
This illustrates the LLN (law of large numbers): in the plot for the larger sample size, the
cumulative mean converges to 0.6.
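The difference between the two sample sizes can also be quantified. For independent flips with probability p of heads, the standard error of the sample mean is sqrt(p(1 - p)/n). The sketch below evaluates it for both sizes; the variable names are my own.

```r
# Standard error of the sample proportion of heads for each simulation size
p        <- 0.6
n_values <- c(100, 1000)

se <- sqrt(p * (1 - p) / n_values)
names(se) <- paste0("n=", n_values)
se

# Ratio of the two standard errors: 10 times the flips shrinks the error by sqrt(10)
se[["n=100"]] / se[["n=1000"]]
```

With n = 100 the standard error is about 0.049, and with n = 1000 it is about 0.015, which agrees with the smoother, tighter curve in part a.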
