Part 2 of my assignment for the Coursera Johns Hopkins University Statistical Inference course.

© All Rights Reserved


November 7, 2016

Load the necessary libraries

library(ggplot2)

library(datasets)

library(dplyr)

##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
##     filter, lag
## The following objects are masked from 'package:base':
##
##     intersect, setdiff, setequal, union

This is part 2 of the Statistical Inference course project. We will show the basics of inferential data analysis.

We will be analyzing the ToothGrowth data in the R datasets package.

Load the ToothGrowth data and perform some basic exploratory data analyses

data("ToothGrowth")

We will perform some basic exploratory data analyses, such as inspecting the dimensions of the data and plotting the observations.

str(ToothGrowth)

## 'data.frame':    60 obs. of  3 variables:
##  $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
##  $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
##  $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...

ggplot(ToothGrowth, aes(x = supp, y = len)) + geom_boxplot() +

labs(title="Boxplot of Tooth Length by Supplement Type",x="supplement type", y="tooth length")

[Figure: Boxplot of Tooth Length by Supplement Type. x-axis: supplement type (OJ, VC); y-axis: tooth length]

We see that the data has 60 observations of 3 variables: len, the length of the tooth; supp, the supplement type; and dose, the dose in mg/day. Looking at the boxplot, the median tooth length for vitamin C (VC) is lower than for orange juice (OJ), but the VC lengths are more variable.

We also observe that dose is stored as a numeric variable. Since we also want to compare tooth length across dose levels, we convert dose to a factor.

ToothGrowth$dose<-as.factor(ToothGrowth$dose)
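As a minimal sketch of what this conversion does, the following works on a copy of the data (the name `tg` is introduced here for illustration only), checks the resulting levels, and shows one way to recover the numeric doses should they be needed later:

```r
library(datasets)

# Work on a copy so the original data frame is untouched
# ('tg' is an illustrative name, not part of the original report)
tg <- ToothGrowth
tg$dose <- as.factor(tg$dose)

# The three distinct doses become the factor levels
levels(tg$dose)
table(tg$dose)

# as.numeric() on a factor returns the internal level codes (1, 2, 3),
# not the doses; go through as.character() to get the values back
dose_codes  <- as.numeric(tg$dose)
dose_values <- as.numeric(as.character(tg$dose))
```

The distinction between level codes and level labels matters only if the doses are later used as numbers, for example in a regression on dose.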

ggplot(ToothGrowth, aes(x = dose, y = len)) + geom_boxplot() +

labs(title="Boxplot of Tooth Length by Dose Type",x="dose type", y="tooth length")

[Figure: Boxplot of Tooth Length by Dose Type. x-axis: dose type; y-axis: tooth length]

We see in the boxplot that there is a difference between doses: as the dose increases, the tooth length also increases.

We can use the summary function in R to get basic statistics for tooth length, supp and dose.

summary(ToothGrowth)

##       len        supp     dose
##  Min.   : 4.20   OJ:30   0.5:20
##  1st Qu.:13.07   VC:30   1  :20
##  Median :19.25           2  :20
##  Mean   :18.81
##  3rd Qu.:25.27
##  Max.   :33.90

Overall, tooth length has a mean of 18.81 and ranges from 4.20 to 33.90. The supp variable is a two-level factor, while dose is a three-level factor. We again use the summary function, this time to summarize tooth length within each group (mirroring the boxplots above).

tapply(ToothGrowth$len, ToothGrowth$supp, summary)

## $OJ
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##    8.20   15.52   22.70   20.66   25.72   30.90
##
## $VC
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##    4.20   11.20   16.50   16.96   23.10   33.90

tapply(ToothGrowth$len, ToothGrowth$dose, summary)

## $`0.5`
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##   4.200   7.225   9.850  10.600  12.250  21.500
##
## $`1`
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##   13.60   16.25   19.25   19.74   23.38   27.30
##
## $`2`
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##   18.50   23.52   25.95   26.10   27.83   33.90

tapply(ToothGrowth$len, ToothGrowth$supp, var)

##       OJ       VC
## 43.63344 68.32723

tapply(ToothGrowth$len, ToothGrowth$dose, var)

##      0.5        1        2
## 20.24787 19.49608 14.24421

We see that for supp, VC is more variable than OJ. For dose, the variances at 0.5 and 1.0 are very close to each other, while the 2.0 dose has the least variability.
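The same group statistics can also be computed with dplyr, which is loaded at the top of the report. This is only a sketch of an alternative to the tapply calls; the object and column names (`supp_stats`, `dose_stats`, `mean_len`, `var_len`) are chosen here for illustration:

```r
library(datasets)
library(dplyr)

# Mean and variance of tooth length for each supplement type
supp_stats <- ToothGrowth %>%
  group_by(supp) %>%
  summarise(n = n(), mean_len = mean(len), var_len = var(len))

# The same statistics for each dose level
dose_stats <- ToothGrowth %>%
  group_by(dose) %>%
  summarise(n = n(), mean_len = mean(len), var_len = var(len))

print(supp_stats)
print(dose_stats)
```

Each call returns one row per group, so the group sizes, means and variances can be read side by side.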

Now, to check whether supp and dose have an effect on tooth length, we will use t-tests (ANOVA is also an option, but the project instructions are to use the tests discussed in the course) to compare tooth length by supp and dose.

We begin with testing the difference of the mean tooth length by supp.

t.test(ToothGrowth$len~ToothGrowth$supp, alternative="two.sided", var.equal=FALSE)

##
##  Welch Two Sample t-test
##
## data:  ToothGrowth$len by ToothGrowth$supp
## t = 1.9153, df = 55.309, p-value = 0.06063
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.1710156  7.5710156
## sample estimates:
## mean in group OJ mean in group VC
##         20.66333         16.96333
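To see where these numbers come from, the Welch statistic, its degrees of freedom, the p-value and the confidence interval can be recomputed by hand from the group means and variances. This is a sketch for illustration; the variable names (`oj`, `vc`, `se2_oj`, and so on) are introduced here and do not appear in the original analysis:

```r
library(datasets)

oj <- ToothGrowth$len[ToothGrowth$supp == "OJ"]
vc <- ToothGrowth$len[ToothGrowth$supp == "VC"]

# Squared standard error of each group mean (s^2 / n)
se2_oj <- var(oj) / length(oj)
se2_vc <- var(vc) / length(vc)
se_diff <- sqrt(se2_oj + se2_vc)

# Welch t statistic: difference in means over its standard error
t_stat <- (mean(oj) - mean(vc)) / se_diff

# Welch-Satterthwaite approximation to the degrees of freedom
df_welch <- (se2_oj + se2_vc)^2 /
  (se2_oj^2 / (length(oj) - 1) + se2_vc^2 / (length(vc) - 1))

# Two-sided p-value and 95% confidence interval for the mean difference
p_value <- 2 * pt(-abs(t_stat), df = df_welch)
ci <- (mean(oj) - mean(vc)) + c(-1, 1) * qt(0.975, df_welch) * se_diff

round(c(t = t_stat, df = df_welch, p = p_value), 4)
round(ci, 4)
```

Because the interval contains 0, it agrees with the p-value above being greater than 0.05.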

Now let's run t.test on each pair of dose levels: (0.5, 1.0), (0.5, 2.0) and (1.0, 2.0).

## first pair (0.5,1.0)

t.test(subset(ToothGrowth, dose==0.5)$len,subset(ToothGrowth, dose==1.0)$len,

alternative="two.sided", var.equal=FALSE)

##
##  Welch Two Sample t-test
##
## data:  subset(ToothGrowth, dose == 0.5)$len and subset(ToothGrowth, dose == 1)$len
## t = -6.4766, df = 37.986, p-value = 1.268e-07
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -11.983781  -6.276219
## sample estimates:
## mean of x mean of y
##    10.605    19.735

## second pair

t.test(subset(ToothGrowth, dose==0.5)$len,subset(ToothGrowth, dose==2.0)$len,

alternative="two.sided", var.equal=FALSE)

##
##  Welch Two Sample t-test
##
## data:  subset(ToothGrowth, dose == 0.5)$len and subset(ToothGrowth, dose == 2)$len
## t = -11.799, df = 36.883, p-value = 4.398e-14
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -18.15617 -12.83383
## sample estimates:
## mean of x mean of y
##    10.605    26.100

## third pair

t.test(subset(ToothGrowth, dose==1.0)$len,subset(ToothGrowth, dose==2.0)$len,

alternative="two.sided", var.equal=FALSE)

##
##  Welch Two Sample t-test
##
## data:  subset(ToothGrowth, dose == 1)$len and subset(ToothGrowth, dose == 2)$len
## t = -4.9005, df = 37.101, p-value = 1.906e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -8.996481 -3.733519
## sample estimates:
## mean of x mean of y
##    19.735    26.100

For dose, we apply the Bonferroni correction because we run more than one test. We reject the null hypothesis of no difference in mean tooth length between two doses if the p-value is less than alpha/m, where m = 3 is the number of tests. With the conventional significance level alpha = 0.05, the threshold is 0.05/3 = 0.0166667.

In using the t.test function (since n is small), we assume that the variances of the groups are not equal and use two-sided tests with alpha = 0.05.

1. Between supplement types OJ and VC, the p-value is 0.06, which is greater than alpha = 0.05, so we fail to reject the null hypothesis that the mean tooth length for OJ equals that for VC.

2. For dose, we reject the null hypotheses, since all three p-values are less than alpha/m = 0.0166667, and conclude that mean tooth length differs significantly between every pair of the three dose levels.
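The Bonferroni step can also be carried out in code. The sketch below is one possible way to do it and is not taken from the original report: it runs the three pairwise Welch tests with `pairwise.t.test` from the stats package (with `pool.sd = FALSE` to keep the unequal-variance assumption) and, separately, adjusts the three p-values reported above with `p.adjust`. The names `pw`, `raw_p` and `adj_p` are illustrative.

```r
library(datasets)

# Pairwise Welch t-tests between dose levels with Bonferroni-adjusted
# p-values; pool.sd = FALSE means each pair uses its own variances
pw <- pairwise.t.test(ToothGrowth$len, ToothGrowth$dose,
                      p.adjust.method = "bonferroni", pool.sd = FALSE)
pw$p.value

# The same adjustment applied to the p-values reported above:
# each is multiplied by m = 3 (and capped at 1)
raw_p <- c("0.5 vs 1" = 1.268e-07, "0.5 vs 2" = 4.398e-14, "1 vs 2" = 1.906e-05)
adj_p <- p.adjust(raw_p, method = "bonferroni")

# Comparing adjusted p-values with alpha is equivalent to comparing
# the raw p-values with alpha / m
adj_p < 0.05
```

Either route leads to the same decision rule as comparing each raw p-value with 0.0166667.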
