
● geom_jitter() draws a jittered dot plot. Jittering adds small, randomly generated numbers around 0 to the data to visually separate overlapping observations
● dataframe$variable_name extracts a variable from the data frame (data set)
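
A minimal sketch of both ideas, using a made-up data frame named `scores` (the data, column names, and jitter width are assumptions for illustration). It jitters the values by hand to show what the random offsets do, then draws the plot with geom_jitter() only if the ggplot2 package happens to be installed.

```r
# Hypothetical data frame: two groups with many tied values
set.seed(1)
scores <- data.frame(
  group = rep(c("A", "B"), each = 20),
  value = sample(1:5, 40, replace = TRUE)
)

# `$` extracts one variable (column) from the data frame as a vector
values <- scores$value

# Jittering by hand: add small random numbers centered on 0
jittered <- values + runif(length(values), min = -0.2, max = 0.2)

# geom_jitter() applies the same idea when it draws the points
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(scores, aes(x = group, y = value)) +
    geom_jitter(width = 0.2, height = 0)  # jitter horizontally only
  print(p)
}
```

Setting height = 0 keeps the plotted values exact and spreads the points only sideways, which is usually what you want when the y-axis carries the measurement.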
Paired t-test and Introduction to Design of Experiments

Hypothesis testing
● Two competing hypotheses: null hypothesis H0 and alternative hypothesis Ha
● We want evidence to reject the null hypothesis in favor of Ha
● The p-value measures the strength of the evidence in the data set against H0
○ If the p-value is greater than alpha (the significance level), we do not have enough evidence to reject H0
■ If it is smaller than alpha, H0 is rejected
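
The decision rule can be sketched with a paired t-test in R. The before/after measurements below are invented for illustration, and alpha = 0.05 is an assumed significance level, not one given in these notes.

```r
# Hypothetical paired measurements on the same 8 subjects
before <- c(72, 68, 75, 80, 66, 71, 77, 69)
after  <- c(70, 65, 74, 76, 66, 68, 75, 66)

alpha <- 0.05  # assumed significance level, chosen before seeing the data

# Paired t-test: H0 says the mean of the differences (after - before) is 0
result  <- t.test(after, before, paired = TRUE)
p_value <- result$p.value

# Compare the p-value with alpha to make the decision
if (p_value < alpha) {
  decision <- "reject H0"
} else {
  decision <- "do not reject H0"
}

print(p_value)
print(decision)
```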

Designed experiment: people or things are assigned to groups, with one group receiving the treatment
and the other serving as a control

Observational study: measure or study information without affecting the sample

Observational data - a collection of measurements on predictor and response variables as they naturally occur

● In observational studies, establishing causal connections between response and predictor variables is nearly impossible
● The best we can hope for is to establish associations between predictor variables
and response variables
● But even this can be difficult due to the uncontrolled nature of observational data
● This is because unmeasured “lurking” (confounding) variables may be the real cause of an
observed relationship between response Y and some predictor X
Ex: In a study of height vs. grades of children, children who were taller scored better on math.
This is because of the lurking variable age: older children are both taller and further along in math.
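
A small simulation can make the lurking-variable idea concrete. Everything here is simulated under assumed numbers (age range, growth rate, noise levels), not real data: age drives both height and math score, and height has no direct effect on score.

```r
set.seed(42)
n <- 500

# Simulated (not real) children: age drives both height and math score
age    <- runif(n, min = 6, max = 12)
height <- 80 + 6 * age + rnorm(n, sd = 5)  # cm; depends on age only
score  <- 10 * age + rnorm(n, sd = 8)      # depends on age only, not height

# Ignoring age, height and score look strongly associated
raw_cor <- cor(height, score)

# Once age is in the model, the height coefficient should be close to 0
fit <- lm(score ~ height + age)
height_coef <- unname(coef(fit)["height"])

print(raw_cor)
print(height_coef)
```

The association between height and score is real in the data, but it disappears once age is accounted for. In an actual observational study the lurking variable is often unmeasured, so this adjustment is not available.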

● In a designed experiment, the researcher has control over the settings of the predictor
variables
● Terminology
○ Factors - categorical predictor variables
○ Levels - the categories of a factor
○ Experimental unit (EU) - the physical entity that can be assigned, at random, to a
treatment
○ Treatments - what researchers administer to experimental units; each treatment is
a combination of levels of the factors
○ Response - the measurement taken from the experimental units

● Three characteristics of a well-designed experiment
○ Control
○ Randomization
○ Replication
● Control group
○ A baseline group that receives no treatment
● Placebo
○ Fake treatment
● Blinding
○ Not telling participants whether they are receiving a placebo
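
Randomization and replication can be sketched in a few lines of R. The unit IDs, group sizes, and seed below are assumptions for illustration only.

```r
set.seed(7)

# 12 hypothetical experimental units (e.g. subject IDs)
units <- paste0("EU", 1:12)

# Control: one group is a baseline (e.g. placebo) for comparison.
# Replication: each group contains 6 units, not just one.
labels <- rep(c("control", "treatment"), each = 6)

# Randomization: shuffle the labels so chance decides who gets what
design <- data.frame(unit = units, group = sample(labels))

print(design)
print(table(design$group))
```

Shuffling a fixed set of labels, rather than flipping a coin for each unit, guarantees equal group sizes while still leaving the assignment to chance.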

● When there is a single factor whose levels may only change between different
experimental units, we can analyze the effect of the factor on the response by using a
one-way analysis of variance (one-way ANOVA)
● One-way ANOVA compares the population means of more than two populations. Thus, it
generalizes the independent two-sample t-test
○ H0 : μ1 = μ2 = … = μk
○ Ha : at least one population mean differs from the others
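
A minimal one-way ANOVA sketch in R, testing whether all group means are equal. The data are simulated: the factor name `fertilizer`, its three levels, and the true means are assumptions chosen so that the groups clearly differ.

```r
set.seed(123)

# Simulated responses for one factor with three levels, 10 units each
fertilizer <- factor(rep(c("A", "B", "C"), each = 10))
true_means <- c(A = 20, B = 25, C = 30)
yield <- rnorm(30, mean = true_means[as.character(fertilizer)], sd = 2)

# One-way ANOVA: H0 says all level means are equal
fit <- aov(yield ~ fertilizer)
anova_table <- summary(fit)[[1]]
p_value <- anova_table[["Pr(>F)"]][1]  # p-value for the factor

print(anova_table)
print(p_value < 0.05)
```

A small p-value says only that the means are not all equal. It does not say which levels differ from which.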
