Causation
Randomized, comparative experiments are intended to give good evidence that
differences in the treatments caused the observed differences in the response
....
Reducing Impact of Random Chance
No conclusion based on statistical analysis is 100% certain
Every individual in a study will be different
We need to use enough subjects to reduce chance variation that would create
differences due to randomization alone
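A minimal simulation sketch of the point above, assuming made-up responses drawn from a standard normal distribution with no treatment effect at all. The function name chance_difference_spread and the group sizes are illustrative choices, not part of the notes. It estimates how much the difference in group means varies from randomization alone, for small and large groups.

```python
import random
import statistics

def chance_difference_spread(n_per_group, n_repeats=500, seed=0):
    """Standard deviation of the difference in group means when no treatment effect exists."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_repeats):
        # Both groups come from the same distribution, so any difference is pure chance.
        group_a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        group_b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        diffs.append(statistics.mean(group_a) - statistics.mean(group_b))
    return statistics.stdev(diffs)

spread_small = chance_difference_spread(5)    # 5 subjects per group (hypothetical)
spread_large = chance_difference_spread(200)  # 200 subjects per group (hypothetical)
print(spread_small, spread_large)
```

With more subjects per group, the chance-only differences cluster more tightly around zero, so a real treatment effect is easier to distinguish from randomization noise.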
Fundamental Principles of Design
Control: limit the effects of lurking variables and uninteresting sources of variation
o Use two or more treatments
o There are many other techniques to control for this
Statistical Significance
We construct the range of effects that could plausibly be observed due to random chance alone
If the observed treatment effects fall outside this range, we say they are
statistically significant
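One way to build such a range is to reshuffle the group labels many times, which is a permutation approach and not necessarily the method used in the course. The sketch below assumes made-up response values, a 95% range, and a hypothetical helper named permutation_range.

```python
import random

def permutation_range(group_a, group_b, n_shuffles=2000, seed=0):
    """Approximate the middle 95% of mean differences produced by random assignment alone."""
    rng = random.Random(seed)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    diffs = []
    for _ in range(n_shuffles):
        # Reassign the same responses to two groups at random.
        rng.shuffle(pooled)
        a, b = pooled[:n_a], pooled[n_a:]
        diffs.append(sum(a) / len(a) - sum(b) / len(b))
    diffs.sort()
    low = diffs[int(0.025 * n_shuffles)]
    high = diffs[int(0.975 * n_shuffles) - 1]
    return low, high

# Made-up responses for illustration only.
treatment = [24, 27, 31, 29, 30, 28]
control = [20, 22, 19, 23, 21, 24]

observed = sum(treatment) / len(treatment) - sum(control) / len(control)
low, high = permutation_range(treatment, control)
significant = observed < low or observed > high
print(observed, (low, high), significant)
```

If the observed difference lies outside the range built from reshuffled labels, chance alone is an unlikely explanation for it.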
Matched Pairs Design
A completely randomized design is the simplest experimental design, just as a simple
random sample (SRS) is the simplest sampling design
In some cases, it makes sense to systematically group subjects based on similar, known
characteristics
Matched pairs design: compares exactly two treatments, either by using a pair of
individuals (that are closely matched) or by using each individual twice
Randomize the two treatments within each pair, or randomize the order of the treatments if the same individual receives both
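A short sketch of the randomization step for the paired-individuals case, assuming the pairs have already been matched. The subject IDs, treatment labels, and the function name assign_matched_pairs are hypothetical.

```python
import random

def assign_matched_pairs(pairs, treatments=("A", "B"), seed=0):
    """Within each matched pair, randomly decide which member gets which treatment."""
    rng = random.Random(seed)
    assignment = {}
    for first, second in pairs:
        order = list(treatments)
        rng.shuffle(order)  # coin flip for this pair only
        assignment[first] = order[0]
        assignment[second] = order[1]
    return assignment

# Hypothetical subject IDs, already matched on similar characteristics.
pairs = [("s1", "s2"), ("s3", "s4"), ("s5", "s6")]
assignment = assign_matched_pairs(pairs)
print(assignment)
```

When each individual is used twice, the same shuffle can decide the order in which that individual receives the two treatments.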
Block Designs
A matched pair is a special case of a block design
Block: a group of individuals that are known before the experiment to be similar in some way
that is expected to impact the effect of the treatments on the response variable
Block design: design in which random assignment . . .
Treatments and Blocks
The treatments are not randomly assigned to the blocks; they are assigned to the
individuals within each block
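A sketch of assigning treatments to individuals within each block, assuming blocks are given as lists of subject IDs. The block names, IDs, and the function name assign_within_blocks are made up for illustration.

```python
import random

def assign_within_blocks(blocks, treatments, seed=0):
    """Randomly assign treatments to individuals separately inside each block."""
    rng = random.Random(seed)
    assignment = {}
    for block_name, members in blocks.items():
        shuffled = list(members)
        rng.shuffle(shuffled)  # randomization happens inside the block
        for i, individual in enumerate(shuffled):
            # Cycle through treatments so each block is split as evenly as possible.
            assignment[individual] = treatments[i % len(treatments)]
    return assignment

# Hypothetical blocks formed from a known characteristic.
blocks = {
    "block_1": ["a1", "a2", "a3", "a4"],
    "block_2": ["b1", "b2", "b3", "b4"],
}
assignment = assign_within_blocks(blocks, ["treatment", "control"])
print(assignment)
```

Every block ends up containing both treatments, so treatment comparisons are made among similar individuals.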
Other Experimental Considerations
Blind study
Double-blind study
Replication
Applying the same treatment to multiple individuals gives you a better idea of the effect of that
treatment
Replication gives an idea of the variation associated with the treatment
The most convincing evidence for causation comes from replicating the design in
different locations and by independent investigators
9/9/16
Concerns with Experimentation
Lack of realism
o Controlling variation can limit conclusions
o Cannot generalize results to population
Ethics
o Informed consent is required
o Review board
o Data must be kept confidential
Mean
Most common metric of central tendency
Distribution is known
o How often you can expect to see one value
Median
Another measure of central tendency
Sample proportion
Proportion of sample observations in a category
5 number summary
Min, Q1, median, Q3, max
Boxplot
Standard deviation
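A small sketch of computing these summaries with Python's standard statistics module, using made-up data. Quartile conventions differ between textbooks and software, so Q1 and Q3 here follow the module's default method and may not match a hand calculation.

```python
import statistics

def summarize(values):
    """Mean, sample standard deviation, and five-number summary of a list of numbers."""
    ordered = sorted(values)
    # quantiles(n=4) returns the three cut points Q1, median, Q3.
    q1, q2, q3 = statistics.quantiles(ordered, n=4)
    return {
        "mean": statistics.mean(ordered),
        "stdev": statistics.stdev(ordered),
        "min": ordered[0],
        "q1": q1,
        "median": statistics.median(ordered),
        "q3": q3,
        "max": ordered[-1],
    }

data = [2, 4, 4, 5, 7, 9, 10, 12]  # made-up observations
summary = summarize(data)
print(summary)
```

A boxplot draws the five values from min to max, while the mean and standard deviation describe center and spread in a single number each.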
Overlapping Distributions
For quantitative variables, we looked at their distribution
Overlapping Histograms
Basically the same thing as multiple dot plots, but better for large data
Side-by-Side Boxplots
Using the previous methods is overwhelming for multiple groups and large data
Side-by-side boxplots are the most useful here
Quantitative vs Categorical
Measuring the relationship compares the distribution of the quantitative variable at
different values of the categorical variable
Mainly interested in comparing mean and variance
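A sketch of that comparison: split the quantitative values by category, then compute the mean and sample variance in each group. The records and the function name summarize_by_group are hypothetical.

```python
import statistics
from collections import defaultdict

def summarize_by_group(records):
    """Return {category: (mean, sample variance)} for (category, value) records."""
    groups = defaultdict(list)
    for category, value in records:
        groups[category].append(value)
    # statistics.variance needs at least two values per category.
    return {
        category: (statistics.mean(values), statistics.variance(values))
        for category, values in groups.items()
    }

# Made-up (category, value) observations.
records = [
    ("low", 4), ("low", 5), ("low", 6),
    ("high", 10), ("high", 11), ("high", 12),
]
by_group = summarize_by_group(records)
print(by_group)
```

Group means that differ by much more than the spread within each group suggest a relationship between the two variables.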
Explaining Variance
What if we ignored the categorical variable and estimated the mean and variance of the
quantitative variable?
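One way to explore this question is to compare the variation around the overall mean, which ignores the categorical variable, with the variation around each group's own mean. The sketch below uses made-up data and assumes the standard sum-of-squares split into between-group and within-group parts, which may differ from how the course develops the idea.

```python
def sum_of_squares(values, center):
    """Sum of squared deviations of values from a given center."""
    return sum((v - center) ** 2 for v in values)

# Made-up quantitative values grouped by a categorical variable.
groups = {"low": [4, 5, 6], "high": [10, 11, 12]}

# Ignore the categorical variable: pool everything around one overall mean.
all_values = [v for values in groups.values() for v in values]
grand_mean = sum(all_values) / len(all_values)
total_ss = sum_of_squares(all_values, grand_mean)

# Use the categorical variable: measure each value against its own group mean.
within_ss = sum(
    sum_of_squares(values, sum(values) / len(values)) for values in groups.values()
)

between_ss = total_ss - within_ss
share_explained = between_ss / total_ss
print(total_ss, within_ss, share_explained)
```

The closer share_explained is to 1, the more of the overall variation is accounted for by knowing the category.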