Basic Bio Statistics For Clinicians
Experimental design
Motivating example
Types of variables
Descriptive statistics
Population vs. sample
Confidence intervals
Hypothesis testing
Type I and II errors
Experimental Design
Controlled designs
  The experimenter has control over exposure or treatments
  Randomized clinical trials
Observational designs
  Cohort studies
  Case-control studies
Controlled Designs
Controls biases
Balances treatment arms
Process:
  Identify a cohort
  Measure exposure
  Follow for a long time
  See who gets disease
  Analyze to see if disease is associated with exposure
Pros
  Measurement is not biased and is usually precise
  Can estimate prevalence, associations, and relative risks
Cons
  Very expensive
  Even more expensive if the outcome of interest is rare
  Sometimes we don't know all of the exposures to measure
Observational Studies:
Why they leave us with questions
Confounders
Biases
  Self-selection
  Recall bias
  Survival bias
  Etc.
Motivating example
Study Endpoints
Time periods: intraoperative; post-surgery to 48 hours; >48 hours through day 8; total
Blood products: Allo; Auto; Allo + Auto; FFP; platelets; all products
continuous: blood pressure, cholesterol, quality of life, units of blood
categorical: blood type, transfused/not transfused, cured/not cured
time-to-event: time to death, time to progression, time to immune reconstitution, time to discharge(?)
What is skewness?
[Figure: histogram of allblood08 (x-axis) against density (y-axis)]
[Figure: histogram of allblood08 (x-axis) against density (y-axis), annotated with 2.42 units]
$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i = 2.42$ units
[Figure: histogram of allblood08 (x-axis) against density (y-axis)]
$s = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}(x_i - \bar{x})^2}$
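As a rough illustration of the sample mean and standard deviation, the sketch below computes both by hand (using the N - 1 divisor for the standard deviation) and checks them against Python's standard library. The values in `units` are made up for the example and are not the study data.

```python
import statistics

# Hypothetical units of blood for ten patients (illustrative only,
# NOT the study data).
units = [0, 1, 1, 2, 2, 3, 4, 6, 8, 13]

n = len(units)
mean = sum(units) / n                                      # x-bar = (1/N) * sum of x_i
variance = sum((x - mean) ** 2 for x in units) / (n - 1)   # N - 1 divisor
sd = variance ** 0.5                                       # s

# The standard library gives the same answers.
assert abs(mean - statistics.mean(units)) < 1e-12
assert abs(sd - statistics.stdev(units)) < 1e-12

median = statistics.median(units)
print(f"mean = {mean:.2f}, sd = {sd:.2f}, median = {median}")
```

In this made-up sample the mean sits above the median because a few large values pull it upward, which is what a right-skewed distribution looks like.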
Others
Range
Interquartile range
Mode
Skewness
Focus on binary:
$y = \begin{cases} 1 & \text{if units} > 5 \\ 0 & \text{if units} \le 5 \end{cases}$

          y |      Freq.     Percent
------------+-----------------------
          0 |        106       58.24
          1 |         76       41.76
------------+-----------------------
      Total |        182      100.00
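A minimal sketch of the same dichotomization and frequency table, assuming the raw data are a list of units per patient. The helper name `tabulate_binary` and the values in `units` are hypothetical and are not the study data.

```python
from collections import Counter

def tabulate_binary(units, cutoff=5):
    """Code y = 1 if units > cutoff, else 0, and count each level."""
    y = [1 if u > cutoff else 0 for u in units]
    return Counter(y)

# Hypothetical values (illustrative only, NOT the study data).
units = [0, 1, 1, 2, 2, 3, 4, 6, 8, 13]
counts = tabulate_binary(units)
total = sum(counts.values())

# Print frequency and percent for each level, then the total row.
for level in (0, 1):
    percent = 100 * counts[level] / total
    print(f"{level:>5} | {counts[level]:5d} {percent:8.2f}")
print(f"Total | {total:5d} {100:8.2f}")
```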
A key distinction:
Population versus Sample
[Figure: a sample drawn from the population of interest]
A parameter is a population characteristic
A statistic is a sample characteristic
Example: we calculate the sample mean to estimate the true population mean
Statistical Inference
Confidence Intervals
Standard Error
$se_{\bar{x}} = \frac{s}{\sqrt{N}}$
Confidence Intervals
$\bar{x} \pm 1.96 \, se_{\bar{x}}$
Or,
$\bar{x} \pm 1.96 \frac{s}{\sqrt{N}}$
where 1.96 is the multiplier
It is an interval that we are 95% sure contains the true population mean.
It provides a reasonable range for the true population parameter.
Example: EACA and placebo
Placebo: $\bar{x} = 2.81$, $s = 2.81$, $N = 91$
$2.81 \pm 1.96 \frac{2.81}{\sqrt{91}} = (2.23, 3.40)$
EACA: $\bar{x} = 2.04$, $s = 1.83$, $N = 91$
$2.04 \pm 1.96 \frac{1.83}{\sqrt{91}} = (1.67, 2.41)$
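The arithmetic in the EACA and placebo example can be sketched as follows. The function name `mean_ci` is hypothetical. Because the inputs are the rounded summary statistics quoted above, the last digit may differ slightly from the quoted intervals.

```python
import math

def mean_ci(xbar, s, n, multiplier=1.96):
    """Return (lower, upper) for xbar +/- multiplier * s / sqrt(n)."""
    se = s / math.sqrt(n)            # standard error of the mean
    half_width = multiplier * se
    return xbar - half_width, xbar + half_width

# Rounded summary statistics quoted in the example.
placebo = mean_ci(2.81, 2.81, 91)
eaca = mean_ci(2.04, 1.83, 91)

print("Placebo: (%.2f, %.2f)" % placebo)
print("EACA:    (%.2f, %.2f)" % eaca)
```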
Caveats
EACA example:
  Very skewed
  But, N=91 per group
  If N=25 instead, confidence interval would not be valid
Caveats
For N=20: multiplier = 2.09
For N=50: multiplier = 2.01
For N=100: multiplier = 1.98
For N=1000: multiplier = 1.96
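These multipliers look like 97.5th percentiles of the t-distribution. The sketch below reproduces them using only the standard library, assuming N - 1 degrees of freedom (an assumption; the slide does not state the degrees of freedom). The helper names are hypothetical, and in practice a statistics package would supply the quantile directly.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0, integrating the density on [0, x] with Simpson's rule."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_multiplier(n, level=0.95):
    """Two-sided multiplier, ASSUMING n - 1 degrees of freedom."""
    target = 1 - (1 - level) / 2     # 0.975 for a 95% interval
    lo, hi = 0.0, 20.0
    for _ in range(60):              # bisection on the CDF
        mid = (lo + hi) / 2
        if t_cdf(mid, n - 1) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for n in (20, 50, 100, 1000):
    print(f"N={n}: multiplier = {t_multiplier(n):.2f}")
```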
Confidence Intervals
We can make confidence intervals for any parameter
We just need:
  Sample estimate
  Standard error of estimate (a little theory)
Example: proportion
Width ALWAYS depends on sample size!!!
Placebo: $\hat{p} = \frac{46}{91} = 0.51$, 95% CI = (0.40, 0.61)
EACA: $\hat{p} = \frac{30}{91} = 0.33$, 95% CI = (0.23, 0.44)
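A minimal sketch of a confidence interval for a proportion, assuming the normal-approximation (Wald) standard error $\sqrt{\hat{p}(1-\hat{p})/N}$. The slides do not say which method produced the quoted intervals, so the results here may differ from them slightly. The function name `proportion_ci` is hypothetical.

```python
import math

def proportion_ci(events, n, multiplier=1.96):
    """Return (p_hat, lower, upper) using the Wald standard error (an assumption)."""
    p_hat = events / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)   # standard error of p_hat
    return p_hat, p_hat - multiplier * se, p_hat + multiplier * se

# Counts quoted in the example.
placebo = proportion_ci(46, 91)
eaca = proportion_ci(30, 91)
print("Placebo: p = %.2f, CI = (%.2f, %.2f)" % placebo)
print("EACA:    p = %.2f, CI = (%.2f, %.2f)" % eaca)

# Same proportion with ten times the sample size: the interval narrows.
bigger = proportion_ci(460, 910)
width_small = placebo[2] - placebo[1]
width_big = bigger[2] - bigger[1]
print(f"width at N=91: {width_small:.3f}, width at N=910: {width_big:.3f}")
```

The last two lines illustrate the point that the width depends on sample size: the estimate is unchanged, but the interval shrinks as N grows.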
Hypothesis Testing
H0: $\mu_1 = \mu_2$
H1: $\mu_1 \neq \mu_2$
In words:
  H0: the two population means are equal
  H1: the two population means differ
Null distribution
Two sample
One sample
Paired
Two-sample t-test
No mathematics here
Just know that the following are included:
T-distribution
Degrees of freedom
[Figure: density curves of t(2), t(5), t(25), and the Normal distribution]
T-distribution
Represents the null distribution
Observations in the bulk of the curve are things that would be common if the null were true
Extreme observations are rare under the null
[Figure: density curves of t(2), t(5), t(25), and the Normal distribution]
Two-sample t-test
t-statistic = 2.19
Total N=182
Use N to determine degrees of freedom
Relationship between N and degrees of freedom depends on type of t-test
[Figure: null distribution with the observed t-statistic marked at 2.19 and at -2.19; the area in each tail beyond these values is 0.015]
The p-value
The probability of observing a statistic as or more extreme than the one observed, if the null is true
Statistic is calculated based on the null distribution!
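For the two-sample t-test example (t-statistic = 2.19, total N = 182), the calculation can be sketched from the EACA and placebo summary statistics. Two assumptions are made here that the slides do not state: the degrees of freedom are taken as N - 2 = 180, and the tail area is found by numerically integrating the t density rather than by calling a statistics package. The helper names are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, integrating the density on [0, t] with Simpson's rule."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

# Rounded summary statistics from the EACA and placebo example.
mean_placebo, sd_placebo, n_placebo = 2.81, 2.81, 91
mean_eaca, sd_eaca, n_eaca = 2.04, 1.83, 91

# Standard error of the difference in means, then the t-statistic.
se_diff = math.sqrt(sd_placebo ** 2 / n_placebo + sd_eaca ** 2 / n_eaca)
t_stat = (mean_placebo - mean_eaca) / se_diff

df = n_placebo + n_eaca - 2          # ASSUMPTION: N - 2 degrees of freedom
one_tail = upper_tail(abs(t_stat), df)
p_value = 2 * one_tail               # as or more extreme in either direction

print(f"t = {t_stat:.2f}, one tail = {one_tail:.3f}, p-value = {p_value:.3f}")
```

Doubling the single-tail area reflects the two-sided alternative: values below -2.19 count as "as or more extreme" just as values above 2.19 do.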
Now what?
AD HOC cutoff
DEPENDS HEAVILY ON SAMPLE SIZE!
alpha = 0.10
alpha = 0.01
Phase of study
How many hypotheses you are testing
Type II error
QUESTIONS???
Contact me:
Elizabeth Garrett-Mayer
garrettm@musc.edu