Karen Grace-Martin
What You’ll Learn Today:
• The Four Questions you must answer to choose an appropriate statistical
method
• How they come together to help you narrow it way down
• Our focus today is on choosing the right type of model or test, not details of
that model or test
• Where to get help with the process
The Problem
What Makes a Statistical Method Right?
1. The statistical test or model answers the research question or sets you up to do so in another analysis
2. The data meet all statistical assumptions of the test
Four Questions to Answer for Each Analysis
The Steps
Part 1: Define & Design | Part 2: Prepare & Explore | Part 3: Test & Refine | Part 4: Answer
1. Write out research questions in theoretical and operational terms
2. Design the study or define the design
3. Choose the variables and determine their measurement levels
4. Write an analysis plan
5. Estimate sample size
6. Collect, code, enter, and clean data
7. Run univariate and bivariate descriptives
8. Create new variables
9. Run an initial model
10. Refine predictors and check model fit
11. Check assumptions
12. Check and resolve data issues
13. Interpret results
14. Communicate results
An Example
I want to do the tests on a measure of Gestational Diabetes in conjunction with
Iron and/or Vitamin C supplementation, so:
Dependent Variable:
Gestational Diabetes (0 = no; 1 = yes)
Independent Variables:
Iron Supplementation (0 = no; 1 = yes)
Vitamin C Supplementation (0 = no; 1 = yes)
So from what I understand, I should be able to do a chi-squared test, since
they're all categorical variables? Is that correct, or did I miss something big?
Also, I need to validate those against a few confounding variables, namely
Age (continuous)
Body Mass Index (continuous)
Parity (continuous)
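As a rough sketch of what the proposed chi-squared test would do, the snippet below computes a Pearson chi-square statistic for a single 2x2 table. The counts are hypothetical, invented only for illustration, and the function name `chi_square_2x2` is an assumption. Note that the test takes one categorical predictor at a time and has no place for the continuous confounders listed in the question.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test (no continuity correction) for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # With 1 degree of freedom the upper-tail p-value is erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts (not from the source):
# rows = iron supplementation (no, yes), columns = gestational diabetes (no, yes).
counts = [[80, 20], [90, 10]]
stat, p_value = chi_square_2x2(counts)
print(f"chi-square = {stat:.3f}, p = {p_value:.3f}")
```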
Question 1. What is your Research Question?
Theoretical Question:
Do levels of Iron and Vitamin C affect the likelihood of gestational diabetes?
Operational Question:
Do women who have received Iron supplements, Vitamin C supplements, or both, during the first trimester of pregnancy have different likelihood of developing gestational diabetes at any point during the pregnancy, controlling for age, BMI, and parity, compared to those who received a standard prenatal vitamin?
Key Info:
1. Comparing four groups on a dependent variable
2. Control for covariates
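The "four groups" in the Key Info come from combining the two yes/no supplements. A minimal sketch, assuming the 0/1 coding from the example and assuming the no-supplement group corresponds to the standard prenatal vitamin mentioned in the operational question (the label strings are illustrative):

```python
from itertools import product

# 0 = no, 1 = yes, as coded in the example question.
iron_levels = (0, 1)
vitamin_c_levels = (0, 1)

# Illustrative labels; the wording is an assumption, not from the source.
labels = {
    (0, 0): "standard prenatal vitamin only",
    (0, 1): "vitamin C only",
    (1, 0): "iron only",
    (1, 1): "iron and vitamin C",
}

# Every combination of the two binary factors is one comparison group.
groups = [
    (iron, vitamin_c, labels[(iron, vitamin_c)])
    for iron, vitamin_c in product(iron_levels, vitamin_c_levels)
]

for iron, vitamin_c, label in groups:
    print(f"iron={iron}, vitamin_c={vitamin_c}: {label}")
```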
What We Get from the Operational Research Question
If the Research Question Contains: | Statistical Method Needs to be able to include:
Predicts; Relationship between; Affects | Regression modeling, usually
Controlling for; Confounding variables; Above and beyond | Control Variables
When …; In the presence of …; Moderate | Interactions
Group comparisons | Categorical Predictors
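The phrase-to-requirement mapping above can be sketched as a simple lookup. This is only an illustration: the function name `requirements_for`, the word stems, and the "compared to" cue for group comparisons are assumptions, and naive keyword matching is no substitute for reading the question carefully.

```python
import re

# Cue phrases (stems) from the slide, mapped to what the method must include.
# The cues for group comparisons are assumed; the slide names no exact phrases.
CUES = {
    "Regression modeling": ("predict", "relationship between", "affect"),
    "Control variables": ("controlling for", "confounding", "above and beyond"),
    "Interactions": ("when", "in the presence of", "moderat"),
    "Categorical predictors": ("compared to", "group"),
}


def requirements_for(question):
    """Return the requirements whose cue phrases appear in the question."""
    text = question.lower()
    found = []
    for requirement, cues in CUES.items():
        # Match a cue only at the start of a word, so "women" is not "when".
        if any(re.search(r"\b" + re.escape(cue), text) for cue in cues):
            found.append(requirement)
    return found


theoretical = (
    "Do levels of Iron and Vitamin C affect the likelihood of "
    "gestational diabetes?"
)
print(requirements_for(theoretical))
```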
What Makes Question 1 Important to Answer
• Not all research questions are testable
• This will directly affect the design, the variables, and the analysis
What Makes Question 1 Difficult to Answer
• Translating from theory to operation
• Knowing what tests are available helps
Question 2. What is the design?
Step 2: Design Elements
Step 2: Design Names are Not Helpful
What Makes Question 2 Important to Answer
• Some design decisions are very logical but make the analysis much more difficult
• Failing to account for design issues in the analysis will lead to inaccurate results
• The design affects which research questions you can test
What Makes Question 2 Difficult to Answer
• You need to consider logistical constraints now
• Different research questions can have different designs in the same study
• Names of designs aren’t helpful
• Design issues can get easily complicated
Step 2: Define The Design
Missing Info (from the example question):
1. Are iron and vitamin C conditions crossed?
- Assume Yes
2. Are patients nested within doctors or randomly sampled?
- Assume Yes (nested within doctors)
Question 3. Which variables will you use to answer
the research question and what is the scale of
measurement of each?
Dependent Variable Types
Independent Variable Types
1. Numerical 2. Categorical
Other Types of Terms
1. Interactions 2. Polynomials
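To make these two kinds of terms concrete, here is a minimal sketch that derives one of each from existing variables. The variable names follow the example question; the choice of an iron-by-vitamin-C interaction and a squared age term is purely illustrative, and the patient record is hypothetical.

```python
def add_model_terms(row):
    """Return a copy of the record with an interaction and a polynomial term."""
    expanded = dict(row)
    # Interaction: the product of two predictors (1 only when both are 1).
    expanded["iron_x_vitamin_c"] = row["iron"] * row["vitamin_c"]
    # Polynomial: a power of a numerical predictor (here, age squared).
    expanded["age_squared"] = row["age"] ** 2
    return expanded


# Hypothetical patient record, invented for illustration.
patient = {"iron": 1, "vitamin_c": 1, "age": 30, "bmi": 24.5, "parity": 1}
expanded = add_model_terms(patient)
print(expanded)
```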
What Makes Question 3 Important to Answer
• It has a direct impact on assumptions being met
• Huge impact on the difficulty of the statistical method chosen
What Makes Question 3 Difficult to Answer
• Data sets often contain (or could contain) multiple versions of the same variable
• Part of the analysis may be about creating variables
Step 3: The Variables Key Info:
Now, pulling these together and anticipating
later steps…
The Data Analysis Plan Will Usually Change
"To consult the statistician after an experiment is finished is often merely to ask him to conduct a post mortem examination. He can perhaps say what the experiment died of."
- Ronald Fisher
The Analysis Plan
Questions: | Statistical Method Needs to be able to include: | Indicates need for:
1. Research questions | Comparing groups on DV; Controlling for covariates | Some kind of ANCOVA or regression
2. Design | Crossed Factors | Include interaction
2. Design | Nesting of Individuals within Doctors | Mixed Model
3. Variables | Binary outcome; Two categorical predictors; Three continuous covariates | Logistic Regression
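Pulling the plan's three rows together can be sketched as a small helper that turns the answers into a list of needs and a model formula. Everything here is an assumption for illustration: the function name `build_plan`, the variable names, the R-style formula notation, and the choice of a random intercept per doctor to represent nesting. It covers only the binary-outcome case from the example.

```python
def build_plan(outcome, outcome_type, factors, covariates, crossed, cluster=None):
    """Combine the answers to the questions into needs and a formula string."""
    if outcome_type != "binary":
        raise ValueError("this sketch only covers the binary-outcome example")
    needs = ["logistic regression"]

    # Crossed factors call for an interaction ("*" means main effects plus
    # interaction in R-style formula notation).
    factor_term = (" * " if crossed else " + ").join(factors)
    if crossed:
        needs.append("interaction")

    terms = [factor_term, *covariates]

    # Nesting (patients within doctors) calls for a mixed model; a random
    # intercept per cluster is assumed here.
    if cluster is not None:
        needs.append("mixed model")
        terms.append(f"(1 | {cluster})")

    return {"needs": needs, "formula": f"{outcome} ~ " + " + ".join(terms)}


plan = build_plan(
    outcome="gdm",
    outcome_type="binary",
    factors=["iron", "vitamin_c"],
    covariates=["age", "bmi", "parity"],
    crossed=True,
    cluster="doctor",
)
print(plan["needs"])
print(plan["formula"])
```

The formula string is only a compact way to write down the plan; fitting such a model would require statistical software that supports mixed-effects logistic regression.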
Data Issues
7. Zero Inflation
3. Multicollinearity
Step 4: Data Issues
Potential Issues (in the example question):
4. Sample Size
5. Lack of Variation
To Review:
Steps:
1. Write out research questions in theoretical and operational terms
2. Design the study or define the design
3. Choose the variables and determine their level of measurement
4. Write an analysis plan
7. Run univariate and bivariate statistics
12. Check for and resolve data issues
Poll
Strategies to Make this Easier
Bonus Guide: