Dataset description 1

Research dataset 1 (dataset1.csv) contains 50,000 electronic health records with the following
variables:

Variable        Description
id              ID
sm_status       Smoking status
sex             Sex
age             Age at baseline
cancer          Prevalent cancer at baseline
dementia        Prevalent dementia at baseline
diuretics       Use of diuretics at baseline
bmi             Body mass index (in kg/m²), measured at baseline
died            Died during follow-up
date_baseline   Date at baseline
date_end_fu     End of follow-up date

Research major goal 1

What is the relationship between BMI and mortality, after accounting for other measured risk
factors?

Research minor goals 1
• Describing how each variable is treated, and whether any continuous variables needed to be
categorised into clinically significant groups.
• Using an appropriate model to explore the relationship between BMI and mortality, with a
discussion of assumptions and why the model was chosen.
• Describing how missing data were treated, the assumptions made, and whether a sensitivity
analysis was conducted to determine robustness to these assumptions.
• An interpretation of the results addressing the research major goal.
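
As a starting point for the first minor goal, the sketch below shows one way BMI could be grouped and follow-up time derived from date_baseline and date_end_fu. It rests on assumptions the brief does not state: the WHO adult BMI cut-points are used as the "clinically significant groups", dates are assumed to be ISO-formatted strings (YYYY-MM-DD), and the function names bmi_category and follow_up_years are hypothetical.

```python
from datetime import date


def bmi_category(bmi):
    """Return a WHO adult BMI group, or None when BMI is missing.

    Using the WHO cut-points is an assumption; the brief does not
    prescribe specific groups.
    """
    if bmi is None:
        return None
    if bmi < 18.5:
        return "underweight"
    if bmi < 25.0:
        return "normal"
    if bmi < 30.0:
        return "overweight"
    return "obese"


def follow_up_years(date_baseline, date_end_fu):
    """Follow-up time in years between two ISO date strings (assumed format)."""
    start = date.fromisoformat(date_baseline)
    end = date.fromisoformat(date_end_fu)
    return (end - start).days / 365.25


# Illustrative values only, not taken from dataset1.csv.
print(bmi_category(27.3))
print(round(follow_up_years("2010-01-01", "2015-01-01"), 2))
```

Returning None for a missing BMI keeps missing values visible so they can be handled explicitly later, rather than being silently placed in a group.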

Dataset description 2
Research dataset 2 (dataset2.csv) contains 10,000 electronic health records of smokers with the
following variables:

Variable            Description
id                  ID
sex                 Sex
age                 Age at baseline
education           Level of education: values 1 (low) to 5 (high)
n_cigarettes        Number of cigarettes smoked per day
CVD                 Prevalent CVD at baseline
dementia            Prevalent dementia at baseline
diuretics           Use of diuretics at baseline
bmi                 Body mass index (in kg/m²), measured at baseline
smoking_cessation   Quit smoking between baseline and the end of follow-up
bmi_ch_percent      BMI change (%)

Research major goal 2

What is the effect of stopping smoking (smoking cessation) on BMI change (after 5 years)?

Research minor goals 2
• Descriptive presentation of the dataset
• Using an outcome regression model to estimate the relationship between stopping smoking
and BMI change:
o While adjusting for other variables
o While not adjusting for other variables
• Using inverse probability weighting to adjust for baseline confounders and to estimate the
relationship between stopping smoking and BMI change
• Using the g-formula to estimate the BMI change:
o If nobody stopped smoking
o If everyone stopped smoking
o Measure the average causal effect of stopping smoking on BMI change
• Comparing the findings between the different methods, discussing assumptions for
estimating the average causal effect between stopping smoking and BMI change, and
potential bias within the study
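
To make the comparison between methods concrete, here is a minimal sketch of the unadjusted difference, inverse probability weighting and the g-formula (standardisation) in their simplest nonparametric form. It assumes a single binary baseline confounder, whereas the real analysis would involve several confounders and fitted models. Each record is a tuple (confounder, smoking_cessation, bmi_ch_percent). The toy values are invented purely for illustration and are not from dataset2.csv, and the function names are hypothetical.

```python
from collections import defaultdict


def unadjusted_mean(rows, a):
    """Mean outcome among people whose observed treatment equals a."""
    ys = [y for _, t, y in rows if t == a]
    return sum(ys) / len(ys)


def g_formula_mean(rows, a):
    """Standardised mean outcome had everyone received treatment a.

    Computes sum over strata l of E[Y | A=a, L=l] * P(L=l).
    Needs at least one person with treatment a in every stratum (positivity).
    """
    n = len(rows)
    size = defaultdict(int)       # stratum sizes in the whole sample
    outcomes = defaultdict(list)  # outcomes of people with treatment a
    for l, t, y in rows:
        size[l] += 1
        if t == a:
            outcomes[l].append(y)
    return sum(size[l] / n * sum(outcomes[l]) / len(outcomes[l]) for l in size)


def ipw_mean(rows, a):
    """Weighted mean outcome with weights 1 / P(A=a | L=l)."""
    size = defaultdict(int)
    with_a = defaultdict(int)
    for l, t, _ in rows:
        size[l] += 1
        if t == a:
            with_a[l] += 1
    num = den = 0.0
    for l, t, y in rows:
        if t == a:
            w = size[l] / with_a[l]  # inverse of the estimated propensity
            num += w * y
            den += w
    return num / den


# Toy records: (confounder, smoking_cessation, bmi_ch_percent).
rows = [
    (0, 0, 1.0), (0, 0, 2.0), (0, 0, 3.0), (0, 1, 4.0),
    (1, 0, 2.0), (1, 1, 5.0), (1, 1, 6.0), (1, 1, 7.0),
]

for name, f in [("unadjusted", unadjusted_mean),
                ("g-formula", g_formula_mean),
                ("IPW", ipw_mean)]:
    print(name, f(rows, 1) - f(rows, 0))
```

With one discrete confounder and no modelling, the weighted and standardised estimates coincide, and both differ from the unadjusted contrast because quitting is more common in one stratum. With several confounders, the propensity and outcome models would be fitted by regression, and the two approaches can then disagree if either model is misspecified.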
