
Dambi Dollo University

Advanced Biostatistics (PUBH 5021)


Group Assignment

Instructions:
1. Attempt all the questions and write a detailed report of the results of your analysis
2. Submit your report to: tarikud@gmail.com
3. The report shall have a cover page that includes the full name and ID of each group
member
4. The report shall be formatted appropriately and submitted in PDF format
5. The authenticity of your report will be evaluated; thus, you should avoid copying your
report from other groups or friends
6. The data to be used for each question are attached, and you need to examine the data
carefully before advancing to the analysis and report writing
7. Submission deadline: the report must be submitted on or before the final exam
date.

Question 1

The “HypertensionData.sav” data set has records of 381 subjects who came to medical
clinics in 8 villages with a variety of complaints. Data on gender, age, systolic and
diastolic blood pressure, and village were collected. The aim of the analysis is to compare
systolic blood pressure by gender, age, and village (separately for each of these
variables).

1. Write a brief report on:

a. The type of test you will need to use to compare systolic blood pressure by
gender, village, and age
b. Why you chose the test, and
c. The result of your analysis
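
As a rough illustration of the kind of comparison asked for here, the sketch below computes a Welch t statistic for a two-group comparison (such as gender) and a one-way ANOVA F statistic for a comparison across several groups (such as village). It is only a sketch under stated assumptions: the helper names and all numbers are made up for illustration and are not taken from “HypertensionData.sav”, and the appropriate test still depends on checking the data (for example, normality and equal variances).

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA across two or more groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical systolic readings (mmHg), invented for this example only.
sbp_male = [120, 130, 125, 140, 135]
sbp_female = [115, 118, 122, 128, 121]
sbp_by_village = [[120, 125, 130], [135, 140, 138], [118, 122, 119]]

t_stat = welch_t(sbp_male, sbp_female)
f_stat = one_way_anova_f(sbp_by_village)
print(f"Welch t = {t_stat:.3f}, ANOVA F = {f_stat:.3f}")
```

The statistics would then be referred to the t and F distributions to obtain p-values; a statistical package reports these directly.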

Question 2

The “BirthWeightData.sav” data set has newborn and parental characteristics of 42
newborns. Use any relevant data summarization techniques (numerical summary
measures and graphs) to explore the relationship between the birth weight of a baby and
potential parental characteristics (any type of descriptive bivariate analysis can be
used).

1. Produce a report of at most one page on the exploratory analysis you have
made
2. Select potential predictors based on your exploratory analysis, fit a linear
regression model to the data, and describe the result
3. Use your model to produce the predicted birth weight and residual for the
newborn with id = 27
4. After fitting the model, check the assumptions of a linear regression model, such
as absence of multicollinearity, homoscedasticity, and normality of residuals, and
comment on the result
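
To make the terms "predicted value" and "residual" in item 3 concrete, here is a minimal sketch of a least-squares fit with a single predictor. The predictor (mother's height), the function name, and every number are hypothetical and are not taken from “BirthWeightData.sav”; a real analysis would likely use several predictors and a statistical package.

```python
from statistics import mean


def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical values, invented for this example only.
mother_height_cm = [150, 155, 160, 165, 170]
birth_weight_kg = [2.6, 3.0, 3.1, 3.4, 3.5]

intercept, slope = fit_simple_ols(mother_height_cm, birth_weight_kg)

# Predicted value = fitted line evaluated at the observed predictor.
predicted = [intercept + slope * h for h in mother_height_cm]
# Residual = observed value minus predicted value.
residuals = [obs - fit for obs, fit in zip(birth_weight_kg, predicted)]

for h, fit, res in zip(mother_height_cm, predicted, residuals):
    print(f"height={h} predicted={fit:.3f} residual={res:+.3f}")
```

Plotting the residuals against the predicted values is one common way to look for heteroscedasticity, and a histogram or Q-Q plot of the residuals is one way to assess normality.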

Question 3

The following questions shall be answered based on the “CVDData.sav” data. In this data
file, incidence of heart attack is the dependent variable. Use any kind of analysis that you
think best suits the data (from as simple as a chi-square test to regression analysis) to
answer the following questions.

1. Use regular health checkup, BMI, age, and exercise as predictors and describe
your findings (NB: both crude and adjusted coefficients shall be reported, and the
variable bmi should be included as a categorical variable).
2. In place of exercise, use intensity of exercise as a predictor and describe the result
of the revised model
3. Has the performance of the model improved? How did you make the judgment or
arrive at the conclusion?
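
Two quantities that often come up in this kind of analysis can be sketched in a few lines: a crude odds ratio with a Wald 95% confidence interval from a 2x2 table, and the Akaike information criterion (AIC) as one possible way to compare two fitted models. This is a sketch under stated assumptions: the counts and log-likelihoods below are invented, not results from “CVDData.sav”, and AIC is only one of several criteria that could be used for item 3.

```python
from math import exp, log, sqrt


def crude_odds_ratio(a, b, c, d):
    """Odds ratio and Wald 95% CI for a 2x2 table.

    a = exposed with event,   b = exposed without event,
    c = unexposed with event, d = unexposed without event.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - 1.96 * se_log_or)
    upper = exp(log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, lower, upper


def aic(log_likelihood, n_params):
    """Akaike information criterion; lower values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical counts: heart attack (yes/no) by exercise (no/yes).
or_, lo, hi = crude_odds_ratio(10, 20, 5, 40)
print(f"crude OR = {or_:.2f} (95% CI {lo:.2f} to {hi:.2f})")

# Hypothetical log-likelihoods and parameter counts for two fitted models.
aic_original = aic(log_likelihood=-210.4, n_params=6)
aic_revised = aic(log_likelihood=-205.1, n_params=7)
print("revised model preferred by AIC:", aic_revised < aic_original)
```

Adjusted odds ratios come from a multivariable logistic regression fitted in a statistical package; the crude value above corresponds to a model with a single binary predictor.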

Question 4

The second part of this section relies on the “WomenHealthData.dta” data. The data set
has 6000 records. The main variable of interest is the health status of women, measured
on an ordinal scale. Use this data file to carry out a descriptive analysis of all the
variables. As a follow-up, use a method of analysis appropriate for the main variable of
interest and interpret the findings.
