Homework 8

Problem 1 (20%)

A research study of deaths caused by lung cancer in female smokers found the

following for 8 five-year periods:

CC   -0.26   -0.03    0.3     0.37    0.4     0.5     0.55    0.55
DD   -2.35   -2.2    -2.12   -1.95   -1.85   -1.8    -1.7    -1.58

In the table above, CC is the log10 of the annual cigarette consumption (lb/person) and DD

is the log10 of the mortality over each 5-year period. Consider a linear model of DD as a function of CC.

(a) Calculate the parameters for the linear regression model (slope and intercept).

(b) Plot the data, the mean, and the regression line. Indicate in the plot the sample

variance explained by the model and the residuals (manual annotation OK).

(c) Calculate the sums of squares (SS_TOTAL, SS_REGRESSION, SS_RESIDUAL) and R^2. Are these

values consistent with what you see in the plot? Why?

(d) Plot the residuals on a real line. Do they look normally distributed?

(e) Assess the significance of the regression line (is the slope different from 0?) using the F test (check Rosner

11.4).
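The quantities asked for in parts (a), (c) and (e) can be sketched in a few lines of Python. This is a minimal illustration, not a model solution: it assumes DD is regressed on CC, and the helper names `fit_line` and `sums_of_squares` are made up for this example. The F statistic has 1 and n - 2 degrees of freedom and still has to be compared with a tabulated critical value.

```python
from statistics import mean

# Data copied from the table in Problem 1.
cc = [-0.26, -0.03, 0.3, 0.37, 0.4, 0.5, 0.55, 0.55]
dd = [-2.35, -2.2, -2.12, -1.95, -1.85, -1.8, -1.7, -1.58]


def fit_line(x, y):
    """Least-squares intercept a and slope b for the model y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    lxx = sum((xi - x_bar) ** 2 for xi in x)
    lxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = lxy / lxx
    a = y_bar - b * x_bar
    return a, b


def sums_of_squares(x, y, a, b):
    """Return (SS_TOTAL, SS_REGRESSION, SS_RESIDUAL) for the fitted line."""
    y_bar = mean(y)
    fitted = [a + b * xi for xi in x]
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    ss_reg = sum((fi - y_bar) ** 2 for fi in fitted)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return ss_total, ss_reg, ss_res


a, b = fit_line(cc, dd)
ss_total, ss_reg, ss_res = sums_of_squares(cc, dd, a, b)
n = len(cc)
r_squared = ss_reg / ss_total
# F = (SS_REGRESSION / 1) / (SS_RESIDUAL / (n - 2)), with (1, n - 2) df.
f_stat = (ss_reg / 1) / (ss_res / (n - 2))

print(f"intercept a = {a:.4f}, slope b = {b:.4f}")
print(f"SS_TOTAL = {ss_total:.4f}, SS_REG = {ss_reg:.4f}, SS_RES = {ss_res:.4f}")
print(f"R^2 = {r_squared:.4f}, F = {f_stat:.2f} with (1, {n - 2}) df")
```

A useful sanity check is that SS_TOTAL equals SS_REGRESSION + SS_RESIDUAL up to rounding.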

Problem 2 (20%)

We want to investigate a possible correlation between smoking and lung cancer deaths.

Consider the dataset in the previous problem.

(a) Calculate the correlation coefficient (Pearson) for cigarette consumption and lung-

cancer deaths (use the Log10).

(b) Do you consider it statistically significant? Indicate what your null and alternative

hypotheses are and provide a p-value. Hint: check Rosner 11.8.

(c) Based on these data, does increased cigarette consumption cause an increase in lung

cancer-related deaths?
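For parts (a) and (b), the sketch below computes the Pearson coefficient and the usual t statistic for H0: rho = 0, t = r*sqrt(n - 2)/sqrt(1 - r^2) with n - 2 degrees of freedom. It assumes that this one-sample t test is the intended procedure; the function names are illustrative only. The p-value is left to a t table or a statistics package, since the Python standard library has no t distribution.

```python
from math import sqrt
from statistics import mean

# Data copied from the table in Problem 1.
cc = [-0.26, -0.03, 0.3, 0.37, 0.4, 0.5, 0.55, 0.55]
dd = [-2.35, -2.2, -2.12, -1.95, -1.85, -1.8, -1.7, -1.58]


def pearson_r(x, y):
    """Sample Pearson correlation coefficient."""
    x_bar, y_bar = mean(x), mean(y)
    lxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    lxx = sum((xi - x_bar) ** 2 for xi in x)
    lyy = sum((yi - y_bar) ** 2 for yi in y)
    return lxy / sqrt(lxx * lyy)


def correlation_t_statistic(r, n):
    """t statistic for H0: rho = 0, to be compared with a t(n - 2) distribution."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)


r = pearson_r(cc, dd)
t_stat = correlation_t_statistic(r, len(cc))
print(f"r = {r:.4f}, t = {t_stat:.2f} with {len(cc) - 2} df")
```

In simple linear regression the square of this t statistic equals the F statistic of the regression, which gives a cross-check against Problem 1(e).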

Problem 3 (20%)

An interesting question is whether or not there is a genetic component that causes the

blood pressure of a mother and a child to be correlated. To attempt to answer this

question, consider two groups of children. Children in one group live with their natural

parents, whereas children in the other group live with adoptive parents. We can

calculate the correlation between mother and child blood pressure and if these turn out

to be different in the two groups that may suggest a genetic effect on blood pressure.

Suppose there are 1000 mother-child pairs in the first group, with correlation 0.35, and

100 mother-child pairs in the second group, with correlation 0.06. Use a two-sample

Fisher Z test to assess whether the correlations between mother and child blood

pressures are indeed different. Use a significance level alpha of 0.01.
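The two-sample Fisher Z test can be sketched as follows. The sketch assumes the standard form of the test: each correlation is transformed with z = atanh(r), and the difference is divided by sqrt(1/(n1 - 3) + 1/(n2 - 3)) to give a statistic that is approximately standard normal under H0: rho1 = rho2. The function name `fisher_two_sample` is illustrative.

```python
from math import atanh, sqrt
from statistics import NormalDist


def fisher_two_sample(r1, n1, r2, n2):
    """Two-sided two-sample Fisher z test for H0: rho1 == rho2.

    Returns the test statistic and the two-sided p-value.
    """
    z1, z2 = atanh(r1), atanh(r2)          # Fisher z transformation
    se = sqrt(1 / (n1 - 3) + 1 / (n2 - 3))  # standard error of z1 - z2
    stat = (z1 - z2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(stat)))
    return stat, p_value


# Values taken from the problem statement.
stat, p_value = fisher_two_sample(0.35, 1000, 0.06, 100)
alpha = 0.01
print(f"lambda = {stat:.3f}, two-sided p = {p_value:.4f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```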

Problem 4 (20%)

We suspect the time constant characterizing the release of a drug by a nanoparticle

depends on the particle size but also could be related to deviations from a perfect

circular shape (characterized by a distortion factor q that measures the % difference

between the major and minor axis of the particle if thought of as an ellipsoid).

Unfortunately, we can't produce particles with a controlled q to test these factors

independently: it seems the deformation is more pronounced the larger the particle is.

BME 335 Fall 2017 Homework 8

For a test batch (n=31 particles), we fitted a linear model for deformation vs. particle

size (in µm^3) and determined the following values:

b = 0.68 µm^-3; a = 21.52

(a) Is the relationship between size and deformation statistically significant? Provide a

p-value and justify.

(b) Calculate the sample correlation r and determine if it is consistent with a population

value of rho = 0.55. Provide a p-value and justify your answer. Make sure you answer

the question!
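For part (b), one common approach is the one-sample Fisher z test of H0: rho = rho0, sketched below. The value `r_sample = 0.40` is a hypothetical placeholder chosen only to make the example run; it is not derived from the problem and must be replaced by the r you actually compute. The function name `fisher_one_sample` is likewise illustrative.

```python
from math import atanh, sqrt
from statistics import NormalDist


def fisher_one_sample(r, n, rho0):
    """Two-sided one-sample Fisher z test for H0: rho == rho0.

    Under H0, (atanh(r) - atanh(rho0)) * sqrt(n - 3) is approximately N(0, 1).
    """
    stat = (atanh(r) - atanh(rho0)) * sqrt(n - 3)
    p_value = 2 * (1 - NormalDist().cdf(abs(stat)))
    return stat, p_value


r_sample = 0.40   # HYPOTHETICAL placeholder, not a value from the problem
n = 31            # batch size from the problem statement
rho0 = 0.55       # population value to test against

stat, p_value = fisher_one_sample(r_sample, n, rho0)
print(f"lambda = {stat:.3f}, two-sided p = {p_value:.4f}")
```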

Problem 5 (20%)

We are testing micro-lenses for an endoscopy device made by four manufacturers. The

lenses need very precise correction collars in order to focus in the correct plane. The

following data has been collected for the correction needed in the initial batches:

Manufacturer   Mean correction   sd      N
A              0.435             0.058   267
B              0.424             0.061   695
C              0.428             0.062   1695
D              0.420             0.067   237

(a) Use a one-way ANOVA model to compare the means of the groups and recommend

whether to order different correction collars for each manufacturer or just buy a

single type of collar to use on all lenses regardless of the manufacturer. Use a

critical value approach with a significance level alpha = 0.05 to assess statistical

significance. Provide the ANOVA table.

(b) The means for manufacturers A and B seem to be very different. Use a t-test to

compare the means for these two groups at a significance level alpha = 0.05.

(c) Repeat (b) using a Bonferroni correction. How does the correction affect your

assessment?
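Because only summary statistics are given, the ANOVA table has to be built from the group means, standard deviations and sizes. The sketch below rests on several assumptions that are not stated in the problem: the pairwise t statistic uses the pooled within-group mean square from the ANOVA, the Bonferroni correction counts all k(k - 1)/2 pairs, and the p-value uses a normal approximation to the t distribution (reasonable here only because the within-group degrees of freedom are in the thousands). The names `anova_from_summary` and `pairwise_t` are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# (mean correction, sd, N) per manufacturer, copied from the table above.
groups = {
    "A": (0.435, 0.058, 267),
    "B": (0.424, 0.061, 695),
    "C": (0.428, 0.062, 1695),
    "D": (0.420, 0.067, 237),
}


def anova_from_summary(groups):
    """One-way ANOVA table entries computed from group summary statistics."""
    k = len(groups)
    n_total = sum(n for _, _, n in groups.values())
    grand_mean = sum(m * n for m, _, n in groups.values()) / n_total
    ss_between = sum(n * (m - grand_mean) ** 2 for m, _, n in groups.values())
    ss_within = sum((n - 1) * s ** 2 for _, s, n in groups.values())
    df_between, df_within = k - 1, n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "df_between": df_between, "ms_between": ms_between,
        "ss_within": ss_within, "df_within": df_within, "ms_within": ms_within,
        "f": ms_between / ms_within,
    }


def pairwise_t(g1, g2, ms_within):
    """t statistic for two group means using the pooled within-group variance."""
    (m1, _, n1), (m2, _, n2) = g1, g2
    return (m1 - m2) / sqrt(ms_within * (1 / n1 + 1 / n2))


table = anova_from_summary(groups)
t_ab = pairwise_t(groups["A"], groups["B"], table["ms_within"])
# Normal approximation to the t distribution (assumption; df_within is large).
p_ab = 2 * (1 - NormalDist().cdf(abs(t_ab)))

k = len(groups)
n_pairs = k * (k - 1) // 2        # assumption: correct for all pairwise comparisons
alpha_bonf = 0.05 / n_pairs       # Bonferroni-adjusted per-comparison level

print(f"F = {table['f']:.3f} with ({table['df_between']}, {table['df_within']}) df")
print(f"t(A vs B) = {t_ab:.3f}, approximate two-sided p = {p_ab:.4f}")
print(f"Bonferroni per-comparison alpha = {alpha_bonf:.5f}")
```

The F statistic is compared with the tabulated critical value for (k - 1, N - k) degrees of freedom, and the pairwise p-value is compared once with 0.05 and once with the adjusted level.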

