CHAPTER 11 Regression and Correlation Methods

REVIEW QUESTIONS 11C

1. What is the difference between the one-sample t test for correlation coefficients and the one-sample z test for correlation coefficients?
2. Refer to the data in Table 2.13 on page 98.
   (a) What test can be used to assess whether there is a significant association between the first white-blood count following admission and the duration of hospital stay?
   (b) Implement the test in Review Question 11C.2a, and report a two-tailed p-value.
   (c) Provide a 95% confidence interval for the correlation coefficient in Review Question 11C.2.
3. Refer to the data in Table 2.13 (p. 36).
   (a) Suppose we want to compare the correlation coefficient between duration of hospital stay and white-blood count for males vs. females. What test can we use to accomplish this?
   (b) Perform the test in Review Question 11C.3a, and report a two-tailed p-value.

11.9 MULTIPLE REGRESSION

Sections 11.2 through 11.6 discussed problems in linear-regression analysis in which there is one independent variable ($x$), one dependent variable ($y$), and a linear relationship between $x$ and $y$. In practice, there is often more than one independent variable, and we would like to look at the relationship between each of the independent variables ($x_1, \ldots, x_k$) and the dependent variable ($y$) after taking into account the remaining independent variables. This type of problem is the subject matter of multiple-regression analysis.

Hypertension, Pediatrics
A topic of interest in hypertension research is how the blood-pressure levels of newborns and infants relate to subsequent adult blood pressure. One problem that arises is that the blood pressure of a newborn is affected by several extraneous factors that make this relationship difficult to study. In particular, newborn blood pressures are affected by (1) birthweight and (2) the day of life on which blood pressure is measured.
In this study, the infants were weighed at the time of the blood-pressure measurements. We refer to this weight as the "birthweight," although it differs somewhat from their actual weight at birth. Because the infants grow in the first few days of life, we would expect that infants seen at 5 days of life would on average have a greater weight than those seen at 2 days of life. We would like to be able to adjust the observed blood pressure for these two factors before we look at other factors that may influence newborn blood pressure.

Estimation of the Regression Equation

Suppose a relationship is postulated between systolic blood pressure (SBP) ($y$), birthweight ($x_1$), and age in days ($x_2$), of the form

EQUATION 11.28
$$y = \alpha + \beta_1 x_1 + \beta_2 x_2 + e$$

where $e$ is an error term that is normally distributed with mean 0 and variance $\sigma^2$. We would like to estimate the parameters of this model and test various hypotheses concerning it. The same method of least squares that was introduced in Section 11.3 for simple linear regression will be used to fit the parameters of this multiple-regression model. In particular, $\alpha$, $\beta_1$, $\beta_2$ will be estimated by $a$, $b_1$, and $b_2$, respectively, where we choose $a$, $b_1$, and $b_2$ to minimize the sum of

$$\left[ y - (a + b_1 x_1 + b_2 x_2) \right]^2$$

over all the data points.

In general, if we have $k$ independent variables $x_1, \ldots, x_k$, then a linear-regression model relating $y$ to $x_1, \ldots, x_k$ is of the form

EQUATION 11.29
$$y = \alpha + \sum_{j=1}^{k} \beta_j x_j + e$$

where $e$ is an error term that is normally distributed with mean 0 and variance $\sigma^2$. We estimate $\alpha, \beta_1, \ldots, \beta_k$ by $a, b_1, \ldots, b_k$ using the method of least squares, where we minimize the sum of

$$\left[ y - \left( a + \sum_{j=1}^{k} b_j x_j \right) \right]^2$$

over all the data points.

Hypertension, Pediatrics
Suppose age (days), birthweight (oz), and SBP are measured for 16 infants and the data are as shown in Table 11.8.
Estimate the parameters of the multiple-regression model in Equation 11.28.

TABLE 11.8  Sample data for infant blood pressure, age, and birthweight for 16 infants

  i    Age (days)    Birthweight (oz)    SBP (mm Hg)
  1        3              135                89
  2        4              120                90
  3        3              100                83
  4        2              105                77
  5        4              130                92
  6        5              125                98
  7        2              125                82
  8        3              105                85
  9        5              120                96
 10        4               90                95
 11        2              120                80
 12        3               95                79
 13        3              120                86
 14        4              150                97
 15        3              160                92
 16        3              125                88

Solution: Use the SAS PROC REG program to obtain the least-squares estimates. The results are given in Table 11.9. According to the parameter-estimate column, the regression equation is given by

$$y = 53.45 + 0.126 x_1 + 5.89 x_2$$

where $x_1$ is birthweight (oz) and $x_2$ is age (days).

TABLE 11.9  Least-squares estimates of the regression parameters for the newborn blood-pressure data in Table 11.8 using the SAS PROC REG program

The REG Procedure
Model: MODEL1
Dependent Variable: sysbp

Number of Observations Read    16
Number of Observations Used    16

                     Analysis of Variance
                          Sum of         Mean
Source             DF    Squares       Square    F Value    Pr > F
Model               2   591.03564   295.51782      48.08    <.0001
Error              13    79.90186     6.14630
Corrected Total    15   670.93750

Root MSE           2.47917    R-Square    0.8809
Dependent Mean    88.06250    Adj R-Sq    0.8626
Coeff Var          2.81526

                     Parameter Estimates
                  Parameter    Standard                          Squared Partial
Variable     DF    Estimate       Error    t Value    Pr > |t|      Corr Type II
Intercept     1    53.45019     4.53189      11.79      <.0001                 .
agedays       1     5.88772     0.68021       8.66      <.0001           0.85216
brthwgt       1     0.12558     0.03434       3.66      0.0029           0.50735

The regression equation tells us that for a newborn the average blood pressure increases by an estimated 5.89 mm Hg per day of age and by 0.126 mm Hg per ounce of birthweight.

Hypertension, Pediatrics
Calculate the predicted average SBP of a three-day-old baby with birthweight 8 lb (128 oz).

Solution: The average SBP is estimated by

$$53.45 + 5.89(3) + 0.126(128) = 87.2 \text{ mm Hg}$$

The regression coefficients in Table 11.9 are called partial-regression coefficients.
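As a rough check on the least-squares estimates and the predicted SBP above, the fit can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the SAS procedure: it builds the normal equations for the intercept, age, and birthweight and solves them with a small hand-written Gaussian elimination. The names `DATA`, `solve_linear_system`, `fit_least_squares`, and `predict_sbp` are illustrative, and the data rows are a transcription of Table 11.8.

```python
# Least-squares fit of SBP on age and birthweight via the normal equations.
# Rows are (age in days, birthweight in oz, SBP in mm Hg), transcribed from
# Table 11.8 (the transcription is an assumption of this sketch).
DATA = [
    (3, 135, 89), (4, 120, 90), (3, 100, 83), (2, 105, 77),
    (4, 130, 92), (5, 125, 98), (2, 125, 82), (3, 105, 85),
    (5, 120, 96), (4, 90, 95), (2, 120, 80), (3, 95, 79),
    (3, 120, 86), (4, 150, 97), (3, 160, 92), (3, 125, 88),
]


def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution on the upper-triangular system.
    solution = [0.0] * n
    for row in reversed(range(n)):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution


def fit_least_squares(rows):
    """Return [a, b_age, b_birthweight] minimizing the sum of squared residuals."""
    design = [[1.0, float(age), float(bw)] for age, bw, _ in rows]
    response = [float(sbp) for _, _, sbp in rows]
    p = 3
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(design, response)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict_sbp(coefs, age_days, birthweight_oz):
    """Estimated average SBP (mm Hg) for a given age and birthweight."""
    a, b_age, b_bw = coefs
    return a + b_age * age_days + b_bw * birthweight_oz


coefs = fit_least_squares(DATA)
predicted = predict_sbp(coefs, age_days=3, birthweight_oz=128)
print("intercept, age slope, birthweight slope:", [round(c, 5) for c in coefs])
print("predicted SBP (3 days, 128 oz):", round(predicted, 1))
```

The printed coefficients should agree with the parameter-estimate column of the SAS output up to rounding. The prediction uses the unrounded coefficients, so it can differ slightly in later decimals from the hand calculation with 53.45, 5.89, and 0.126.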
DEFINITION 11.18
Suppose we consider the multiple-regression model

$$y = \alpha + \sum_{j=1}^{k} \beta_j x_j + e$$

where $e$ follows a normal distribution with mean 0 and variance $\sigma^2$. The $\beta_j$, $j = 1, 2, \ldots, k$, are referred to as partial-regression coefficients. $\beta_j$ represents the average increase in $y$ per unit increase in $x_j$, with all other variables held constant (or, stated another way, after adjusting for all other variables in the model), and is estimated by the parameter $b_j$.
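The "all other variables held constant" reading of a partial-regression coefficient can be illustrated with the rounded estimates from the newborn blood-pressure example (53.45, 5.89 per day of age, 0.126 per ounce of birthweight). The sketch below is illustrative only: the function name `predicted_sbp` and the particular ages and birthweights used for comparison are assumptions chosen for the demonstration, not values from the text.

```python
# Rounded estimates taken from the fitted regression equation in the text.
INTERCEPT = 53.45
SLOPE_AGE = 5.89           # mm Hg per day of age
SLOPE_BIRTHWEIGHT = 0.126  # mm Hg per ounce of birthweight


def predicted_sbp(age_days, birthweight_oz):
    """Estimated average SBP (mm Hg) from the fitted regression equation."""
    return INTERCEPT + SLOPE_AGE * age_days + SLOPE_BIRTHWEIGHT * birthweight_oz


# Hold birthweight fixed and raise age by one day: the change in the
# prediction equals the age coefficient, whatever the fixed birthweight is.
# (The birthweights 100, 128, 160 oz are hypothetical comparison points.)
age_effects = [
    predicted_sbp(4, bw) - predicted_sbp(3, bw) for bw in (100, 128, 160)
]

# Hold age fixed and raise birthweight by one ounce: the change equals the
# birthweight coefficient, whatever the fixed age is.
birthweight_effects = [
    predicted_sbp(age, 121) - predicted_sbp(age, 120) for age in (2, 3, 5)
]

print([round(d, 3) for d in age_effects])
print([round(d, 3) for d in birthweight_effects])
```

Because the model is linear with no interaction term, each difference depends only on the coefficient of the variable that changed, which is what the definition expresses.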
