
# 4/21/2008

20 Exercises

Mix and Match

1. Reciprocal
2. Opposite of log
3. Elasticity
4. Constant slope
5. Changing slope
6. Linear pattern
7. Diminishing marginal return
8. Simple variation
9. 1% increase
10. Unusual price elasticity

a. log y = b₀ + b₁ log x
b. 1.01 x
c. exp(x)
d. 1/y
e.
f. -1/2
g.
h. %y/%x
i.
j. y = b₀ + b₁ x

True/False

If you believe that a statement is false, briefly say why you think it is false.

11. Regression equations only represent linear trends in data.
12. If the correlation between x and y is larger than 0.5, then a linear equation is appropriate to describe the association.
13. To identify the presence of curvature, it can be helpful to begin by fitting a line and plotting the residuals from the linear equation.

14. The residual standard deviation sₑ when fitting the model 1/y = b₀ + b₁ x has the same units as 1/y.
15. The reciprocal transformation is commonly applied to rates, such as miles per gallon or dollars per square foot.
16. If the equation in a model has a log transformation, the interpretation of the slope should be avoided.
17. Transformations such as logs or reciprocals are determined by the position of outliers in a scatterplot.
18. The slope of a regression model which uses log x as a predictor is known as an elasticity.
19. Transformations in regression affect the R² of the fitted model.
20. When returned to the original scale, the fit of a model with a transformed variable produces a curved set of predictions.

Think About It

21. If diamonds have a linear relationship with small fixed costs, which costs more: a single one-carat diamond or two ½-carat diamonds? Are these the same?
22. According to the better model for the association between weight and mileage, which will save you more gas: getting rid of the 50 pounds of junk that you leave in the trunk of your compact car or removing the 50-pound extra seat from the family SUV? Are these the same?
23. Can you think of any lurking factors behind the relationship between weight and fuel consumption among car models?
24. The text mentions changes in the prices of rival brands as a lurking factor that might lead to incorrect actions when interpreting the equation that relates price and quantity. Can you think of another?
25. If the elasticity is close to zero, how are the price and quantity related?
26. If quantity sold increases with price, what would be the elasticity?
27. If an equation uses the reciprocal of the explanatory variable, as in ŷ = b₀ + b₁ 1/x, then what does the intercept b₀ tell you?
28. If an equation uses the log of the explanatory variable, as in ŷ = b₀ + b₁ log x, then what does the intercept b₀ tell you?
29. This plot shows the residuals from the linear equation relating mileage to weight. What’s causing the visible diagonal stripes that are parallel to the shown arrow?

[Figure for exercise 29: plot of Residual versus Weight (000 lbs)]

30. If the prices in the equation between price and quantity are expressed in a different currency (such as euros at 1.2 per dollar), how does the elasticity change?

You Do It

31. Wal-Mart
These data contain quarterly operating income (in millions of dollars) of Wal-Mart from 1990 through the end of 2005 (64 quarters). Operating income is the difference between net sales and the cost of items sold, including both the merchandise and the labor to make the sale.
a) Create a scatterplot for operating income on Date. Is the pattern linear?
b) Fit a linear trend to the operating income. If we accept the fit of this equation, interpret the slope and intercept.
c) Plot the residuals from this linear trend on Date. Do the residuals show simple variation, or do you see patterns in both the mean and variation of the residuals?
d) Create a scatterplot for operating income on Date, but use a log scale for the y axis. Compare this view of the data to the initial plot of operating income versus date.
e) Fit the equation log(Operating Income) = b₀ + b₁ Date. Interpret the slope in the fit of this equation.
f) What pattern remains in the residuals from the fit of the log of operating income on date? Can you explain what’s happening?
g) Which equation offers the better summary of the trend in operating income at Wal-Mart? What’s the basis for your choice?

32. Target
These data contain quarterly operating income (in millions of dollars) of Target Stores from 1990 through the end of 2005 (64 quarters). Operating income is the difference between net sales and the cost of items sold, including both the merchandise and the labor to make the sale.
a) Create a scatterplot for operating income on Date. Is the pattern linear?

b) Fit a linear trend to the operating income. If we accept the fit of this equation, interpret the slope and intercept.
c) Plot the residuals from this linear trend on Date. Do the residuals show simple variation, or do you see patterns in both the mean and variation of the residuals?
d) Create a scatterplot for operating income on Date, but use a log scale for the y axis. Compare this view of the data to the initial plot of operating income versus date.
e) Fit the equation log(Operating Income) = b₀ + b₁ Date. Interpret the slope in the fit of this equation.
f) What pattern remains in the residuals from the fit of the log of operating income on date? Can you explain what’s happening?
g) Which equation offers the better summary of the trend in operating income at Target? What’s the basis for your choice?

33. Wine
Influential wine critics such as Robert Parker publish their personal ratings of wines, and many consumers pay close attention. Do these ratings affect the price? The data in this exercise are a sample of ratings and prices found on-line at the web site of an Internet wine merchant.
a) Does the scatterplot of the price of wine on the rating suggest a linear or nonlinear relationship?
b) Fit a linear regression equation to the data, regressing price on the rating. Does this fitted model make substantive sense?
c) Create a scatterplot for the log of the price on the rating. Does the relationship seem more suited to regression?
d) Fit a regression of the log of price on the rating. Does this model provide a better description of the pattern in the data?
e) Compare the fit of the two models to the data. Can you rely on summary statistics like R² and sₑ?

34. Display space
Initial levels of advertising often bring a larger response in the market than later spending. In this example, advertising comes in the form of devoting more shelf space to the indicated product. The level of sales is the weekly total sales of this product at several outlets of a chain of markets. The display space gives the number of shelf feet used to display the item. The data include sales at 48 stores. (These data are described further in Business Analysis using Regression by Foster, Stine, and Waterman.)
a) Create a scatterplot for the level of sales on the number of shelf feet. Does the relationship appear linear? Do you think that it ought to be linear?
b) Fit a linear regression equation to the data, regressing sales on the number of shelf feet. Does this fitted model make substantive sense?
c) Consider a scatterplot that shows sales on the log of the number of shelf feet. Does the relationship seem more linear?
d) Fit a regression of sales on the log of the number of shelf feet. Does this model provide a better description of the pattern in the data? What do the slope and intercept tell you?
e) Compare the fit of the two models to the data. Can you rely on summary statistics like R² and sₑ?

35. Used Camrys
Cars depreciate over time, whether made by Honda or, in this case, Toyota. These data show the prices of Toyota Camrys listed for sale by individuals in the Philadelphia Inquirer in an issue during 2005. One column gives the asking price (in thousands of dollars) and a second column gives the age (in years).
a) Do you expect the resale value of a car to drop by a fixed amount each year?
b) Fit a linear equation with price as the response and age as the explanatory variable. What do the slope and intercept tell you, if we accept this equation’s description of the pattern in the data?
c) Plot the residuals from the linear equation on Age. Do the residuals suggest a problem with the linear equation?
d) Fit the equation Estimated Price = b₀ + b₁ log Age. Do the residuals from this fit “fix” the problem found in “c”?
e) Compare the fitted values from this equation with those from the linear model. Show both in the same scatterplot. In particular, compare what this graph has to say about the effects of increasing age on resale value.
f) Compare the values of R² and sₑ between these two equations. Does this comparison agree with your impression of the better model? Should these summary statistics be compared?
g) Interpret the intercept and slope in this equation. Give units where appropriate.
h) Compare the change in asking price for cars that are 1 and 2 years old to the difference between cars that are 11 and 12 years old. Use the equation with the log of age as the explanatory variable. Is this difference the same or different?

36. Used Accords
Cars depreciate over time. These data show the prices of Honda Accords listed for sale by individuals in the Philadelphia Inquirer in an issue during 2005. One column gives the asking price (in thousands of dollars) and a second column gives the age (in years).
a) Do you expect the resale value of a car to drop by a fixed amount each year?
b) Fit a linear equation with price as the response and age as the explanatory variable. What do the slope and intercept tell you, if we accept this equation’s description of the pattern in the data?

c) Plot the residuals from the linear equation on Age. Do the residuals suggest a problem with the linear equation?
d) Fit the equation Estimated Price = b₀ + b₁ log Age. Do the residuals from this fit “fix” the problem found in “c”?
e) Compare the fitted values from this equation with those from the linear model. Show both in the same scatterplot. In particular, compare what this graph has to say about the effects of increasing age on resale value.
f) Compare the values of R² and sₑ between these two equations. Does this comparison agree with your impression of the better model? Should these summary statistics be compared?
g) Interpret the intercept and slope in this equation. Give units where appropriate.
h) Compare the change in asking price for cars that are 1 and 2 years old to the difference between cars that are 11 and 12 years old. Use the equation with the log of age as the explanatory variable. Is this difference the same or different?

37. Cellular phones in the US
Cellular (or mobile) phones are everywhere these days, but it has not always been so. These data from CTIA, an organization representing the wireless communications industry, track the number of cellular subscribers in the US. The data is semiannual, from 1985 through mid 2006.
a) From what you have observed about the use of cellular telephones, what do you expect the trend in the number of subscribers to look like?
b) Create a scatterplot for the number of subscribers on the date of the measurement. Does the trend look like you would have expected?
c) Fit a linear equation with the number of subscribers as the response and the date as the explanatory variable. What do the slope and intercept tell you, if we accept this equation’s description of the pattern in the data?
d) Create a scatterplot for the same data shown in the scatterplot done in “b,” but for this plot, put the response on a log scale. Does the scatterplot suggest that a curve of the form Estimated Log Number of Subscribers = b₀ + b₁ Date is a good summary?
e) Create a scatterplot for the percentage change in the number of subscribers versus the year minus 1984. (That’s like treating 1984 as the start of the cellular industry in the US.) Does this plot suggest any problem with the use of the equation of the log of the number of subscribers on the date? What should this plot look like if a log equation of the form in “d” is going to be a good summary?
f) Summarize the curve in this scatterplot using a curve of the form Estimated Percentage Growth = b₀ + b₁ 1/(Date-1984). Does this curve appear to be a better summary of the pattern of growth in the domestic cellular industry?
g) What’s the interpretation of the estimated intercept b₀ in the curve estimated in “f”?
h) Use the equation from “f” to predict the number of subscribers in the next period. Do you think this will be a better estimate than offered by the linear equation or logarithmic curve?

38. Cellular phones in Africa
Mobile phones (as cellular phones are often called outside the US) have replaced traditional landlines in parts of the developing world where it has been impractical to build the infrastructure needed for landlines. These data from the ITU (International Telecommunication Union) estimate the number of mobile and landline subscribers (in thousands) in Sub-Saharan Africa outside of South Africa. The data is annual, from 1995 through 2005.
a) On the same axes, show timeplots of the two types of subscribers versus year together. Does either curve appear linear?
b) Create a scatterplot for the number of landline subscribers on the year of the count, then fit a linear equation with the number of subscribers as the response and the year as the explanatory variable. Does the R² of this fitted equation mean that it’s a good summary?
c) What do the slope and intercept tell you, if we accept this equation’s description of the pattern in the data?
d) Do the residuals from the linear equation confirm your impression of the fit of the model?
e) Does a curve of the form Estimated Log Number of Subscribers = b₀ + b₁ Year provide a better summary of the growth of the use of landlines? Use the residuals to help decide.
f) Interpret the slope in the previous equation that uses the log of the number of subscribers. What does it tell you about the growth of this sector?
g) Fit a similar logarithm equation to the growth in the number of mobile users. The curve is not such a nice fit, but allows some comparison of the rates of growth. How do the approximate rates of growth compare?

39. Pet foods, revisited
This exercise uses base 10 logs instead of natural logs.
a) Using the pet food data of the text example, transform the price and volume data using natural logs and then using base 10 logs. Then plot the natural log of volume on the natural log of price, and the base 10 log of volume on the base 10 log of price. What’s the difference in your plots?
b) Fit the linear equation of the log of volume on the log of price in both scales. What differences do you find between the fitted slopes and intercepts?
c) Do the summary statistics R² and sₑ differ between the two fitted equations?
d) Create a scatterplot for the logₑ of volume on log₁₀ of volume, and then fit the least squares line. Explain how this relationship explains the similarities and differences. (Note: logₑ x = 2.3026 log₁₀ x.)
e) Which log should be used to estimate an elasticity, or does it matter?

40. Movies
These data describe the box-office success of 407 movies released during the years 1998 through 2001. For this analysis, we’re interested in the relationship between initial success at the movie theatre and subsequent sales for pay-per-view services, such as those offered by cable television. Both columns in the data table are in millions of dollars. Base 10 logs are more useful than natural logs when dealing with large monetary quantities since they give us the number of digits (minus 1).
a) Create a scatterplot for subsequent sales on the box-office sales, and scatterplot the log₁₀ of the subsequent sales on the log₁₀ of the box-office sales. Can we describe the pattern in either plot using a linear equation?
b) Estimate the elasticity of subsequent sales with respect to box-office sales using the least squares equation with y given by the log₁₀ of subsequent sales and x given by the log₁₀ of box-office sales.
c) Interpret the elasticity in the context of these data.
d) How would the fitted equation in “b” differ had we used natural logs (base e) rather than common logs (base 10)? Does the equation fit the data any better?
e) Several successful movies that grossed above $10 million at the box office have unusually large, negative residuals relative to the linear equation of log₁₀ subsequent sales on log₁₀ gross. Which movies are these and do you think that they have anything in common?
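For the questions in exercises 39 and 40 about natural logs versus base 10 logs, a small experiment can help before turning to the real data. The following is a minimal sketch in plain Python. The price and volume values are synthetic (made up for illustration, with an assumed elasticity of -2), and `least_squares` is a hypothetical helper written here, not a function from the text or from any statistics package.

```python
import math
import random


def least_squares(x, y):
    """Return (intercept, slope) of the least squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Synthetic (made-up) data: volume falls with price, assumed elasticity -2.
random.seed(1)
price = [0.6 + 0.02 * i for i in range(40)]
volume = [5000 * p ** -2.0 * math.exp(random.gauss(0, 0.1)) for p in price]

# Fit log(volume) on log(price), first with natural logs, then base 10 logs.
b0_e, b1_e = least_squares([math.log(p) for p in price],
                           [math.log(v) for v in volume])
b0_10, b1_10 = least_squares([math.log10(p) for p in price],
                             [math.log10(v) for v in volume])

print(f"natural logs: intercept = {b0_e:.4f}, slope = {b1_e:.4f}")
print(f"base 10 logs: intercept = {b0_10:.4f}, slope = {b1_10:.4f}")
print(f"ratio of intercepts = {b0_e / b0_10:.5f}")
```

Running the same comparison on the pet food or movie data only requires replacing the two synthetic lists with the actual columns.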

4M Cars in 1989

In order to have its cars meet the corporate average fuel economy (CAFE) standard of 27.5 MPG, a manufacturer needs to improve the mileage of two of its cars. One design, a small sports car, weighs 2500 pounds. The other model, a 4-door family sedan, weighs 4000 pounds. If it can improve the mileage of either design by 2 more miles per gallon, its cars will meet the federal standards. Use the data for cars from the 1989 model year to answer these questions. The weight of the cars is measured in thousands of pounds, and the city mileage is expressed in miles per gallon.

Motivation
a) Which of these two models should the manufacturer modify? In particular, if the manufacturer needs to reduce the weight of a car to improve its mileage, how can an equation that relates weight to mileage help?

Method
b) Based on the analysis in this chapter for modern cars, what sort of relationship do you expect to find between weight and mileage (city driving) for cars from the 1989 model year? Linear or curved?
c) In order to choose the equation to describe the relationship between weight and mileage, will you be able to use summary measures like R² or will you have to rely on other methods to pick the equation?

Mechanics
d) Create a scatterplot for mileage on weight. Describe the association between these variables.
e) Fit an equation using least squares that captures the pattern seen in this data. Why have you chosen this equation?
f) Do the residuals from your fitted equation appear simple? Do any outliers stand out?
g) Compare the fit of this equation to that used in this chapter to describe the relationship between weight and mileage for more recent cars. Include in your comparison the slope and intercept of the fitted equation as well as the two summary measures, R² and sₑ.

Message
h) Summarize the equation developed in your modeling for the manufacturer’s management, using words instead of algebra.
i) Provide a recommendation for management on the best approach to use to attain the needed improvement in fuel efficiency.
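One way to see how a fitted equation could support the decision in this exercise is sketched below in plain Python. Everything numeric here is an assumption: the data are synthetic, the reciprocal form (gallons per 100 miles as a linear function of weight) is only one candidate equation, and `least_squares`, `predicted_mpg`, and `weight_cut_for_gain` are hypothetical helpers. The actual 1989 data may call for a different equation and will give different numbers.

```python
import random


def least_squares(x, y):
    """Return (intercept, slope) of the least squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Synthetic (made-up) cars: weight in thousands of pounds, mileage in MPG.
random.seed(2)
weight = [1.8 + 0.05 * i for i in range(50)]
mpg = [100 / (0.9 + 1.2 * w + random.gauss(0, 0.25)) for w in weight]

# Transform the response to gallons per 100 miles, then fit a line on weight.
b0, b1 = least_squares(weight, [100 / m for m in mpg])


def predicted_mpg(w):
    """Predicted mileage (MPG) for a car weighing w thousand pounds."""
    return 100 / (b0 + b1 * w)


def weight_cut_for_gain(w, gain):
    """Thousands of pounds to remove so predicted mileage rises by `gain` MPG."""
    target_mpg = predicted_mpg(w) + gain
    target_weight = (100 / target_mpg - b0) / b1
    return w - target_weight


for w in (2.5, 4.0):
    print(f"{w * 1000:.0f} lb car: predicted {predicted_mpg(w):.1f} MPG, "
          f"remove about {1000 * weight_cut_for_gain(w, 2.0):.0f} lb for +2 MPG")
```

The point of the sketch is the back-transformation: the line is fit on the reciprocal scale, but the management question is posed in miles per gallon, so predictions have to be converted back before the two designs are compared.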

4M Crime and Housing Prices in Philadelphia

Modern housing areas often seek to obtain, or at least convey, low rates of crimes. Homeowners and families like to live in safe areas and presumably are willing to pay a premium for the opportunity to have a safe home. These data from Philadelphia Magazine summarize crime rates and housing prices in communities near and including Philadelphia. The housing prices for each community are the median selling prices for homes sold in the prior year. The crime rate variable measures the number of reported crimes per 100,000 people living in the community. Exclude the data for Center City, Philadelphia from this analysis. It’s a predominantly commercial area that includes clusters of residential housing. The amount of commercial activity produces a very large crime rate relative to the number of residents.

Motivation
a) How could local political and business leaders use an equation that relates crime rates to housing values to advocate higher expenditures for police?
b) Would an equation from these data produce a causal statement relating crime rates to housing prices? Explain.

Method
c) For modeling the association between crime rates and housing prices, explain why a community leader should consider crime rates the explanatory variable.
d) Do you anticipate differences in the level of crime to be linearly related to differences in the housing prices? Explain in the context of your answer the underlying implication of a linear relationship.

Mechanics
e) Create a scatterplot for the housing prices on crime rates. Describe the association. Is it strong? What is the direction?
f) Fit the linear equation of housing prices on crime rates. Interpret the slope, intercept, and summary statistics (R² and sₑ).
g) As an alternative to the linear model, fit an equation that uses housing prices as the response with the reciprocal (1/x) of the crime rate as the explanatory variable. What is the natural interpretation of the reciprocal of the crime rate?

Message
h) Which model do you think offers the better summary of the association between crime rates and housing prices? Use residual plots, summary statistics, and substantive interpretation to make your case.
i) Interpret the equation that you think best summarizes the relationship between crime rates and housing prices.
j) Does an increment in the crime rate from 1 to 2 per 100,000 have the same impact (on average) on housing prices as the change from 11 to 12 per 100,000?
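For part j, the arithmetic implied by the reciprocal equation of part g can be written out before any data are fit. The sketch below assumes the notation ŷ for the estimated housing price and x for the crime rate per 100,000; the slope b₁ is whatever value the fit produces, so only the relative size of the two changes is shown.

```latex
\begin{align*}
\hat{y}(x) &= b_0 + b_1 \cdot \frac{1}{x} \\
\hat{y}(2) - \hat{y}(1) &= b_1\left(\frac{1}{2} - \frac{1}{1}\right) = -\frac{1}{2}\, b_1 \\
\hat{y}(12) - \hat{y}(11) &= b_1\left(\frac{1}{12} - \frac{1}{11}\right) = -\frac{1}{132}\, b_1
\end{align*}
```

Under the linear equation of part f, by contrast, both one-unit increments change the estimated price by the same amount, the fitted slope.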