DETECTING NONLINEAR PATTERNS
    Scatterplots
    Residual Plots
RECIPROCAL TRANSFORMATIONS
COMPARING LINEAR AND NONLINEAR EQUATIONS
    Visual Comparisons
    Substantive Comparison
LOGARITHM TRANSFORMATIONS
    Scatterplots and Residual Plots
COMPARING EQUATIONS
SUMMARY
Figure 20-1. Average retail price of regular unleaded gasoline in the US, in dollars per gallon.
An increase in the price of gasoline is a painful reminder of the laws of supply and demand. The first big increase in domestic prices struck in 1973-1974 when the Organization of Petroleum Exporting Countries (OPEC) set production quotas. Gasoline prices had been nearly constant for so long that no one kept track of prices before then. These data start in 1975, shortly before the surge that began in 1979. After selling for 60 to 70 cents a gallon during the 1970’s, the average price soared above $1.40 per gallon in 1981.
2005 Ford Escape MPG 31 City / 36 Hwy
Congress noticed the importance of energy prices as well. The legislative response included the Energy Policy and Conservation Act of 1975. This act established corporate average fuel economy (CAFE) standards for passenger vehicles sold in the US. The current standard for cars is 27.5 mpg (light trucks need to average 20.7 mpg). Until recently, fuel efficiency hadn’t received much legislative attention since then. More recent price increases have renewed interest in improving the efficiency of cars. One way to improve mileage is to reduce the weight of the car. Lighter materials, however, cost more than heavier materials of comparable strength. Aluminum and composites are more expensive than steel. Companies want evidence of the benefit before they sink money into lightweight materials. What sort of improvements in mileage should a manufacturer expect from reducing the weight of a car by, say, 200 pounds? That’s not much for one car, but the costs and benefits add up when you make millions. To answer this question, we’ll use a regression model. Unlike models in Chapter 19, this model has to capture a pattern that bends.
2005 Mercedes G55 MPG 12 City / 14 Hwy
Detecting Nonlinear Patterns

Linear patterns are a good place to begin when modeling dependence, but they don’t work in every situation. There are two ways to recognize problems with linear equations: one you should do before you look at the data and the other you judge from a plot.

Before you start working with data and plots, ask yourself the question, “Why should changes in x come with constant changes in y?” Linear association means that a change in x on average is associated with a constant change in y. For example, a linear equation nicely describes the association between weight and price of small diamonds in Chapter 19. Each increase in weight by one tenth of a carat increases the estimated price by $267, regardless of the size of the diamond. A quick visit to a jeweler should convince you, however, that the difference in price between ½-carat diamonds and 1-carat diamonds is smaller than the gap between 1.5 and 2-carat diamonds. As diamonds get larger, they become scarcer and increments in size command ever larger increments in price.

For cars, do you expect the effect of weight on mileage to be the same for cars of all sizes? Does trimming 200 pounds from a big SUV have the same effect on mileage as trimming 200 pounds from a small compact? If we model the relationship between mileage (y) and weight (x) using a line, then our answer is “yes.” A fitted line ŷ = b0 + b1 x has one slope, regardless of x. Before we accept this claim, we need to look at data.

Does a linear equation work? Think: Should equal changes in x bring equal changes in the response? Look: Are the plots of the data straight enough, or do these show bends?

Scatterplots

The second opportunity to recognize a problem with a linear equation comes when we see plots. This scatterplot graphs mileage (in miles per gallon) versus weight (in thousands of pounds) for 232 different types of passenger vehicles produced in the 2003 and 2004 model years.

[Scatterplot: MPG versus Weight (000 lbs)]
Figure 20-2. Mileage versus weight for passenger vehicles sold in the US.

Exclude outliers? Before you do, have a reason to separate them from the other cases, a reason deeper than “they don’t fit.”
Two things stand out in this plot. First, the scatterplot shows negative association. As weight goes up, mileage goes down. Second, two outliers in the upper left corner have much higher mileage than the rest. The car with the highest mileage is the Honda Insight, an early hybrid. The outlier with the second highest mileage is another hybrid, the Toyota Prius. Hybrids combine a gasoline engine with an electric motor to get exceptional mileage. Let’s set these two aside and concentrate on the cars with regular engines, zooming in for a better look at the other 230 vehicles.

2004 Honda Insight MPG 61 City / 66 Hwy

To accommodate the outlying hybrids, the y-axis in Figure 20-2 ranges from 5 all the way to 65 MPG; without them, the scale narrows to 7.5 to 37.5 MPG. Without the outliers, the scatterplot of mileage on weight shows details hidden by the scale of Figure 20-2.

[Two scatterplots: MPG versus Weight (000 lbs), left panel without the fitted line, right panel with it]
Figure 20-3. Zooming in on the data without the outliers, without and with the fitted line.

Is the pattern in Figure 20-3 linear, or does it bend? Without a line to provide a frame of reference, it’s hard to tell. The least squares regression line shown in the right panel of Figure 20-3 is

Estimated MPG = 35.6 - 4.52 Weight

(Weight is measured in thousands of pounds.) The linear pattern attributes more than half of the variation in mileage to differences in weight (r² = 57%). The standard deviation of the residuals around the fitted line is s_e = 2.9 MPG.

To interpret this equation, consider comparing the mileage of two groups of cars. Cars in one group weigh 1000 pounds more than cars in the other group. This equation says that the heavier cars average 4.52 fewer miles per gallon. With units attached, b1 = -4.52 MPG per thousand pounds. The intercept tells us that a “weightless car” gets pretty good mileage: 35.6 MPG. As often seems the case, the intercept is an extrapolation. This equation does not merit such extrapolation. It is better to interpret the intercept as a reminder that cars burn fuel regardless of their weight, such as the energy used to power the air conditioning and electronics.

Residual Plots

Because the fitted line in Figure 20-3 explains more than half of the variation in mileage, it may be hard to see that these data fail the straight-enough condition. The lack of fit becomes more apparent in the residual plot. After removing the linear trend, the curving pattern is more evident.

[Residual plot: residuals versus Weight (000 lbs)]
Figure 20-4. Residual plot with sketched-in bending curve.

The sketched curve suggests a bending pattern. The residuals are generally positive on the left (light cars), negative in the middle, and positive at the right (heavy cars). The plot seems to grin at us. This curvature also appears in the scatterplot of MPG on Weight in Figure 20-3, but the residual plot makes it more apparent. If you look closely at Figure 20-3, you can see that the line passes below most of the points that represent light cars, above most in the middle, and again too low at the right for large cars. Once you see the pattern in the residual plot, you can usually find it in the original scatterplot of y on x as well.

To confirm the presence of a pattern in the residuals, use the visual test for simplicity (Chapter 6). If there’s no pattern, then it should not matter which weight goes with which residual. To make a comparison, build another scatterplot that scrambles the weights so that each residual gets matched to a randomly chosen weight. The following figure shows two such scatterplots.

[Two residual plots: Residuals MPG City versus scrambled Weight (000 lbs)]
Figure 20-5. Checking for simple residuals.

The residual plot in Figure 20-4 is distinct from these. The scales are the same, but the content is visually rather different. That means there’s a pattern in the residuals. Because the association in the data is not straight enough, we need to find a nonlinear equation to represent the dependence.
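The scrambling check described for the residuals can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the chapter's 230 cars are not reproduced here, so the data are synthetic (mileage generated from a bending curve plus noise), and the helper names `least_squares`, `residuals`, `scrambled`, and `mean_by_thirds` are hypothetical. Averaging the residuals within light, middle, and heavy thirds is a rough numerical stand-in for looking at the plot.

```python
import random

def least_squares(x, y):
    """Return (intercept, slope) of the least squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def residuals(x, y):
    """Residuals of y around the least squares line fitted to (x, y)."""
    b0, b1 = least_squares(x, y)
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

def scrambled(x, seed):
    """Copy of x in random order, so each residual meets a random weight."""
    rng = random.Random(seed)
    shuffled = list(x)
    rng.shuffle(shuffled)
    return shuffled

def mean_by_thirds(x, r):
    """Average residual among the lowest, middle, and highest thirds of x."""
    pairs = sorted(zip(x, r))
    k = len(pairs) // 3
    groups = [pairs[:k], pairs[k:2 * k], pairs[2 * k:]]
    return [sum(ri for _, ri in g) / len(g) for g in groups]

# Synthetic stand-in data (an assumption, NOT the chapter's cars):
# weights from 2 to 6 thousand pounds, mileage from a bending curve plus noise.
noise = random.Random(1)
weights = [2 + 4 * i / 199 for i in range(200)]
mpg = [100 / (1.11 + 1.21 * w) + noise.gauss(0, 0.5) for w in weights]

res = residuals(weights, mpg)
print("actual weights:   ", mean_by_thirds(weights, res))
print("scrambled weights:", mean_by_thirds(scrambled(weights, seed=7), res))
```

With the actual weights, the group averages follow the positive, negative, positive pattern described above; after scrambling, the residuals are the same numbers but no longer line up with weight in any systematic way.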
Reciprocal Transformations

Transformations expand the scope of regression modeling and allow us to describe patterns that curve. A transformation is a function that is applied to a column of data. By applying a transformation to either x or y (or both), we can build nonlinear equations that describe the relationship better than a linear equation. We can apply transformations to the explanatory variable or to the response.

We concentrate on two transformations that are most useful in business applications: reciprocals and logarithms. The reciprocal transformation converts the data di in the ith row of the data table into its reciprocal 1/di. Similarly, the log transformation converts di into log di.

d        1/d       log d
d1       1/d1      log d1
d2       1/d2      log d2
d3       1/d3      log d3
d4       1/d4      log d4
…        …         …

Other transformations often produce an equation that is beyond interpretation. If you cannot interpret an equation, how are you going to use it? By connecting reciprocals and logarithms to the substance of problems, we can interpret nonlinear equations that use these transformations. Reciprocals work well when dealing with rates or ratios, and logs connect patterns to the underlying economics and percentages.

A reciprocal transformation captures the bending pattern between MPG and Weight. The reciprocal is interpretable because MPG is a ratio, miles per gallon. The reciprocal transformation measures gallons per mile. We will replace MPG by the number of gallons that it takes to go 100 miles. This transformation is 100 × 1/MPG. (Multiplying by 100 gives a more useful range of values.) Europeans measure fuel efficiency on a similar scale in liters per 100 kilometers.

How’d you know to try that? (1) Experience (2) Tried several others (3) Context. That’s how they measure mileage in Europe.

This scatterplot graphs fuel consumption in gallons per 100 miles versus weight.

[Scatterplot: gallons per 100 miles versus Weight (000 lbs), with the least squares line]
Figure 20-6. Transforming to a simpler pattern.

The scatterplot includes the least squares line. The association is positive because the response measures fuel consumption rather than fuel efficiency. Heavier cars consume more gas, so the slope is positive. The pattern looks linear but for a scattering of outliers that are less obvious in Figure 20-3. Any guess which cars have very high fuel consumption for their weights?
The equation for the least squares line in Figure 20-7 is

Estimated Gallons/100 miles = 1.11 + 1.21 Weight

In spite of these outliers, the relationship appears linear. By transforming miles per gallon into gallons per 100 miles, we’ve constructed an equation that appears more linear than the initial fit.

The previous outliers are fuel-efficient hybrids; these gas-guzzling outliers are at the other end of the scale. The outliers are even more apparent in the following residual plot.

[Residual plot: residuals versus Weight (000 lbs)]
Figure 20-7. Residuals from the regression using a reciprocal of y.

2003 Ferrari Enzo MPG 8 City / 12 Hwy

The outliers are exotic sports cars: a Ferrari, a Lamborghini, a Maserati, and an Aston Martin. The Ferrari Enzo has the largest positive residual; it needs 8 more gallons to go 100 miles than typical cars of its weight. It also packs 660 horsepower. No matter how light you make the car, an engine that generates that kind of power burns a lot of gas.

The use of a transformation also changes the interpretation of the slope and intercept. To interpret b0 and b1 in this equation, think again about two groups of cars. Cars in one group weigh 1,000 pounds more than those in the other. The slope in this equation means that, on average, the heavier cars use 1.21 more gallons to drive 100 miles. The intercept naively speaks of weightless cars, but in terms of gallons per 100 miles. The intercept remains a big extrapolation if we think of it as a prediction. It’s more interesting if we interpret it as the fuel burned regardless of the weight. These cars use about 1.1 gallons when driving 100 miles to run air conditioning and other conveniences.

Comparing Linear and Nonlinear Equations

Which equation is better? It is tempting to choose the fit with the larger r². The r² of the initial linear equation is higher, 57% to 41%. Before you conclude this is a meaningful comparison, notice that the equations use different responses. Explaining 57% of the variation in mileage is not comparable to explaining 41% of the variation in fuel consumption. That’s not a meaningful comparison. The same goes for comparisons of the residual standard deviations. The two standard deviations are on different scales. With MPG as the response, s_e = 2.9 miles per gallon. With gallons per 100 miles as the response, s_e = 1.04 gallons/100 miles.

Only compare r² between regression equations that use the same response.

To decide which equation is better, we’ll first compare them visually to see which is a better match to the data. Then, we’ll think about which equation makes more sense given what we know from the context.

Visual Comparisons

For a visual comparison, show the fit of both equations in one scatterplot. We’ll use the scatterplot of MPG on Weight since we’re more familiar with MPG. The line from Figure 20-3 is

Estimated MPG = 35.6 - 4.52 Weight

Each increase of Weight by 1 (1000 pounds) reduces estimated MPG by 4.52 miles per gallon. The equation using the reciprocal is

Estimated Gallons/100 Miles = 1.11 + 1.21 Weight

When drawn in the scatterplot of MPG on weight, the reciprocal equation produces a curve. To draw the curve, let’s work out a few fitted values. For a small car that weighs 2,000 pounds (Weight = 2), the estimated gallons per 100 miles is

Estimated Gallons/100 Miles = 1.11 + 1.21 × 2 = 3.53

The estimated mileage is (100 miles)/(3.53 gallons) = 28.3 miles per gallon. This table shows several more examples.

Table 20-1. Comparison of estimated MPG from two models.

Weight       Line                          Reciprocal                      Reciprocal             Change in MPG
(000 lbs)    Estimated MPG                 Estimated gal/(100 mi)          Estimated MPG          (from previous row)
2            35.6 - 4.52 × 2 = 26.56       1.11 + 1.21 × 2 = 3.53          100/3.53 = 28.3
3            22.04                         4.74                            100/4.74 = 21.1        7.2
4            17.52                         5.95                            100/5.95 = 16.8        4.3
5            13.00                         7.16                            100/7.16 = 14.0        2.8
6            8.48                          8.37                            100/8.37 = 11.9        2.1

The following scatterplot shows the linear fit and the bending fit from the reciprocal equation (green).

[Scatterplot: MPG versus Weight (000 lbs), with the fitted line and the curve from the reciprocal equation]
Figure 20-8. Comparing predictions from two models.

2004 Hummer H2 MPG ??
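The fitted values in Table 20-1 follow directly from the two equations quoted above, so they are easy to recompute. The short Python sketch below does that; the function names are hypothetical, chosen only for this illustration, and the coefficients are the rounded ones reported in the text.

```python
def linear_mpg(weight):
    """Estimated MPG from the linear equation (weight in thousands of pounds)."""
    return 35.6 - 4.52 * weight

def reciprocal_gallons(weight):
    """Estimated gallons per 100 miles from the reciprocal equation."""
    return 1.11 + 1.21 * weight

def reciprocal_mpg(weight):
    """Convert estimated gallons per 100 miles back to miles per gallon."""
    return 100 / reciprocal_gallons(weight)

previous = None
print("Weight  Line MPG  Gal/100mi  Recip MPG  Change")
for weight in range(2, 7):
    mpg = reciprocal_mpg(weight)
    change = "" if previous is None else f"{previous - mpg:.1f}"
    print(f"{weight:>6}  {linear_mpg(weight):>8.2f}  "
          f"{reciprocal_gallons(weight):>9.2f}  {mpg:>9.1f}  {change:>6}")
    previous = mpg
```

The line loses the same 4.52 MPG with every additional 1,000 pounds, whereas the changes computed from the reciprocal equation shrink as the weight grows.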
Let’s compare these fits by predicting the mileage of the Hummer H2. Because of its weight (6,400 lbs), CAFE standards exclude the Hummer and it lacks an official mileage. Car and Driver magazine estimates the Hummer to get about 10 MPG. Plugging into the linear equation, the estimated mileage for the H2 is

ŷ = 35.6 - 4.52 × 6.4 ≈ 6.7 MPG

Plugging its weight into the reciprocal equation, the estimated fuel consumption is

ŷ = 1.11 + 1.21 × 6.4 ≈ 8.85 gallons per 100 miles

The predicted 8.85 gallons per 100 miles converts to 100/8.85 = 11.3 miles per gallon. That’s closer to what Car and Driver reports, and more sensible than the 6.7 MPG predicted by the linear equation.

Substantive Comparison

Let’s finish our comparison of these fits by thinking about what each has to say about the association between weight and mileage. The linear equation fixes the slope at -4.52 MPG/1000 pounds. An increase of 1,000 pounds reduces estimated mileage by 4.52 MPG. The reciprocal equation treats changes in weight differently. The difference in estimated MPG between cars that weigh 2,000 pounds and cars that weigh 3,000 pounds is

100/(1.11 + 1.21 × 2) - 100/(1.11 + 1.21 × 3) = 28.3 - 21.1 = 7.2 miles per gallon

That’s larger than the effect on mileage implied by the linear model. For cars that weigh 3,000 and 4,000 pounds, the effect falls to

100/(1.11 + 1.21 × 3) - 100/(1.11 + 1.21 × 4) = 21.1 - 16.8 = 4.3 miles per gallon

That’s close to the slope in the linear equation. For larger cars, the effect falls further to

100/(1.11 + 1.21 × 4) - 100/(1.11 + 1.21 × 5) = 16.8 - 14.0 = 2.8 miles per gallon

Differences in weight matter less as cars get heavier. You can see this in Figure 20-8; the green curve gets flatter as the weight increases.

To see why this happens, write the nonlinear equation differently. The reciprocal equation is

Estimated gallons/100 miles = 1.11 + 1.21 × Weight

If we take the reciprocal of both sides, we get

Estimated 100 miles/gallon = 1/(1.11 + 1.21 × Weight)

This equation estimates the 100s of miles per gallon. To get miles per gallon, multiply by 100:

Estimated miles/gallon = 100/(1.11 + 1.21 × Weight)
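As a check on the arithmetic in this comparison, the following Python sketch evaluates both equations for the Hummer H2 and computes how much estimated mileage falls for each additional 1,000 pounds under the reciprocal equation. The function names are hypothetical and the coefficients are the rounded values from the text, so results can differ from the printed ones in the last digit.

```python
def linear_mpg(weight):
    """Estimated MPG from the linear equation (weight in thousands of pounds)."""
    return 35.6 - 4.52 * weight

def reciprocal_mpg(weight):
    """Estimated MPG implied by the reciprocal equation: 100/(1.11 + 1.21 Weight)."""
    return 100 / (1.11 + 1.21 * weight)

def mpg_drop_per_1000_lbs(weight):
    """Fall in estimated MPG when weight rises from `weight` to `weight + 1`."""
    return reciprocal_mpg(weight) - reciprocal_mpg(weight + 1)

hummer_weight = 6.4  # thousands of pounds, as given in the text
print(f"Hummer H2, linear equation:     {linear_mpg(hummer_weight):.1f} MPG")
print(f"Hummer H2, reciprocal equation: {reciprocal_mpg(hummer_weight):.1f} MPG")

for weight in (2, 3, 4):
    drop = mpg_drop_per_1000_lbs(weight)
    print(f"{weight},000 -> {weight + 1},000 lbs: "
          f"reciprocal drop {drop:.1f} MPG, linear drop 4.52 MPG")
```

Under the linear equation the drop is always the slope, 4.52 MPG; under the reciprocal equation it depends on where the comparison starts.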
The curve is steeper for small cars and flatter for large cars: differences in weight matter more for small cars. That seems to make more sense than a constant decrease. After all, mileage can only go down to zero.

Do these differences matter? They do if you are an automotive engineer charged with improving mileage. Suppose that engineers estimate the benefits of reducing weight by 200 pounds using the linear equation. They would conclude that reducing weight brings an average improvement of 0.2 × 4.52 ≈ 0.9 MPG. The reciprocal equation tells a different story. Shaving 200 pounds from a 3,000-pound car improves the mileage by about 25% more:

100/(1.11 + 1.21 × 2.8) - 100/(1.11 + 1.21 × 3) = 22.2 - 21.1 = 1.1 miles per gallon

or about one more mile per gallon. For a 5,000-pound SUV, however, the improvement is much less.

100/(1.11 + 1.21 × 4.8) - 100/(1.11 + 1.21 × 5) = 14.5 - 14.0 = 0.5 miles per gallon

The reciprocal shows the engineers where to focus their efforts to make the most difference. It might be easy to trim 200 pounds from a heavy SUV, but it’s not going to have much effect on mileage.

Any lurking variable? Before you take a model too seriously, ask yourself whether there are any lurking factors. The relationship is straight enough, but there may be a lurking variable.

Are You There?

A retail chain operates 65 franchise stores. Each store covers about 12,000 square feet. The winter heating costs are substantial, as shown in the following scatterplot. The response is the cost per month for heating, and the explanatory variable is the average local temperature during the month.

Estimated Cost = 2330 - 29.5 Avg Temperature

[Scatterplot: monthly heating cost versus average temperature, with the fitted line]
Figure 20-9. Monthly heating costs.

(a) Explain why the data do not satisfy the straight-enough condition. What plot would help you see the problem?¹
(b) Which change in temperature has the larger effect on heating costs: a drop from 35 to 25 degrees, or a drop from 60 to 50 degrees?²
(c) Sketch a bending pattern that captures the relationship between temperature and cost better than the fitted line.³

¹ The fitted line is too low at the left and right of the plot; costs are higher than estimated by the line when the temperature is low or relatively high. A residual plot would help.
² The drop from 35 to 25. The pattern seems steeper on the left than on the right.
³ A smooth curve that starts in the upper left corner, follows the data below the line and then flattens out at the right. The plot should resemble the green curve in Figure 20-8.

Logarithm Transformations

How much should a retailer charge for merchandise? At a high price, each sale brings a large profit, but economics suggests that fewer items sell than at a lower price. Low prices generate more volume, but each sale brings less profit.

The time series in the following figure track weekly sales of a brand of pet food (number of cans) and the average selling price (in dollars) over two years (n = 104).

[Timeplots: weekly sales volume (top shows price) over 104 weeks]
Figure 20-10. Timeplots of sales and price over two years.

The timeplot at the top shows that the price generally increased. A large sale happened near the end of the data. In week 101, the average selling price dropped from $1.32 in week 100 to $0.70. Shoppers noticed: sales shot up from 29,000 cans in week 100 to 151,000.

Timeplots are great for finding trends and identifying dates, but they’re not so useful for seeing relationships. For that, we need a scatterplot. But which variable is the response and which is the predictor? To decide, identify the variable that the retailer controls. Retailers set the price, not the volume. That means that price is the explanatory variable and the quantity is the response. (Econ textbooks sometimes draw demand curves the other way around.)

Think First. It’s useful to think about what you expect to see in a plot before you draw it. It’s particularly useful to anticipate if the relationship is linear.

Before we look at the plot, let’s think about what we expect to see. For a commodity like pet foods, we expect lower quantities to be sold at higher prices: negative dependence. Should the relationship be linear? Do you expect an increase of, say, 10¢ to have the same effect on sales when the price is $0.70 as when the price is $1.20? It does if the relationship is linear.

Scatterplots and Residual Plots

This scatterplot graphs the number of cans sold on the average selling price. The scatterplot includes the least squares line:

Estimated Sales Volume = 190,480 - 125,190 Price

[Scatterplot: sales volume versus average price, with the least squares line]
Figure 20-11. More sales at lower prices.

The pattern has the expected direction: decreasing. Higher prices are associated with smaller quantities sold. The two outliers at the left show the big increase in sales at low prices. These outliers are weeks 101 and 102, when stores drastically cut the price.

Let’s interpret the estimated slope and intercept. The slope indicates that the chain sells 125,190 more cans on average if it reduces the selling price by $1 per can. This interpretation requires a big extrapolation. Prices range from $0.70 to $1.30, a range of only $0.60. A better interpretation of the slope uses a smaller price difference and avoids suggesting causality. Suppose we compare weeks in which the average selling price differs by $0.10. The slope implies that, on average, the chain sells 12,519 more cans in the week with the lower price. The intercept, 190,480 cans, is also a huge extrapolation.

The White Space Rule suggests that we might be missing something in Figure 20-11. Let’s see what the residuals reveal.

[Residual plot: residuals versus average price]
Figure 20-12. The nonlinear pattern is more distinct in the residual plot.

The relationship between sales volume and price fails the straight-enough condition.
The bending pattern is evident in the residual plot. The residuals are positive at both low and high prices and negative in the middle. The fitted equation underpredicts sales at both low and high prices. In between, it overpredicts sales. The line gets the direction right, but only on average. Like a stopped clock, it accurately estimates sales at two prices but otherwise misses the target.

The next scatterplot shows the natural log (log to base e) of sales volume versus the natural log of price.

[Scatterplot: Log Sales Volume versus Log Avg Price, with the fitted line]
Figure 20-13. Quantity and price transformed to a log scale.

Estimated Log Sales Volume = 11.05 - 2.442 Log Price

This relationship appears straight. The data cluster on the right of the plot to accommodate the weeks with large price discounts. With the data packed so tightly around the line (r² = 0.955), the White Space Rule suggests that we look at the residuals.

[Residual plot: residuals versus Log Avg Price]
Residuals on the log scale show no remaining pattern.

These residuals look okay, particularly when compared to those from the linear equation (Figure 20-12). Even during periods of very low prices, the residuals remain near the fitted line. The log-log equation captures the curvature that is missed by the linear equation.

Comparing Equations

To compare these descriptions of the relationship between price and quantity, we will start with a visual comparison. Let’s compare the fit of these models in the plot of quantity versus price. The following table shows the details for estimating sales with the linear equation and the log-log equation. The log-log equation estimates the log of the sales volume. To convert these estimates to the quantity scale, we have to take the exponential of the fitted value using the function exp(x) = e^x.

Table 20-2. Comparing the models for sales versus price.

Average      Linear Equation                       Log-Log Equation                         Log-Log Equation
Price ($)    Estimated Sales                       Estimated Sales, Log scale               Estimated Sales
0.80         190480 - 125190 × 0.8 = 90,328        11.05 - 2.442 × log(0.8) = 11.594        exp(11.594) = 108,445
0.90         77,809                                11.307                                   exp(11.307) = 81,389
1.00         65,290                                11.050                                   exp(11.050) = 62,944
1.10         52,771                                10.817                                   exp(10.817) = 49,861
1.20         40,252                                10.605                                   exp(10.605) = 40,336

The following scatterplot shows the estimated sales from both equations.

[Scatterplot: sales volume versus average price, with the fitted line and the log-log curve]
Figure 20-15. Predicting sales volume with the linear equation and the log-log equation. The log-log equation (green) captures the bend that is missed by the fitted line (red).

The slope of the log-log equation changes. The curve gets flatter as the price increases. Changes in price have a larger effect on sales volume when the price is low than when the price is high. The slope of the linear equation implies that each 10¢ increase in price decreases sales volume by 12,519, one tenth of the slope. In contrast, the log-log equation implies the effect of changing the price depends on the price. Each 10¢ increase in Table 20-2 comes with a different decrease in sales. The increase from $0.80 to $0.90 comes with a drop of more than 27,000 in the estimated volume. The 10¢ increase from $1.10 to $1.20 comes with a drop of about 9,500 cans.

We can explain this effect nicely using percentages. A 10¢ increase in price is a larger percentage change at lower prices. At $0.80, a 10¢ increase is a 12.5% increase. At $1.20, that same 10¢ increase is an 8.3% increase in the price and has a smaller impact on sales. At low prices, customers are price-sensitive. Customers who are willing to buy at $1.10 are less sensitive to the price than those who buy at $0.80.

Most economists wouldn’t wait to see a residual plot when building a model that relates price and quantity. They would have begun with both variables on a log scale. There’s an important reason: logs capture how percentage changes relate price and quantity. Logs change how we think about variation. Variation on a log scale amounts to variation in percentage differences.

elasticity: Relates % change in x to % change in y; the slope in a log-log regression equation.

The slope in an equation that transforms both x and y using logs is known as the elasticity of y with respect to x. The elasticity describes how small percentage changes in x are associated with percentage changes in y (see Under the Hood: Logs and Percentage Change). In the pet food example, the slope in the log-log model b1 = -2.44 tells us that, on average, a 1% increase in price brings a 2.44% decrease in sales. That is, when we compare sales in weeks with prices that differ by 1%, the quantity sold during weeks with the lower price is, on average, 2.44% higher than during weeks at the higher price. The intercept in a log-log model is important for calculating ŷ, but not very interpretable substantively.

What should we tell the retailer? Most of the time, the retailer has been charging about $1 per can, with typical sales around 60,000 cans per week. If the retailer raises the price by 5% to $1.05, then we’d expect sales volume to fall by 5 × -2.44% = -12.2%. That’s about 0.122 × 60,000 = 7,320 fewer cans, but each of those that do sell bring in $0.05 more. At the much lower price offered in the discount weeks, the estimated sales are almost double, 108,445 at $0.80 (Table 20-2). At $0.80, however, each sale earns $0.20 profit per can; that only nets $21,689. Those big sales might have moved a lot of inventory, but they sacrificed profits.

We can find the best price by using the elasticity in a formula that includes the cost to the grocery. If the store spends $0.60 to stock and sell each can, then the price that maximizes profits is (see Under the Hood: Optimal Pricing)

Optimal price = cost × elasticity/(elasticity + 1) = 0.60 × (-2.442)/(-2.442 + 1) = $1.016

These cans usually sell for about $1. At $1.02, estimated sales are 60,551 and profits are $25,189.

Example 20.1 Optimal Pricing

state the question

How much should a convenience store charge for a half-gallon of orange juice? Economics tells us that if orange juice costs c dollars per carton and γ is the elasticity of sales with respect to price, then the optimal price to charge is c γ/(1+γ). For this store, each half-gallon of orange juice costs c = $1 to purchase and stock. All we need to find the best price is the elasticity. We can estimate the elasticity from a regression of the log of sales on the log of price.
do the analysis
Some people will buy orange juice at a high price. Others only buy the orange juice if the price is low enough. The chain collected data on sales of orange juice over a weekend at 50 locations. The stores are in similar neighborhoods and have comparable levels of business. Management made sure that the price varied among stores, with all selling orange juice for between $1.25 and $4.25 per half gallon. The orange juice costs the chain $1 per carton to stock and sell.

Describe the data and select an approach

Identify x and y. Link b0 and b1 to problem. Describe data. Check straight-enough condition.

The explanatory variable will be the price charged per half gallon, and the response will be the sales volume. Both are transformed using logs. The slope in this equation will estimate the elasticity of sales with respect to price. Notice that we have to make sure that these stores sell juice at different prices. If all of the prices are the same, we cannot fit a line or curve.

[Figure: scatterplots of log sales on log price and of sales on price.]

Straight enough. The relationship is clearly not linear on the original scale. The scatterplot of log sales volume on log of price appears linear.

The least squares regression of log sales on log price is

Estimated Log(Sales) = 4.81 - 1.75 Log(Price)

Simple enough. The residual plot shows no evident pattern.

The optimal pricing formula indicates that the best price is $1 × (-1.75)/(-0.75) = $2.33. At $3, selling 18 cartons earns 18($2) = $36 profit. At $2.35, sales grow to 28 cartons, earning 28 × $1.35 = $37.80 per store.

Summarize the results

The elasticity of sales with respect to price is -1.75. This means that each 1% increase in price, on average, reduces sales by about 1.75%. Sales would get higher if the chain were to lower the price. The chain would make higher profits by decreasing the price from the current $3 to near $2.35. Even though it makes less on each carton, the increased volume would generate more profit. The typical store sells 18 cartons at $3, giving profits of 18 × ($3 - $1) = $36 per store. At $2.35, we predict sales of about 28 cartons at a profit of 28 × $1.35 = $37.80 per store. This increase in profits is small, however, and may not be feasible if stocking more orange juice will lead to greater storage costs.

Use words or terminology that your audience understands. Logs probably are not a good choice. Elasticity might not be a good choice either, but it's easy to explain.
Summary

Regression equations use transformations of the original variables to describe patterns that curve. Nonlinear equations that use these transformations allow the slope to change, implying that the effect of the explanatory variable on the response depends on the value of the explanatory variable. Equations with reciprocals and logs are useful in many business applications and are readily interpretable. The slope in a regression model that uses logs of both the response and explanatory variable is the elasticity of the response with respect to the explanatory variable.

Best Practices

• Check that a line summarizes the relationship between the explanatory variable and response both visually and substantively. Lines are common and simple, but not always the best summary of dependence. A linear pattern implies that changes in the explanatory variable have the same effect on the response, regardless of the size of the predictor variable. Often, particularly in economic problems, the effect of the predictor depends on its size.
• Stick to models you can understand and interpret. Reciprocals are a natural transformation if your data is already a ratio. Logs are harder to motivate, but are common enough in economic modeling, such as in demand curves, to be worth considering.
• Interpret the slope carefully. Slopes in models that use logs or other transformations mean different things than slopes in a linear model. In particular, the slope in a log-log model is an elasticity, telling you how % changes in x are associated with % changes in y.
• Show your model in the original units. If you show most audiences a scatterplot after a transformation, such as logs, few will realize that it means that sales of this product are price sensitive. They don't notice logs. Most hasty readers will look at the plot on the left and think that price has a constant effect on sales. When shown on the original scales, it's immediate that the effect of price increases diminishes at higher prices.

[Figure: scatterplot of log sales on log price (left) and of sales on price in the original units (right).]

Pitfalls

• Thinking that regression only fits straight lines. It's common to think that regression equations only fit straight lines. That's because, once we use the right transformation, we do fit straight lines. To see the curvature, you have to return to the original scales.
• Missing the bend. You won't discover the curvature unless you look for it. Examine the residuals, and pay special attention to the most extreme residuals because they may have something to add to the story told by the linear model. It's often easier to see deviations from a straight line in the residual plot than in the scatterplot of the original data.
• Forgetting lurking variables. It's easy to get caught up in figuring out the right transformation and forget other factors. Unless we can rule out other factors, we don't know that changing the price of a product caused the quantity sold to change. Perhaps every time in the past when the retailer raised prices, so did the competition. Consumers always saw the same difference in prices. Unless that matching continues the next time the price changes, the response may be very different.
• Comparing r² in models with different responses. You can only compare r² between models with the same response. For instance, it makes no sense to compare the proportion of variation in the log of quantity to the proportion of variation in the quantity itself.
• Using too many transformations. Unless you can make sense of your model in terms that others appreciate, you're not going to find curves that useful. Equations like this one used by NASA to study the space shuttle (Chapter 1) don't always infuse audiences with confidence that you know what you're doing.

\[
p = \frac{0.0195\,(L/d)^{0.45}\,(d)\,\rho_F^{0.27}\,(V - V^*)^{2/3}}{S_T^{1/4}\,\rho_T^{1/6}}
\]

• Modeling data that you don't understand. One of the keys to finding curvature is understanding the context of your data. Plots of marginal revenue or costs are typically nonlinear. It's more natural, for example, to think of the effect of price on quantity in percentage terms. Another consequence of not understanding the data is forgetting that these equations describe association, not causation.
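The advice to examine residuals when looking for a bend can be illustrated with a minimal Python sketch. The data here are hypothetical and noise-free: 50 prices between $1.25 and $4.25 whose sales follow a log-log curve exactly. The helper `fit_line` is an illustrative least squares routine, not part of any package. A line fit on the original scale leaves residuals that are positive at both ends and negative in between, while the fit on the log scale leaves essentially nothing.

```python
import math


def fit_line(xs, ys):
    """Least squares intercept and slope for the line y = b0 + b1 x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical data: sales follow the curve exp(4.81) * price^(-1.75) exactly.
prices = [1.25 + 3.0 * i / 49 for i in range(50)]
sales = [math.exp(4.81) * p ** -1.75 for p in prices]

# Line fit on the original scale: the residuals show a systematic bend.
b0, b1 = fit_line(prices, sales)
residuals = [y - (b0 + b1 * x) for x, y in zip(prices, sales)]

# Line fit on the log-log scale: the residuals vanish (up to rounding error).
log_prices = [math.log(p) for p in prices]
log_sales = [math.log(s) for s in sales]
c0, c1 = fit_line(log_prices, log_sales)
log_residuals = [y - (c0 + c1 * x) for x, y in zip(log_prices, log_sales)]

# Signs of the residuals from the linear fit, ordered by price.
print("".join("+" if r > 0 else "-" for r in residuals))
print("largest log-scale residual:", max(abs(r) for r in log_residuals))
```

With real data the pattern is blurred by noise, which is why a residual plot is usually easier to read than the original scatterplot.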
Under the Hood: Logs and Percentage Changes

The slope in a linear equation indicates how the estimate of y changes when x grows by 1. Denote the estimate of y at x by \( \hat{y}(x) = b_0 + b_1 x \), and the estimate of y at x+1 by \( \hat{y}(x+1) = b_0 + b_1 (x+1) \). The slope is the difference between these estimates:

\[
\hat{y}(x+1) - \hat{y}(x) = \bigl(b_0 + b_1 (x+1)\bigr) - \bigl(b_0 + b_1 x\bigr) = b_1
\]

To interpret a log-log equation, think in terms of percentages. Instead of looking for what to expect when x grows by 1, think in terms of what happens when x grows by 1%. The predicted value for the log of y in the log-log equation at x is

\[
\log \hat{y}(x) = b_0 + b_1 \log x
\]

Rather than increase x by adding 1, multiply x by 1.01 to increase it by 1%. If x increases by 1%, the predicted value becomes

\[
\begin{aligned}
\log \hat{y}(1.01x) &= b_0 + b_1 \log(1.01x) \\
&= b_0 + b_1 \log x + b_1 \log 1.01 \\
&= \log \hat{y}(x) + b_1 \log 1.01 \\
&\approx \log \hat{y}(x) + 0.01\, b_1
\end{aligned}
\]

The last step uses a simple but useful approximation: log(1 + little bit) ≈ little bit. This approximation is accurate so long as the little bit is less than 0.10. In this example, the approximation means log(1.01) = log(1 + 0.01) ≈ 0.01. If we rearrange the previous equation and remember that log a - log b = log(a/b), then the estimated change that comes with a 1% increase in x is

\[
\begin{aligned}
0.01\, b_1 &\approx \log \hat{y}(1.01x) - \log \hat{y}(x) \\
&= \log \frac{\hat{y}(1.01x)}{\hat{y}(x)} \\
&= \log \frac{\hat{y}(x) + \bigl(\hat{y}(1.01x) - \hat{y}(x)\bigr)}{\hat{y}(x)} \\
&= \log \left( 1 + \frac{\hat{y}(1.01x) - \hat{y}(x)}{\hat{y}(x)} \right) \\
&\approx \frac{\hat{y}(1.01x) - \hat{y}(x)}{\hat{y}(x)}
\end{aligned}
\]

Change x by 1% and the fitted value changes by

\[
b_1 \approx 100\, \frac{\hat{y}(1.01x) - \hat{y}(x)}{\hat{y}(x)} = \text{Percentage change in } \hat{y}
\]

The slope \( b_1 \) in a log-log model tells you how percentage changes in x (up to about ±10%) translate into percentage changes in y.
Under the Hood: Optimal Pricing

Choosing the price that earns the most profit requires a bit of economic analysis. We don't want to maximize sales; it's profits that matter. In the so-called monopolist situation, economics describes how the quantity sold Q depends on the price p according to an equation like this one:

\[
Q(p) = m\, p^{\gamma}
\]

The symbol m stands for a constant that depends, for example, on whether we're measuring sales in cans or cases. The exponent γ is the elasticity of the quantity sold with respect to price. The theory constrains the exponent γ < -1. Suppose each item costs the seller c dollars. Then the profit at the selling price p is

\[
\text{Profit}(p) = Q(p)\,(p - c)
\]

Calculus shows that the price that maximizes the profit is

\[
p^* = \frac{c\,\gamma}{1 + \gamma}
\]

(Setting the derivative of \( m\,p^{\gamma}(p - c) \) with respect to p to zero gives \( (1 + \gamma)\,p = \gamma\,c \).) For example, if each item costs c = $1 and the elasticity γ = -2, then the optimal price is c γ/(1 + γ) = $1 × (-2)/(1 - 2) = $2. If customers are more price sensitive and γ = -3, the optimal price is $1.50. As the elasticity approaches -1, the optimal price soars (tempting governments to regulate prices).

About the Data

A former Wharton MBA student worked in the auto industry and provided the data in this chapter that describe characteristics of cars. The mileage data is the standard EPA estimate for urban driving. Highway mileage produces similar results. The pricing data also comes from an MBA student who worked with a retailer during her summer internship. We modified the data slightly to protect the source of the data.

Software Hints

There are two general approaches to fitting curves. In some cases, you can add the curve itself to a scatterplot of the actual data. (For example, see Figure 20-8.) Rather than transform, say, both X and Y to a log scale, you can add the curve that summarizes the fit of log Y on log X directly to the scatterplot of Y on X. However, most software instead forces you to transform the data. That is, you need to first build two new columns, one holding the logs of Y and the other holding logs of X. Then, you proceed to build a scatterplot of these two transformed columns. Fitting the line in the scatterplot of log Y on log X gives the equation for the curve in the original units.

Excel

To help you decide if a transformation is necessary, start with the scatterplot of Y on X. With the chart selected, the menu sequence

Chart > Add Trendline…

allows you to fit several curves to your data (as well as a line). Try several and decide which fit provides the best summary of the data. If you believe that the association between Y and X bends, use built-in functions to compute columns of transformed data (for example, make a column that is equal to the LOG or reciprocal of another). Then follow the methods in the prior chapter to fit a line to the transformed data.

Minitab

If the scatterplot of Y on X shows a bending pattern, use the calculator (menu sequence Calc > Calculator…) to construct columns of transformed data (i.e., build new variables such as 1/X or log Y). Then follow the methods in the prior chapter to fit a line to the transformed data.

JMP

The menu sequence

Analyze > Fit Y by X

opens the dialog used to construct a scatterplot. Fill in variables for the response and explanatory variable, then click OK. In the window that shows the scatterplot, click on the red triangle above the scatterplot (near the words "Bivariate Fit of …"). In the pop-up menu, choose the item Fit Special to fit curves to the shown data. Select a transformation for Y or X and JMP will estimate the chosen curve by least squares, add the fit to the scatterplot, and add a summary of the equation to the output below the plot. If you would like to see the associated linear fit to the transformed data, you'll need to use the JMP formula calculator to build new columns (new variables). For example, right-clicking in the header of an empty column in your spreadsheet allows you to define the column using a formula such as Log(X).
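The transform-then-fit workflow described for these packages can also be sketched in a few lines of Python. This is only an illustration under stated assumptions: the price and sales values below are made up for the sketch (they are not the chapter's data), `fit_line` is a hypothetical helper that computes the least squares line by hand, and natural logs are used throughout.

```python
import math


def fit_line(xs, ys):
    """Least squares intercept and slope for the line y = b0 + b1 x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


# Made-up illustrative data: price per half gallon and cartons sold.
price = [1.25, 1.50, 2.00, 2.50, 3.00, 3.50, 4.00, 4.25]
sales = [85, 60, 37, 25, 18, 14, 11, 10]

# Step 1: build two new columns holding the logs of X and Y.
log_price = [math.log(p) for p in price]
log_sales = [math.log(s) for s in sales]

# Step 2: fit a line to the transformed columns.
b0, b1 = fit_line(log_price, log_sales)
print(f"Estimated Log(Sales) = {b0:.2f} + ({b1:.2f}) Log(Price)")


# Step 3: express the fit as a curve in the original units.
def fitted_curve(p):
    """Predicted sales at price p: exp(b0) * p^b1."""
    return math.exp(b0) * p ** b1


for p in (2.00, 3.00, 4.00):
    print(f"price ${p:.2f}: predicted sales {fitted_curve(p):.1f}")
```

Step 3 is what lets you draw the curve on the scatterplot of Y on X rather than showing only the straight line on the log scales.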