Can we predict the future Coca Cola price based on the past data?
1. Raw Data: 1920 $0.05, 1980 $0.25, 1985 $0.65, 1990 $0.87, 2000 $1.25, 2014 $2.00
2. Input Data
[After entering the data, save the file under a name so that you can use the data file again later. (Click the “File” menu and use the “Save” option to choose the file name and the directory you want to put the file in.)]
◼ Moving columns
◼ Selecting a range
◼ Calculating “average”
(a) Select the range of data you want to chart (use the mouse).
(b) Click “Insert”. Then click “Recommended Charts”. Click OK (for the Scatter Chart).
[Chart: scatter plot of Price (vertical axis, 0 to 2.5) against Year (horizontal axis, 1900 to 2020)]
4. Analysis of Correlation
4. Calculating CORRELATION
(b) Click “Data Analysis” (Add-in) at the right end of the list of sub-menus.
(c) Box (of alternative data analysis methods) will pop up. Choose Correlation.
(e) Then, select the range of relevant data for both the dependent variable (Price) and the
independent variable (Year). You may use the mouse, or you may type in $A$1:$B$7 to designate
the relevant range.
(f) Click “Labels in first row” if you included the label in your selected data range.
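The correlation that the Data Analysis tool reports can be cross-checked outside Excel. The following is a minimal Python sketch, not part of the original worksheet. It assumes the six Year/Price pairs from the raw data (with the 2000 price read as $1.25), and the helper name pearson_r is illustrative only.

```python
import math

# Year/Price pairs from the raw data (2000 price assumed to be $1.25).
years = [1920, 1980, 1985, 1990, 2000, 2014]
prices = [0.05, 0.25, 0.65, 0.87, 1.25, 2.00]


def pearson_r(x, y):
    """Pearson correlation coefficient (the statistic Excel's CORREL returns)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(years, prices)
print(f"Correlation between Year and Price: {r:.3f}")
```

A value close to +1 or -1 indicates a strong linear relationship; a value near 0 indicates a weak one.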
(b) Select “Data” in the top menu bar again. Then, select the “Data Analysis” tool.
(c) The data analysis menu box will pop up. Scroll down to find the regression option. Choose
the regression option (OK).
(d) Select the range of your dependent variable (criterion, target, endogenous variable). This is
the variable you want to predict. In this case, it is “Price.”
(e) Select the range of your independent variable (explanatory, predictor, controlled,
exogenous variable). This is the variable you want to rely on for the prediction, i.e., the predictor. In
this case, it is “Year.”
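With a single predictor, the coefficients that the Regression tool estimates have a closed-form least-squares solution, so they can be checked by hand. The sketch below is an illustration in Python rather than the Excel procedure itself. The helper fit_line is hypothetical, and the data are the six pairs from the raw data with the 2000 price taken as $1.25.

```python
# Dependent variable: Price. Independent variable: Year.
years = [1920, 1980, 1985, 1990, 2000, 2014]
prices = [0.05, 0.25, 0.65, 0.87, 1.25, 2.00]


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    # The fitted line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(years, prices)
print(f"Price = {intercept:.3f} + {slope:.4f} * Year")
```

The slope is the estimated change in Price per one-year increase in Year, and the intercept is the fitted Price at Year 0, which has no practical meaning here.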
6. Regression Output
(a) We got the output of regression analysis. However, it looks a little messy.
- In the output, Lower 95% and Upper 95% appear twice (Column F&G and Column H&I).
- I will delete one pair, i.e., columns H and I. Select columns H and I using your mouse.
Right-click, then select “Delete” to remove columns H and I.
- Widen column A to the size you want. All other columns will automatically be expanded to
the same size, too. => You get the following result.
=> The higher the value (closer to 1.0), the better the model fits the data.
=> We may say that the linear model fits the data reasonably well.
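Assuming the value discussed here is R Square from the regression output, it can be recomputed as one minus the ratio of the residual sum of squares to the total sum of squares. The following Python sketch shows that calculation on the six raw data points (2000 price taken as $1.25); r_squared is an illustrative helper, not an Excel function.

```python
years = [1920, 1980, 1985, 1990, 2000, 2014]
prices = [0.05, 0.25, 0.65, 0.87, 1.25, 2.00]


def r_squared(x, y):
    """Share of the variance in y explained by a straight-line fit on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual (unexplained) and total variation in y.
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    sst = sum((b - mean_y) ** 2 for b in y)
    return 1 - sse / sst


r2 = r_squared(years, prices)
print(f"R Square = {r2:.3f}")
```

For a regression with one predictor, this value equals the square of the correlation between Year and Price.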
(3) The p-value of the independent variable (Year) is .046 < .05 (t-value is 2.849).
=> The independent variable (Year) is significantly related to the dependent variable,
i.e., Price. How is Year related to Coke price?
=> As “Year” (IV) increases by one unit, i.e., by one year,
the Coke price increases by $0.018 on average (during 1920~2014).
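The t-value quoted above is the slope divided by its standard error. As a rough check, the sketch below recomputes both in Python from the six raw data points (2000 price taken as $1.25). The function name slope_t_statistic is hypothetical. The p-value would additionally require the t distribution with n - 2 = 4 degrees of freedom, which this sketch does not compute.

```python
import math

years = [1920, 1980, 1985, 1990, 2000, 2014]
prices = [0.05, 0.25, 0.65, 0.87, 1.25, 2.00]


def slope_t_statistic(x, y):
    """Return (slope, standard error of the slope, t statistic)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance uses n - 2 degrees of freedom (two estimated coefficients).
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    std_err = math.sqrt(sse / (n - 2) / sxx)
    return slope, std_err, slope / std_err


slope, std_err, t_value = slope_t_statistic(years, prices)
print(f"slope = {slope:.4f}, standard error = {std_err:.4f}, t = {t_value:.3f}")
```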
- Type “=”
- “Enter”
- I added Year 2020 to see what the Coke price will be in 2020.
- Copy C2 to C3~C8. (Select C2. Right click your mouse. Select the “copy” option. => Then
select C3 to C8 with your mouse. Then, hit “Enter”.)
- We see that the estimated prices do not match the real price very well.
- The prediction for the 2020 price is 1.715, even lower than the price in 2014.
(Maybe the 1920 data point does not reflect the current trend of the Coke price. At that time,
the Coke price had been fixed since 1886.)
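The estimated prices come from plugging each year into the fitted line, Price = intercept + slope × Year. The Python sketch below reproduces that step. It rests on an assumption about the worksheet formula: the 1.715 figure for 2020 is what results if the slope is rounded to 0.018 and the intercept to -34.645 (values not shown in this text). For comparison, it also refits the line at full precision; fit_line and predict are illustrative helpers.

```python
years = [1920, 1980, 1985, 1990, 2000, 2014]
prices = [0.05, 0.25, 0.65, 0.87, 1.25, 2.00]

# ASSUMED rounded coefficients; they are not quoted in the text.
INTERCEPT_ROUNDED = -34.645
SLOPE_ROUNDED = 0.018


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predict(year, intercept, slope):
    """Fitted price for a given year."""
    return intercept + slope * year


intercept, slope = fit_line(years, prices)

# Compare actual and fitted prices for the observed years.
for year, price in zip(years, prices):
    fitted = predict(year, INTERCEPT_ROUNDED, SLOPE_ROUNDED)
    print(f"{year}: actual {price:.2f}, fitted {fitted:.3f}")

pred_rounded_2020 = predict(2020, INTERCEPT_ROUNDED, SLOPE_ROUNDED)
pred_full_2020 = predict(2020, intercept, slope)
print(f"2020 (rounded coefficients): {pred_rounded_2020:.3f}")
print(f"2020 (full precision):       {pred_full_2020:.3f}")
```

Because Year values are large, a small rounding of the slope shifts the prediction noticeably, so it is safer to reference the coefficient cells directly than to retype rounded numbers.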
(b) Compute the correlation. What is the correlation coefficient? Do you get 0.995?
(f) What is the predicted price for 2020? The predicted price is $2.268.
=> If we use more data points, the prediction is likely to get better.
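The exercise steps are not shown in full here, so the following Python sketch assumes that the refit simply drops the 1920 observation and keeps the five points from 1980 onward. Under that assumption it recomputes the correlation and the 2020 prediction. Its full-precision prediction need not match the quoted figure to the last digit, since the worksheet may use rounded coefficients.

```python
import math

# ASSUMPTION: the 1920 data point is excluded; 2000 price taken as $1.25.
years = [1980, 1985, 1990, 2000, 2014]
prices = [0.25, 0.65, 0.87, 1.25, 2.00]


def fit_and_correlate(x, y):
    """Return (intercept, slope, Pearson r) for a straight-line fit."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope, sxy / math.sqrt(sxx * syy)


intercept, slope, r = fit_and_correlate(years, prices)
pred_2020 = intercept + slope * 2020
print(f"correlation = {r:.3f}")
print(f"predicted 2020 price = {pred_2020:.3f}")
```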
Q&A
(4) Interpretation
As usual, the coefficient b1 is the effect when Height is increased by one unit.
It is the increase in Weight when the Height becomes taller by one unit (one inch).
Therefore, b2 is the difference in weight on the average between Male and Female of the same height.
(3) What will be the weight of a woman who is 70 inches tall?
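A question like this is answered by substituting the values into the fitted equation. The Python sketch below assumes the model Weight = b0 + b1 × Height + b2 × Gender with Gender coded 1 for Male and 0 for Female; that coding is an assumption, since it is not stated here. The coefficient values are placeholders chosen only to make the example run. They are not the regression output, which must be substituted in their place.

```python
def predict_weight(height_in, is_male, b0, b1, b2):
    """Weight = b0 + b1 * Height + b2 * Gender (ASSUMED coding: Male = 1, Female = 0)."""
    gender = 1 if is_male else 0
    return b0 + b1 * height_in + b2 * gender


# PLACEHOLDER coefficients for illustration only; replace with the actual output.
B0, B1, B2 = -100.0, 3.5, 15.0

woman_70 = predict_weight(70, is_male=False, b0=B0, b1=B1, b2=B2)
man_70 = predict_weight(70, is_male=True, b0=B0, b1=B1, b2=B2)

print(f"Woman, 70 inches: {woman_70:.1f}")
print(f"Man, 70 inches:   {man_70:.1f}")
print(f"Difference at the same height (equals b2): {man_70 - woman_70:.1f}")
```

With this coding, the Gender term drops out for a woman, so her predicted weight is b0 + b1 × 70.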