
1 Excel Manual

Chapter 14
Multiple Regression

This chapter takes regression analysis to the next level. Here, we’ll analyze regressions involving
multiple independent variables using special Excel array functions. We’ll also demonstrate how
to create a residual plot to illustrate the appropriateness of a regression model.

Multiple Linear Regression


This example uses the data from Example 14-1 on page 618 in the online-only portion of the
text. Enter the data as shown in the figure below:

To obtain the multiple regression analysis in Excel, use the LINEST function. LINEST is an
array function and requires a special method of entry; otherwise, it returns only a single
value instead of the full table of results.

Create an area to the right of the data for this analysis, similar to the following illustration:

This next step must be done correctly or errors will result. Make sure to follow these steps:

1. Select F3:H7
2. Enter the following formula: =LINEST(C2:C14,A2:B14,TRUE,TRUE)
3. Hold <Shift> and <Control>, then press <Enter>
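For readers who want to see what the five-row LINEST table contains, the calculation can be sketched outside Excel. The following is a minimal Python sketch, not Excel's own implementation: it assumes two independent variables and an intercept, the function names (solve_normal_equations, linest_like) are hypothetical, and the data values are made up for illustration and are not the data from Example 14-1. It returns the coefficients in the order LINEST uses, with the last independent variable first and the intercept last.

```python
def solve_normal_equations(X, y):
    """Return (coefficients, inverse of X'X) for ordinary least squares."""
    k = len(X[0])
    # Build X'X and X'y from the design matrix.
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    # Gauss-Jordan elimination on [X'X | I] gives the inverse of X'X.
    aug = [xtx[i] + [1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    inv = [row[k:] for row in aug]
    coefs = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    return coefs, inv


def linest_like(x1, x2, y):
    """Regress y on x1 and x2 (with intercept) and mimic the LINEST layout."""
    n = len(y)
    X = [[1.0, a, b] for a, b in zip(x1, x2)]  # columns: intercept, x1, x2
    coefs, inv = solve_normal_equations(X, y)
    fitted = [sum(c * v for c, v in zip(coefs, row)) for row in X]
    y_mean = sum(y) / n
    ss_resid = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_reg = sum((fi - y_mean) ** 2 for fi in fitted)
    df = n - 3                      # n minus (2 slopes + 1 intercept)
    mse = ss_resid / df
    se = [(mse * inv[i][i]) ** 0.5 for i in range(3)]
    return {
        # LINEST order: coefficient of x2, coefficient of x1, intercept.
        "coefficients": [coefs[2], coefs[1], coefs[0]],
        "std_errors": [se[2], se[1], se[0]],
        "r_squared": ss_reg / (ss_reg + ss_resid),
        "se_y": mse ** 0.5,
        "f": (ss_reg / 2) / mse,    # 2 = number of independent variables
        "df": df,
        "ss_reg": ss_reg,
        "ss_resid": ss_resid,
    }


# Made-up illustrative data (NOT the Example 14-1 data).
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y = [5.1, 6.9, 12.2, 13.8, 19.1, 20.9]

result = linest_like(x1, x2, y)
for name, value in result.items():
    print(name, value)
```

The eight dictionary entries correspond to the five rows of the LINEST output: coefficients, standard errors, then the pairs (R squared, standard error of y), (F, degrees of freedom), and (regression sum of squares, residual sum of squares).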

Chapter 14/ Introductory Statistics - Mann



This will yield the values shown in the sheet below. Notice the #N/A errors in the last three
rows of column H; these may be ignored, because LINEST returns only two values in each of
those rows. Since the cells belong to an array formula, the errors cannot be deleted
individually, which is why the labels are in column K.

This analysis reports the F statistic rather than the t statistics. To obtain the t statistic
for each coefficient, divide the coefficient by its standard error. The formulas and results
are shown below:

The full array function is only shown in column F, since the G and H formulas will look exactly
the same.
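The ratio described above can be written out as a formula. This is a sketch in standard notation, where the symbols are assumptions not used in the manual itself: b_i is the i-th estimated coefficient, s_{b_i} is its standard error, n is the number of observations, and k is the number of independent variables (here k = 2).

```latex
t_i = \frac{b_i}{s_{b_i}}, \qquad \text{df} = n - k - 1
```

In the LINEST output, the coefficients are in the first row and their standard errors in the second, so each t statistic is the first-row value divided by the second-row value in the same column. The degrees of freedom are the second value in the fourth row of the output.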

If you click on cell F3 after the formula is entered, the formula appears in the formula bar
enclosed in curly braces. Excel uses these braces to mark array formulas.


Residual Plots to Check for the Adequacy of a Statistical Model


To create a residual plot in Excel, you have two options. If you are using the Data Analysis
ToolPak, select the residual plot option; this creates the scatter diagram automatically. If
you are using the spreadsheet to do your calculations, create two extra columns in your
spreadsheet. The first, column D, holds the predicted values, computed by entering the
regression equation as a formula. The second contains the residuals, which are the differences
between the observed and the predicted values. A scatterplot, as described in previous
chapters, may then be made of the residuals against the predicted values.
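The two extra columns amount to a short calculation. Here is a minimal Python sketch of it, under stated assumptions: the function name predict_and_residuals is hypothetical, and both the coefficients and the data are made-up illustrative values, not the fitted results or the data from Example 14-1.

```python
def predict_and_residuals(intercept, b1, b2, x1, x2, y):
    """Return (predicted values, residuals) for y = intercept + b1*x1 + b2*x2."""
    # First extra column: the regression equation applied to each row.
    predicted = [intercept + b1 * a + b2 * b for a, b in zip(x1, x2)]
    # Second extra column: observed minus predicted.
    residuals = [obs - pred for obs, pred in zip(y, predicted)]
    return predicted, residuals


# Hypothetical coefficients and data, chosen only to keep the arithmetic simple.
intercept, b1, b2 = 10.0, 2.0, -1.0
x1 = [1, 2, 3]
x2 = [0, 1, 2]
y = [12.5, 12.0, 15.0]

predicted, residuals = predict_and_residuals(intercept, b1, b2, x1, x2, y)
for pred, res in zip(predicted, residuals):
    print(pred, res)  # each pair is one point on the residual plot
```

Plotting each residual against its predicted value gives the residual plot; a pattern-free scatter around zero suggests the linear model is adequate.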

Using the regression analysis from Example 14-1 on page 618, which relates monthly premiums
for car insurance to driving experience and driving violations, we inserted the columns with
the formulas shown to calculate the predictions and residuals:

The scatter diagram of the predicted values vs. the residuals, built using Layout 4 in the
Quick Layouts of the Design tab, is pictured below:
