Uploaded by noelje

Attribution Non-Commercial (BY-NC)


Taylor

The purpose of this guide is to explore linear regression using Excel. This note consists of the following sections:
- Summarising and describing a multi-variable data set
- Correlation analysis
- Scatter plots
- Simple regression
- Multiple regression
- Excel's regression functions
- Exercises

First, check that Excel's statistical add-in, Data Analysis, is attached to Excel. From the set of tabs at the top of your screen, click on the Data tab. If Data Analysis is attached, it will be available as an option in the Analysis group towards the top right of your screen.

If Data Analysis is not one of the options, you need to attach it by working through the following steps:
(i) Click on the File tab, near the top left of the screen, and then select Options.
(ii) Click Add-Ins (on the left of the screen), and then in the Manage box (at the bottom of the screen), select Excel Add-ins. If this is not one of the options in the dialog box, you need to install the add-ins from your Microsoft Excel installation disc.
(iii) Click Go.
(iv) In the Add-Ins box (shown on the right here), select three add-ins: Analysis ToolPak, Analysis ToolPak VBA, and Solver. Then click OK. If you are prompted that the Analysis ToolPak add-in is not currently installed on your computer, click Yes to install it.
(v) After you load these add-ins, the Solver and Data Analysis commands are available in the Analysis group on the Data tab.

1. SUMMARISING & DESCRIBING A MULTI-VARIABLE DATA SET

The Excel file ElectricityConsumption.xls contains monthly observations from January 2004 to July 2012 for the following variables:

ELEC    Residential electricity sales (kWh) per customer in a mid-Atlantic U.S. city
C66     Cooling degree hours at base temperature 66 degrees (a measure of summer heat)
C76     Cooling degree hours at base temperature 76 degrees (a measure of summer heat)
H55     Heating degree hours at base temperature 55 degrees (a measure of winter cold)
DINC    Disposable income per household ($)
AIRC    Proportion of households with air conditioning

The ultimate aim is to build a forecasting model for residential electricity consumption.

      A        B        C      D      E       F       G
 1    MONTH    ELEC     C66    C76    H55     DINC    AIRC
 2    Jan-04   681.7    20     0      10148   34825   0.698
 3    Feb-04   620.3    0      0      12504   34934   0.701
 4    Mar-04   590.8    20     0      9300    35050   0.705
 5    Apr-04   538.0    14     0      5333    35172   0.708
 6    May-04   513.4    559    3      2846    35302   0.712
 7    Jun-04   575.5    1601   83     282     35438   0.716
 8    Jul-04   1019.3   5348   833    1      35583   0.72
 9    Aug-04   1203.9   7416   1547   0       35734   0.724
10    Sep-04   1176.7   6887   1287   0       35892   0.728
11    Oct-04   723.0    2975   398    155     36056   0.731
12    Nov-04   519.0    427    5      1812    36222   0.735
13    Dec-04   604.9    9      0      5779    36391   0.739

Use the Analysis ToolPak Descriptive Statistics tool to get summary statistics (in one sequence of operations) for all 6 variables:
- From the main Excel menu, click on the Data tab.
- From the Analysis group, select Data Analysis.
- In the resulting dialog box, select Descriptive Statistics.

In the Descriptive Statistics dialog box, specify:
- Input Range as the range containing values and variable names: B1:G104
- Click the Labels in First Row checkbox
- Output options as New Worksheet Ply with the name Summary
- Click the Summary Statistics checkbox.
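As a cross-check on the Descriptive Statistics tool, the same kind of summary can be sketched outside Excel. The Python snippet below is an illustration only and not part of the Excel workflow: it uses just the 12 rows for 2004 printed in this guide (so the figures will differ from those for all 103 observations), it covers three of the six variables to stay short, and the function name `describe` is made up for this example.

```python
import math
import statistics

# Assumption: only the 12 monthly rows for 2004 printed in this guide are used,
# so these numbers will not match Excel's output for the full data set.
data = {
    "ELEC": [681.7, 620.3, 590.8, 538.0, 513.4, 575.5,
             1019.3, 1203.9, 1176.7, 723.0, 519.0, 604.9],
    "C76": [0, 0, 0, 0, 3, 83, 833, 1547, 1287, 398, 5, 0],
    "H55": [10148, 12504, 9300, 5333, 2846, 282, 1, 0, 0, 155, 1812, 5779],
}


def describe(values):
    """Return a subset of the measures reported by Descriptive Statistics."""
    n = len(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    return {
        "Mean": statistics.mean(values),
        "Standard Error": sd / math.sqrt(n),  # standard error of the mean
        "Median": statistics.median(values),
        "Standard Deviation": sd,
        "Sample Variance": statistics.variance(values),
        "Range": max(values) - min(values),
        "Minimum": min(values),
        "Maximum": max(values),
        "Sum": sum(values),
        "Count": n,
    }


summary = {name: describe(values) for name, values in data.items()}

for name, stats in summary.items():
    print(name)
    for label, value in stats.items():
        print(f"  {label:<20}{value:>14.3f}")
```

The labels follow those in Excel's summary table; measures such as Mode, Kurtosis and Skewness are left out of this sketch.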


2. CORRELATION ANALYSIS

- Return to the Data worksheet.
- From the main Excel menu, click on the Data tab.
- From the Analysis group, select Data Analysis.
- In the resulting dialog box, select Correlation.

In the Correlation dialog box, specify:
- Input Range: as B1:G104 (do not include the MONTH column)
- Grouped By: as Columns, so that Excel knows that each column is a variable
- The Labels in First Row checkbox should be ticked
- Output options: as New Worksheet Ply with the name Correlations
- Click OK.

The correlation matrix below should result. Correlation coefficients for pairs of variables indicate the strength and direction of the linear association between them, e.g. ELEC and C76 have a correlation of 0.94, so as C76 rises, ELEC tends to rise.

        ELEC    C66     C76     H55     DINC    AIRC
ELEC    1.00    0.92    0.94   -0.36    0.14    0.14
C66     0.92    1.00    0.95   -0.65    0.02    0.02
C76     0.94    0.95    1.00   -0.52    0.01    0.01
H55    -0.36   -0.65   -0.52    1.00   -0.04   -0.05
DINC    0.14    0.02    0.01   -0.04    1.00    0.94
AIRC    0.14    0.02    0.01   -0.05    0.94    1.00

You should get the same value for any pair of variables using the Excel function =CORREL, for example =CORREL(B2:B104, D2:D104) for ELEC and C76. Note any variables strongly correlated with ELEC, and any strong inter-correlations between the potential explanatory variables C66, C76, H55, DINC and AIRC.
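To make explicit what a correlation coefficient measures, here is a minimal Python sketch of the Pearson correlation, which is the quantity CORREL reports. It is an illustration under stated assumptions: only the 12 rows for 2004 printed in this guide are used, so the values will differ from the 103-observation matrix above, and the helper name `pearson` is hypothetical.

```python
import math

# Assumption: only the 12 monthly rows for 2004 printed in this guide are used,
# so these correlations will not equal the full-sample values in the matrix.
elec = [681.7, 620.3, 590.8, 538.0, 513.4, 575.5,
        1019.3, 1203.9, 1176.7, 723.0, 519.0, 604.9]
c76 = [0, 0, 0, 0, 3, 83, 833, 1547, 1287, 398, 5, 0]
h55 = [10148, 12504, 9300, 5333, 2846, 282, 1, 0, 0, 155, 1812, 5779]


def pearson(x, y):
    """Pearson correlation: co-variation of x and y scaled by their spreads."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


variables = {"ELEC": elec, "C76": c76, "H55": h55}

# Print a small correlation matrix in the same layout as Excel's output.
print("      " + "".join(f"{name:>8}" for name in variables))
for row_name, row_values in variables.items():
    cells = "".join(f"{pearson(row_values, col):>8.2f}" for col in variables.values())
    print(f"{row_name:<6}{cells}")
```

The sign of each coefficient gives the direction of the association and its magnitude the strength, which is how the matrix above is read.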

3. SCATTER PLOTS

Scatter plots are of great help in identifying the strength, nature and direction of relationships between pairs of variables. In particular, they can highlight non-linear relationships, which will not necessarily be apparent from the correlation values. Since the observed correlation, 0.94, between ELEC and C76 suggests a relationship, let's examine their scatter plot.
- Return to the Data worksheet.
- Copy the ELEC column of data to column K.
- Copy C76 to column J.
- Highlight the new C76 and ELEC columns (columns J and K), as shown in the screen dump below.

From the main Excel menu, click on the Insert tab. From the Charts group, select Scatter with no lines as highlighted above.


Dealing with charts is somewhat cumbersome in Excel 2010. A simple way to insert axis titles and chart titles is to use Excel's text box option, which is also highlighted in the screen dump above. After a little work, the chart can look something like this. The scatter plot confirms a strong, positive linear relationship, with ELEC increasing as C76 increases.

[Figure: scatter plot of ELEC (vertical axis, 0 to 1600) against C76 (horizontal axis, 0 to 2000)]

4. SIMPLE REGRESSION

Regression analysis produces the estimated linear equation that best fits a set of data. By best fitting we mean the line (or linear model) for which there is least residual scatter.
- Return to the Data worksheet.
- From the main Excel menu, click on the Data tab.
- From the Analysis group, select Data Analysis.
- In the resulting dialog box, select Regression.

Complete the Regression dialog box by specifying:
- Input Y Range as B1:B104, i.e. ELEC as the dependent variable
- Input X Range as D1:D104, i.e. C76 as the independent variable
- Check the Labels box, as the first entry in each cell range is a label
- Specify Output options as New Worksheet Ply, with the name Regression1
- Under the heading Residuals, select Residuals, Residual Plots and Line Fit Plots
- Then click OK.
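For readers curious about what "least residual scatter" means in practice, the following Python sketch fits a straight line by ordinary least squares, the standard method behind simple regression. It is an assumption-laden illustration: it uses only the 12 rows for 2004 printed in this guide, so its intercept and slope will not equal those Excel reports for all 103 observations, and `fit_line` is a name invented for this example.

```python
# Assumption: only the 12 monthly rows for 2004 printed in this guide are used,
# so the fitted line differs from Excel's result for the full data set.
elec = [681.7, 620.3, 590.8, 538.0, 513.4, 575.5,
        1019.3, 1203.9, 1176.7, 723.0, 519.0, 604.9]
c76 = [0, 0, 0, 0, 3, 83, 833, 1547, 1287, 398, 5, 0]


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # the line passes through the means
    return intercept, slope


intercept, slope = fit_line(c76, elec)

# Residual = actual value minus the value predicted by the fitted line.
residuals = [y - (intercept + slope * x) for x, y in zip(c76, elec)]
sum_squared_residuals = sum(r ** 2 for r in residuals)

print(f"ELEC = {intercept:.2f} + {slope:.3f} * C76 (12 rows only)")
print(f"Sum of squared residuals: {sum_squared_residuals:.1f}")
```

Any other choice of intercept and slope would give a larger sum of squared residuals on the same rows, which is the sense in which the fitted line is "best".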

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.936601141
R Square             0.877221698
Adjusted R Square    0.876006071
Standard Error       84.01563552
Observations         103

ANOVA
              df     SS             MS             F              Significance F
Regression    1      5093652.918    5093652.918    721.6209201    8.45675E-48
Residual      101    712921.3281    7058.627011
Total         102    5806574.246

Intercept C76

The 1st part of the output contains summary statistics for the regression as a whole, including R Square and the residual standard deviation (called Standard Error). Ignore the 2nd part, which displays ANOVA (Analysis of Variance) calculations. The 3rd part of the output indicates that the best-fitting linear model has the equation:

ELEC = 632.20 + 0.538 * C76

The slope, 0.538, has a t-stat of 26.86 and a very small p-value, so C76 explains a statistically significant part of the variation in ELEC. The 4th part shows the predicted value and the residual for each observation.
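Although the ANOVA block can be skipped, it is where the headline statistics come from. The short Python check below, offered as a sketch rather than as part of the Excel procedure, recomputes R Square, Adjusted R Square, Standard Error and F from the sums of squares printed in the SUMMARY OUTPUT above. The variable names are chosen for this example only.

```python
import math

# Figures copied from the SUMMARY OUTPUT above (regression of ELEC on C76).
ss_regression = 5093652.918
ss_residual = 712921.3281
n_observations = 103
n_predictors = 1

ss_total = ss_regression + ss_residual
df_residual = n_observations - n_predictors - 1

# Share of the total variation in ELEC that the model accounts for.
r_square = ss_regression / ss_total

# R Square penalised for the number of explanatory variables.
adjusted_r_square = 1 - (1 - r_square) * (n_observations - 1) / df_residual

# Residual standard deviation, labelled "Standard Error" by Excel.
standard_error = math.sqrt(ss_residual / df_residual)

# Ratio of explained to unexplained variance per degree of freedom.
f_statistic = (ss_regression / n_predictors) / (ss_residual / df_residual)

print(f"R Square          {r_square:.9f}")
print(f"Adjusted R Square {adjusted_r_square:.9f}")
print(f"Standard Error    {standard_error:.6f}")
print(f"F                 {f_statistic:.4f}")
# With a single explanatory variable, F is the square of the slope's t-stat.
print(f"sqrt(F)           {math.sqrt(f_statistic):.2f}")
```

The square root of F agrees with the t-stat of 26.86 quoted for the slope, which is a useful consistency check when there is only one explanatory variable.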

4.2 REGRESSION - INTERPRETING EXCEL'S GRAPHICAL OUTPUT

The Regression tool puts one chart on top of another. Click on the top chart so that it becomes the active chart, and then move it down. The Line Fit Plot shows actual ELEC and predicted ELEC, plotted for different values of C76. The regression line (called Predicted ELEC in the legend) is shown as points rather than as a line. This can be changed by formatting the points.

[Figure: C76 Line Fit Plot, showing ELEC and Predicted ELEC]

The Residual Plot shows residuals plotted against the value of the C76 variable. Check that the residuals do not display an obvious pattern. Ideally, residuals should look random: no systematic pattern, much the same average size throughout, and no tendency to grow as X (here C76) increases. Residual plots are also useful for spotting outliers.
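To see how the predicted values and residuals behind these plots are formed, the sketch below applies the fitted equation ELEC = 632.20 + 0.538 * C76 to the 12 rows for 2004 printed in this guide. It is illustrative only: the coefficients are the rounded values quoted in the text, so the residuals will differ slightly from Excel's, and `predict_elec` is a hypothetical helper name.

```python
# Rounded coefficients quoted in this guide for the regression of ELEC on C76.
INTERCEPT = 632.20
SLOPE = 0.538

months = ["Jan-04", "Feb-04", "Mar-04", "Apr-04", "May-04", "Jun-04",
          "Jul-04", "Aug-04", "Sep-04", "Oct-04", "Nov-04", "Dec-04"]
elec = [681.7, 620.3, 590.8, 538.0, 513.4, 575.5,
        1019.3, 1203.9, 1176.7, 723.0, 519.0, 604.9]
c76 = [0, 0, 0, 0, 3, 83, 833, 1547, 1287, 398, 5, 0]


def predict_elec(c76_value):
    """Predicted ELEC for a given C76 value, using the rounded equation."""
    return INTERCEPT + SLOPE * c76_value


predicted = [predict_elec(x) for x in c76]

# Residual = actual ELEC minus predicted ELEC.
residuals = [actual - fitted for actual, fitted in zip(elec, predicted)]

print(f"{'MONTH':<8}{'C76':>6}{'ELEC':>9}{'Predicted':>11}{'Residual':>10}")
for month, x, actual, fitted, resid in zip(months, c76, elec, predicted, residuals):
    print(f"{month:<8}{x:>6}{actual:>9.1f}{fitted:>11.1f}{resid:>10.1f}")
```

Plotting the last column against C76 gives the kind of chart described above; a run of residuals with the same sign over a range of C76 values would be one example of a systematic pattern.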

[Figure: C76 Residual Plot, showing residuals against C76]

5. MULTIPLE REGRESSION

Can the ELEC predictions be improved if other possible explanatory variables are brought into the model? This section contains a brief description of the way Excel's regression can be extended from simple regression (ELEC on C76) to multiple regression (ELEC on two or more variables). The purpose is to find the best equation for predicting ELEC from one or more of the independent variables. Let's regress ELEC on the other five variables.
- Return to the Data worksheet.
- From the main Excel menu, click on the Data tab.
- From the Analysis group, select Data Analysis.
- In the resulting dialog box, select Regression.

In the Regression dialog box, specify:
- Input Y Range as B1:B104, i.e. ELEC as the dependent variable
- Input X Range as C1:G104, i.e. the five explanatory variables
- Check the Labels checkbox
- Specify Output options as New Worksheet Ply, with the name Regression2
- Under the heading Residuals, select Residuals, Residual Plots and Line Fit Plots
- Then click OK.
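Multiple regression extends the least-squares idea from one explanatory variable to several. The Python sketch below shows one standard way to do this, by solving the normal equations, and is not a description of how Excel computes its results. It rests on loud assumptions: only the 12 rows for 2004 printed in this guide are used, and only two of the five explanatory variables (C76 and H55), so the coefficients will not match the Regression2 output. The function names are invented for this example.

```python
# Assumption: only the 12 monthly rows for 2004 printed in this guide are used,
# with C76 and H55 as the only explanatory variables, so the coefficients
# will not match Excel's Regression2 output for the full data set.
elec = [681.7, 620.3, 590.8, 538.0, 513.4, 575.5,
        1019.3, 1203.9, 1176.7, 723.0, 519.0, 604.9]
c76 = [0, 0, 0, 0, 3, 83, 833, 1547, 1287, 398, 5, 0]
h55 = [10148, 12504, 9300, 5333, 2846, 282, 1, 0, 0, 155, 1812, 5779]


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_multiple_regression(y, predictors):
    """Return [intercept, b1, b2, ...] from the normal equations X'X b = X'y."""
    rows = [[1.0] + [float(p[i]) for p in predictors] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefficients, values):
    """Predicted y for one observation's explanatory values."""
    return coefficients[0] + sum(b * v for b, v in zip(coefficients[1:], values))


coefficients = fit_multiple_regression(elec, [c76, h55])
residuals = [y - predict(coefficients, [x1, x2])
             for y, x1, x2 in zip(elec, c76, h55)]

print("Intercept, C76, H55 coefficients:",
      [round(b, 4) for b in coefficients])
print("Sum of squared residuals:", round(sum(r ** 2 for r in residuals), 1))
```

Each coefficient is read as the expected change in ELEC for a one-unit change in that variable with the other explanatory variables held fixed, which is also how the coefficients in Excel's multiple regression output are interpreted.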
