15 views

Uploaded by shinde_jayesh2005

s

- 12
- lecture-6
- Replacement Analysis of Aging Equipments
- Linear Regression
- Development Of Mathematical Models To Forecasting The Monthly Precipitation
- Least-Squares Regression Line
- Spatial Dynamics in an Estuarine System Modeling Biophysical Components and Interactions to Advance Blue Crab Fishery Management.pdf
- Om- Sandeep Raina b54!27!06 OM-I Week1 s10-11 Conceptual Exercise
- Chapter7AConciseGuidetoMarketResearch_book.pdf
- 2014-005
- MultipleRegressionREview.pdf
- NEST
- How to Do a Regression in Excel 2007
- Session: Modeling, Simulation and Optimization
- Rsh Qam11 Excel and Excel QM ExplsM2010
- Ordinary Least Squares_
- EC220 DEF
- NONLIN
- Econometrics
- 6 - 3 - Assumptions (12-13).Srt

You are on page 1of 4

)ages 22*-22+.

ntroduction

Ordinary least-squares (OLS) regression is a generalized linear modelling technique that may be used to model a single response variable which has been recorded on at least an interval scale. The technique may be applied to single or multiple e planatory variables and also categorical e planatory variables that have been appropriately coded.

,ey -eatures

!t a very basic level" the relationship between a continuous response variable (#) and a continuous e planatory variable ($) may be represented using a line o% best-%it" where # is predicted" at least to some e tent" by $. &% this relationship is linear" it may be appropriately represented mathematically using the straight line equation '# ( ) * + '" as shown in ,igure (this line was computed using the least-squares procedure. see /yan" -001). The relationship between variables # and $ is described using the equation o% the line o% best %it with ) indicating the value o% # when $ is equal to zero (also 2nown as the intercept) and + indicating the slope o% the line (also 2nown as the regression coe%%icient). The regression coe%%icient + describes the change in # that is associated with a unit change in $. !s can be seen %rom ,igure -" + only provides an indication o% the average e pected change (the observed data are scattered around the line)" ma2ing it important to also interpret the con%idence intervals %or the estimate (the large sample 034 two-tailed appro imation o% the con%idence intervals can be calculated as + 5 -.06 s.e. +). &n addition to the model parameters and con%idence intervals %or +" it is use%ul to also have an indication o% how well the model %its the data. 7odel %it can be determined by comparing the observed scores o% # (the values o% # %rom the sample o% data) with the e pected values o% # (the values o% # predicted by the regression equation). The di%%erence between these two values (the deviation" or residual as it is also called) provides an indication o% how well the model predicts each data point. !dding up the deviances %or all the data points a%ter they have been squared (this basically removes negative deviations) provides a simple measure o% the degree to which the data deviates %rom the model overall. The sum o% all the squared residuals is 2nown as the residual sum o% squares (/SS) and provides a measure o% model-%it %or an OLS regression model. ! poorly %itting model will deviate mar2edly %rom the data and will consequently have a relatively large /SS" whereas a good-%itting model will not deviate mar2edly %rom the data and will consequently have a relatively small /SS (a per%ectly %itting model will have an /SS equal to zero" as there will be no deviation between observed and e pected values o% #). &t is important to understand how the /SS statistic (or the deviance as it is also 2nown. see !gresti"-006" pages 96-97) operates as it is used to determine the signi%icance o% individual and groups o% variables in a regression model. ! graphical illustration o% the residuals %or a simple regression model is provided in ,igure 8. 9etailed e amples o% calculating deviances %rom residuals %or null and simple regression models can be %ound in :utcheson and 7outinho" 8;;<.

The deviance is an important statistic as it enables the contribution made by e planatory variables to the prediction o% the response variable to be determined. &% by adding a variable to the model" the deviance is greatly reduced" the added variable can be said to have had a large e%%ect on the prediction o% # %or that model. &%" on the other hand" the deviance is not greatly reduced" the added variable can be said to have had a small e%%ect on the prediction o% # %or that model. The change in the deviance that results %rom the e planatory variable being added to the model is used to determine the signi%icance o% that variable's e%%ect on the prediction o% # in that model. To assess the e%%ect that a single e planatory variable has on the prediction o% #" one simply compares the deviance statistics be%ore and a%ter the variable has been added to the model. ,or a simple OLS regression model" the e%%ect o% the e planatory variable can be assessed by comparing the /SS statistic %or the %ull regression model (# ( ) * + ) with that %or the null model (# ( )). The di%%erence in deviance between the nested models can then be tested %or signi%icance using an ,-test computed %rom the %ollowing equation.

F df

df

p+q

,df

p+q

/ df

p+q

where p represents the null model" # ( )" p+q represents the model # ( ) * + " and df are the degrees o% %reedom associated with the designated model. &t can be seen %rom this equation that the ,-statistic is simply based on the di%%erence in the deviances between the two models as a %raction o% the deviance o% the %ull model" whilst ta2ing account o% the number o% parameters. &n addition to the model-%it statistics" the /-square statistic is also commonly quoted and provides a measure that indicates the percentage o% variation in the response variable that is =e plained' by the model. /-square" which is also 2nown as the coe%%icient o% multiple determination" is de%ined as

R8 =

and basically gives the percentage o% the deviance in the response variable that can be accounted %or by adding the e planatory variable into the model. !lthough /-square is widely used" it will always increase as variables are added to the model (the deviance can only go down when additional variables are added to a model). One solution to this problem is to calculate an ad>usted /-square statistic (/ 8a) which ta2es into account the number o% terms entered into the model and does not necessarily increase as more terms are added. !d>usted /-square can be derived using the %ollowing equation

R8 a

k - R8 = R n k 8

where n is the number o% cases used to construct the model and k is the number o% terms in the model (not including the constant).

! simple OLS regression model with a single e planatory variable can be illustrated using the e ample o% predicting ice cream sales given outdoor temperature (?oteswara" -01;). The model %or this relationship

(calculated using so%tware) is &ce cream consumption ( ;.8;1 * ;.;;@ temperature. The parameter %or ) (;.8;1) indicates the predicted consumption when temperature is equal to zero. &t should be noted that although the parameter ) is required to ma2e predictions o% ice cream consumption at any given temperature" the prediction o% consumption at a temperature o% zero might be o% limited use%ulness" particularly when the observed data does not include a temperature o% zero in it's range (predictions should only be made within the limits o% the sampled values). The parameter + indicates that %or each unit increase in temperature" ice cream consumption increases by ;.;;@ units. The signi%icance o% the relationship between temperature and ice cream consumption can be estimated by comparing the deviance statistics %or the two nested models in the table below. one that includes temperature and one that does not. This di%%erence in deviance can be assessed %or signi%icance using the ,-statistic.

!ode0 consumption ( a consumption ( ) * + temperature de'iance (RSS) ;.-833 ;.;3;; d% 80 8< change in de'iance ;.;133 --statistic )-'a0ue

A8.8<

B.;;;-

On the basis o% this analysis" outdoor temperature would appear to be signi%icantly related to ice cream consumption with each unit increase in temperature being associated with an increase o% ;.;;@ units in ice cream consumption. Csing these statistics it is a simple matter to also compute the /-square statistic %or this model" which is ;.;133D;.-833" or ;.6;. Temperature Ee plainsF 6;4 o% the deviance in ice cream consumption (i.e." when temperature is added to the model" the deviance in the # variable is reduced by 6;4).

The OLS regression model can be e tended to include multiple e planatory variables by simply adding additional variables to the equation. The %orm o% the model is the same as above with a single response variable (#)" but this time # is predicted by multiple e planatory variables ($ - to $@). # ( ) * +-$- * +8$8 * +@$@ The interpretation o% the parameters () and +) %rom the above model is basically the same as %or the simple regression model above" but the relationship cannot now be graphed on a single scatter plot. ) indicates the value o% # when all vales o% the e planatory variables are zero. Gach + parameter indicates the average change in # that is associated with a unit change in $" whilst controlling %or the other e planatory variables in the model. 7odel-%it can be assessed through comparing deviance measures o% nested models. ,or e ample" the e%%ect o% variable $@ on # in the model above can be calculated by comparing the nested models # ( ) * +-$- * +8$8 * +@$@ # ( ) * +-$- * +8$8 The change in deviance between these models indicates the e%%ect that $ @ has on the prediction o% # when the e%%ects o% $- and $8 have been accounted %or (it is" there%ore" the unique e%%ect that $ @ has on # a%ter ta2ing into account $- and $8). The overall e%%ect o% all three e planatory variables on # can be assessed by comparing the models # ( ) * +-$- * +8$8 * +@$@ # ( ). The signi%icance o% the change in the deviance scores can be assessed through the calculation o% the ,statistic using the equation provided above (these are" however" provided as a matter o% course by most so%tware pac2ages). !s with the simple OLS regression" it is a simple matter to compute the /-square statistics.

! multiple OLS regression model with three e planatory variables can be illustrated using the e ample %rom the simple regression model given above. &n this e ample" the price o% the ice cream and the average income o% the neighbourhood are also entered into the model. This model is calculated as &ce cream consumption ( ;.-01 H -.;AA price * ;.;@@ income * ;.;;@ temperature. The parameter %or ) (;.-01) indicates the predicted consumption when all e planatory variables are equal to zero. The + parameters indicate the average change in consumption that is associated with each unit increase in the e planatory variable. ,or e ample" %or each unit increase in price" consumption goes down by -.;AA units. The signi%icance o% the relationship between each e planatory variable and ice cream consumption can be estimated by comparing the deviance statistics %or nested models. The table below shows the signi%icance o% each o% the e planatory variables (shown by the change in deviance when that variable is removed %rom the model) in a %orm typically used by so%tware (when only one parameter is assessed" the ,-statistic is equivalent to the t-statistic (, ( It) which is o%ten quoted in statistical output).

de'iance change coe%%icient /rice inco(e te(/erature residua0s ;.;;8 ;.;-;.;<8 ;.;@3 86 ,-"86( -.361 ,-"86( 1.01@ ,-"86( 6;.838 ;.888 ;.;;0 B;.;;;d% --'a0ue )-'a0ue

Jithin the range o% the data collected in this study" temperature and income appear to be signi%icantly related to ice cream consumption.

3onc0usion

OLS regression is one o% the ma>or techniques used to analyse data and %orms the basis o% many other techniques (%or e ample !KOL! and the Meneralised linear models" see /uther%ord" 8;;-). The use%ulness o% the technique can be greatly e tended with the use o% dummy variable coding to include grouped e planatory variables (see :utcheson and 7outinho" 8;;<" %or a discussion o% the analysis o% e perimental designs using regression) and data trans%ormation methods (see" %or e ample" ,o " 8;;8). OLS regression is particularly power%ul as it relatively easy to also chec2 the model asumption such as linearity" constant variance and the e%%ect o% outliers using simple graphical methods (see :utcheson and So%roniou" -000).

-urther Reading

!gresti" !. (-006). An Introduction to Categorical ata Anal!sis. Nohn Jiley and Sons" &nc. ,o " N. (8;;8). An R and S-"lus Co#panion to Applied Regression$ LondonO Sage Publications. :utcheson" M. 9. and 7outinho" L. (8;;<). Statistical 7odeling %or 7anagement. Sage Publications. :utcheson" M. 9. and So%roniou" K. (-000). %&e 'ulti(ariate Social Scientist$ LondonO Sage Publications. ?oteswara" /. ?. (-01;). Testing %or the &ndependence o% /egression 9isturbances. )cono#etrica" @<O"01--1. /uther%ord" !. (8;;-). &ntroducing !KOL! and !KQOL!O a ML7 approach. LondonO Sage Publications. /yan" T. P. (-001). 'odern Regression 'et&ods$ QhichesterO Nohn Jiley and Sons. Mraeme :utcheson 7anchester Cniversity

- 12Uploaded bymichsantos
- lecture-6Uploaded bynourhan yousef
- Replacement Analysis of Aging EquipmentsUploaded byvaibhav_kapoor
- Linear RegressionUploaded byGiancarlo Santos
- Development Of Mathematical Models To Forecasting The Monthly PrecipitationUploaded byAJER JOURNAL
- Least-Squares Regression LineUploaded byGabriel Sim
- Spatial Dynamics in an Estuarine System Modeling Biophysical Components and Interactions to Advance Blue Crab Fishery Management.pdfUploaded bychokbing
- Om- Sandeep Raina b54!27!06 OM-I Week1 s10-11 Conceptual ExerciseUploaded bysandeepraina
- Chapter7AConciseGuidetoMarketResearch_book.pdfUploaded byAbhishek Kansal
- 2014-005Uploaded byTBP_Think_Tank
- MultipleRegressionREview.pdfUploaded byAaron KW
- NESTUploaded byRaj Singh
- How to Do a Regression in Excel 2007Uploaded byMinh Thang
- Session: Modeling, Simulation and OptimizationUploaded bycygnumvignesh89
- Rsh Qam11 Excel and Excel QM ExplsM2010Uploaded byYusuf Hussein
- Ordinary Least Squares_Uploaded byPatrice Furaha
- EC220 DEFUploaded byAmber Roberts
- NONLINUploaded byPf Neiza
- EconometricsUploaded byLynda Mega Saputry
- 6 - 3 - Assumptions (12-13).SrtUploaded byJoe Turner
- A New Temperature-Based Model for Estimating Global Solar Radiation in Port-Harcourt, South-South Nigeria.Uploaded bytheijes
- Private InvestmentUploaded byhaiderhira
- ECO 740Uploaded byAhmad Asifa Nural Anuar
- Recapitalization. 1Uploaded byndukweg
- 07 Simple Linear Regression Part2Uploaded byRama Dulce
- ISI JRF Quantitative EconomicsUploaded byapi-26401608
- BeerUploaded byMC Badlon
- An Investigation of Factors Influencing Design Team Atrributes Ingreen BuildingUploaded byhusnasabry
- Employee Characteristics AnalysisUploaded byKimberly Denz
- Measures Earnings QualityUploaded byLiew Chee Kiong

- NS Question BankUploaded byshinde_jayesh2005
- 4.286SYBScITSyllabusUploaded byshinde_jayesh2005
- VMware_paravirtualizationUploaded byshinde_jayesh2005
- ISM WriteUpsUploaded byshinde_jayesh2005
- les5eppt05-160531024400Uploaded byshinde_jayesh2005
- 2-160731150038Uploaded byshinde_jayesh2005
- University of Mumbai Bachelor Be Mobile Computing Semester 7 Be Fourth Year 2011-2-983d38a536c64a5da4ce201d538b925dUploaded byshinde_jayesh2005
- ccI N D E X.docxUploaded byshinde_jayesh2005
- DS-CH8Uploaded byshinde_jayesh2005
- Ugc API Form 2016Uploaded byShri Ram
- Ft-testsUploaded byYasir Irfan
- Validation ControlsUploaded byshinde_jayesh2005
- Java Theory QuetsionUploaded byDharani Kumar
- Threat Model for BwappUploaded byshinde_jayesh2005
- Master PagesUploaded byshinde_jayesh2005
- Linear and Binary SearchUploaded byshinde_jayesh2005
- Chapter 8 SlidesUploaded byshinde_jayesh2005
- ASP.NET.pptxUploaded byshinde_jayesh2005
- Chapter 1 SlidesUploaded byshinde_jayesh2005
- Chapter 2 slides.pptxUploaded byshinde_jayesh2005
- Chapter 15 SlidesUploaded byshinde_jayesh2005
- Chapter 14 SlidesUploaded byshinde_jayesh2005
- Virtualization UNIT IUploaded byshinde_jayesh2005
- Supervisor Chart March-April 2017Uploaded byshinde_jayesh2005
- Projects.docUploaded byshinde_jayesh2005
- aip finalUploaded byshinde_jayesh2005
- lppUploaded byshinde_jayesh2005
- RIP on Packet Tracer ~ Easy LearningUploaded byshinde_jayesh2005
- Chapter 6 SlidesUploaded byshinde_jayesh2005
- Java Theory QuetsionUploaded byDharani Kumar

- 130Uploaded byRaviSingh
- Felix e. Browder, Nonlinear Accretive Operators in Banach SpacesUploaded byAnonymous va7umdWyh
- Research DesignUploaded byVrinda Kaushal
- 01 Roger Pope Sscs CeUploaded byOprisor Costin
- Industrial Safety & Hazard Management (1)Uploaded byMohsin Khan
- Simulador JKSimMet Rev2.0Uploaded byfabiola
- STAT 200 Final Exam (Spring 2016)Uploaded byheaton073bradford
- Anderson Darling TestUploaded byacgvianna
- WBI06_01_que_20160126Uploaded bySony Avio
- IdiomsUploaded bymychief
- Design of Box Culvert -Aashto Lrfd 2007 OokUploaded bycoreteam
- Assessment of Nakamura Methodology for Evaluating Soil Liquefaction PotentialUploaded byPedro Torres
- Best Practices in Adopting a Shared Services ModelUploaded byGaurav Gupta
- ValidationUploaded byAdenike Majekodunmi
- MBA Projects Marketing ScribdUploaded bytejaas
- hinely cl assignment7 7490Uploaded byapi-319825873
- Nursing Ethics NetworkUploaded byKarina Lingan
- Stats MidtermUploaded byNhi Thi Tam Tran
- Brachistochrone Problem - HistoryUploaded byNinaReščič
- IBM SPSS Amos 19 User's GuideUploaded byinciamos
- Research Proposal LEADERSHIPUploaded byAdaamShabbir
- ABBBBBUploaded bychatfieldlohr
- Sure ShUploaded byVikram Thakur
- CHAPTER IUploaded byShaira Joy Rejuso Manalo
- ice_class_gn_e-feb14.pdfUploaded byolivo farrera jr
- Full GuidelinesUploaded byTania Violeta Bustos Roldán
- ReportUploaded byAna Melgar
- Swot Analysis ABCUploaded bymohsin_20008896
- Bec Vantage 04SUploaded byAnonymous t0HdWfGbl
- 4634.pdfUploaded byMuthu Kumar