
REYEM AFFIAR

REGRESSION CASE QUANTITATIVE METHODS II TO

PROF. ARNAB BASU


ON OCTOBER 21, 2011

BY GROUP NO. 5
AKSHAY RAM (1111004)
ARUN PRABU (1111010)
BHARTI VISHAL (1111016)
DHANASHREE VINAYAK SHIRODKAR (1111022)
GHULE NILESH VISHNU (1111028)
AMOL DEVNATH KUMBHARE (1111034)
MUDAVATH SWETHA (1111040)
SUPREET KUMAR (1111046)
RAJA SIMON J (1111052)
SAGAR BEHERA (1111058)
SHREYA SETHI (1111065)
SWATI MURARKA (1111071)
AJUSAL SUGATHAN (1111077)

INDIAN INSTITUTE OF MANAGEMENT, BANGALORE

Table of Contents

1. Executive Summary
2. Understanding of the Problem
3. Model Description
   Model 1
      Prediction Interval vs. Confidence Interval
      Step-wise Regression: A Closer Look
      Test of Model: Analysis of Results
   Model 2
      Test of Model: Analysis of Results
   Other Models
4. Conclusions and Recommendations
5. Appendix
   1. Variables Entered/Removed
   2. Model Summary
   3. ANOVA
   4. Coefficients
   5. Residual Statistics

Executive Summary
Reyem Affiar has recently found the condominium described below in Mid-Cambridge, which he wants to purchase.

Street Address: 236 Ellery Street
Last Price: $169,000
Area & Area Code: M/9
Bed: 2
Bath: 1
Rooms: 5
Interior: 1040
Condo: $175
Tax: $1121
RC: 1 (restrictions on the monthly rent that the owner may charge)

Even though Affiar is financially capable of paying the asking price of $169,000, negotiation by the buyer's agent generally keeps the selling price below the last asking price. Given the above information, and based on the data that Reyem Affiar has on condominiums sold in Cambridge over the past five years, we need to help him decide on a fair offer price.

Solution Approach
An estimate for the selling price of the above condominium needs to be made. Hence selling price is the dependent variable Y of the regression model. First date, close date and the number of days between the two (Days) cannot be part of the independent variable set, since we do not yet have this information for the 236 Ellery Street condominium (the sale has not taken place). Further, the condominium of interest lies in area M (9), so one could analyze only the data on the 111 condominiums from the same area and ignore the rest. On the other hand, if we set up dummy variables for the areas/area codes, these can be incorporated into the regression model, and we then have a bigger sample of 456 data points from which to make a more accurate prediction for Affiar. This is explained in detail in the model description. Stepwise regression in SPSS has been adopted for variable selection. This method, being a combination of the forward selection and backward elimination techniques for variable selection, helps avoid the errors that multi-collinearity can introduce into a regression model.
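The dummy-variable setup described above can be sketched as follows. This is a minimal illustration only: the helper name `area_dummies` is hypothetical, and it assumes area codes run from 1 to 16 with Mid-Cambridge (code 9) as the baseline that gets no dummy of its own, which matches the A1 to A16 (without A9) variables used later in the report.

```python
# Hypothetical helper, not part of the original SPSS workflow.
# Assumption: area codes are 1..16 and code 9 (Mid-Cambridge) is the baseline.
AREA_CODES = range(1, 17)
BASELINE = 9


def area_dummies(area_code):
    """Return a dict such as {'A1': 0, ..., 'A16': 0} with no entry for the baseline."""
    if area_code not in AREA_CODES:
        raise ValueError(f"unknown area code: {area_code}")
    return {f"A{c}": int(c == area_code) for c in AREA_CODES if c != BASELINE}


ellery = area_dummies(9)   # Mid-Cambridge: every dummy is 0
other = area_dummies(5)    # area code 5: only A5 is 1

print(len(ellery), sum(ellery.values()), other["A5"])
```

With this encoding a Mid-Cambridge condominium contributes nothing through the area terms, so the area coefficients measure differences relative to Mid-Cambridge.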

Figure 11.45 from Pg 571

Understanding of the Problem


Selection of independent variables is the key to arriving at a good regression model. A first look at the given data shows that the possible independent variables affecting the selling price are first price, last price, number of days between first and last date, location (Area), number of bedrooms, number of bathrooms, number of rooms, interior space, condominium taxes, yearly property tax and rent control. We have assumed that the given asking price of $169,000 for the Ellery Street condominium is the last price, since the transaction could possibly happen on the next day (May 4, 1994). This means we don't have information on the first price for the Ellery Street condominium, hence we remove first price from our list of possible independent variables. As stated before in section 1.1, we cannot have the number of days between first and last date as an independent variable either, since the sale of the condominium has not happened and we don't know the first date on which it was put on sale.

Finally, we can intuitively see that there will be a positive correlation between interior space and the number of rooms, bathrooms and bedrooms. Since interior space can be representative of all three, it can act as a good proxy for them in our regression model and so avoid the issue of multi-collinearity. We will also show this through the output generated in the model description section. Further, one can expect last price and interior space to have positive coefficients, and condominium taxes, property taxes and RC to have negative coefficients. The effect of the dummy variables for the areas/area codes needs to be explored by running the regression model.

Model Description

Model 1


The model (Appendix) can be described as follows (Exhibit 1):

\[
\begin{aligned}
\text{Sale Price} ={}& 0.333\,\text{Last Price} + 35.947\,\text{Tax} + 44.967\,\text{Interior} + 105.108\,\text{Condo} + 10992.327\,\text{RC} \\
&+ 12290.704\,A_{2} + 29804.817\,A_{5} - 27984.595\,A_{12} - 12447.291\,A_{16} - 15967.736
\end{aligned}
\]

where A2, A5, A12 and A16 are the dummy variables associated with the areas Avon Hill, East Cambridge, Porter Square and West Cambridge respectively. Each takes the value 1 or 0 depending on whether the condominium whose price is being predicted lies in that area.

For the 236 Ellery Street apartment, we have

\[
\text{Sale Price} = 0.333(169000) + 35.947(1121) + 44.967(1040) + 105.108(175) + 10992.327(1) - 15967.736 = 156757.758
\]

The 95% prediction interval for the selling price of the 236 Ellery Street condominium is given by:

\[
\begin{aligned}
& 156757.758 \pm t_{0.025,\,(456-10)}\left(30268.70125^{2} + 9.162 \times 10^{8}\right)^{0.5} \\
={}& 156757.758 \pm 1.9653\left(30268.70125^{2} + 9.162 \times 10^{8}\right)^{0.5} \\
={}& 156757.758 \pm 84127.57 \\
={}& \{72630.188,\ 240885.328\}
\end{aligned}
\]

The standard error and MSE are taken from the regression output table (Appendix).

Now, a 95% confidence interval for the selling price (conditional mean) of the 236 Ellery Street condominium would be given by:

\[
\begin{aligned}
& 156757.758 \pm t_{0.025,\,(456-10)}\,(4021) \\
={}& 156757.758 \pm 1.9653\,(4021) \\
={}& 156757.758 \pm 7902.471 \\
={}& \{148855.29,\ 164660.23\}
\end{aligned}
\]

The standard error of the mean predicted value is taken from the Residual Statistics table (Appendix).

Exhibit 1: Regression Model Coefficients
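The arithmetic above can be reproduced with a short script. This is only a sketch for checking the numbers: the dictionary layout and variable names are illustrative, the coefficients and the t value 1.9653 are the ones quoted in the report, and the prediction-interval term is computed exactly as the report writes it (the square root of the squared standard error of the estimate plus the MSE).

```python
import math

# Coefficients of the final step-wise model (Model 1) as reported.
coef = {
    "LastPrice": 0.333, "Tax": 35.947, "Interior": 44.967, "Condo": 105.108,
    "RC": 10992.327, "A2": 12290.704, "A5": 29804.817,
    "A12": -27984.595, "A16": -12447.291,
}
intercept = -15967.736

# 236 Ellery Street lies in Mid-Cambridge, so all four area dummies are 0.
x = {"LastPrice": 169000, "Tax": 1121, "Interior": 1040, "Condo": 175,
     "RC": 1, "A2": 0, "A5": 0, "A12": 0, "A16": 0}

y_hat = intercept + sum(coef[name] * x[name] for name in coef)

t_crit = 1.9653       # t[0.025, 446] as used in the report
s = 30268.70125       # standard error of the estimate (Model Summary)
mse = 9.162e8         # residual mean square (ANOVA)
se_mean = 4021        # standard error of the mean predicted value

pi_half = t_crit * math.sqrt(s ** 2 + mse)   # term as written in the report
ci_half = t_crit * se_mean

print("point estimate:", round(y_hat, 3))
print("prediction interval:", round(y_hat - pi_half, 1), round(y_hat + pi_half, 1))
print("confidence interval:", round(y_hat - ci_half, 1), round(y_hat + ci_half, 1))
```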

Prediction interval Vs Confidence Interval


We have calculated the prediction interval for Sale Price and the confidence interval for E(Sale Price) for the Ellery Street condominium at the given values of the independent variables (Section 1). While the predicted value and the estimate of the mean value of Y (here, Sale Price) are equal, the prediction interval is wider than the confidence interval for E(Y) at the same confidence level: there is more uncertainty about an individual predicted value than about the average value of Y given the values of Xi. Based on the confidence interval, the recommendation for Affiar would be to not bid more than the upper limit of $164,660, since he can be confident at the 97.5% level (100% − 5%/2) that the mean final selling price of the condominium would be below this number. So $164,660 is the maximum that he should bid on the condominium. If he were to be more conservative in his bid, he can go by the prediction interval. Since the upper limit of the prediction interval, $240,885, is greater than the asking price of $169,000, his bid should be $169,000 in this case. The maximum he can afford to bid for the house at a 95% confidence level would be $240,885.
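For reference, the textbook forms of the two intervals make the difference in width explicit. The symbols below are not used elsewhere in the report and are assumptions of this sketch: X is the design matrix, x_0 the vector of predictor values for the new condominium, s the standard error of the estimate, k the number of predictors and n the sample size.

```latex
\[
\text{Confidence interval for } E(Y \mid x_0):\quad
\hat{y}_0 \pm t_{\alpha/2,\,n-k-1}\; s\,\sqrt{h_0}
\]
\[
\text{Prediction interval for } Y \mid x_0:\quad
\hat{y}_0 \pm t_{\alpha/2,\,n-k-1}\; s\,\sqrt{1 + h_0},
\qquad h_0 = x_0^{\top}\,(X^{\top}X)^{-1}\,x_0
\]
```

Here the quantity s√h_0 is the standard error of the mean predicted value. Because 1 + h_0 is always larger than h_0, the prediction interval is wider than the confidence interval at the same confidence level.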

Step-wise regression: A closer look


Given the possible set of 23 independent variables (Last Price, Bed, Bath, Rooms, Interior, Condo, Tax, RC, A1, A2, A3, A4, A5, A6, A7, A8, A10, A11, A12, A13, A14, A15, A16), the algorithm starts by finding the most significant single-variable regression model. Last Price, which has the highest F-value and hence a p-value < pin (note pin = 0.05), enters the regression model first. The other 22 variables left out of the model are then checked via a partial F-test, and the most significant of them, Tax, is added to the model. The original variable Last Price is then re-evaluated to see whether it still meets the preset significance standard of p-value < pout (note pout = 0.10). Since it meets this criterion, the variable is retained in the model. The remaining 21 variables outside the model are again checked via a partial F-test, and the most significant variable, now Interior, enters the model. The variables already in the model, namely Last Price and Tax, are then checked again for continued significance. The procedure continues until there are no variables outside the model that should be added and no variables inside the model that should be removed. For Model 1 this happens on the 9th iteration, as shown in the Appendix.

To illustrate how the issue of multi-collinearity is inherently taken care of in this step-wise regression technique, a regression analysis was run between the Rooms and Interior variables, and the two were found to be highly correlated (Appendix). The step-wise regression kept the more significant variable, Interior, in the final regression model and left out the less significant, highly correlated Rooms variable.
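The enter/re-check loop described above can be sketched in plain Python. This is a toy illustration under stated assumptions, not the SPSS procedure itself: the function names (`ols_rss`, `partial_f`, `stepwise`) are hypothetical, the data are made up, and fixed partial-F thresholds (3.84 to enter, 2.71 to remove, roughly the large-sample equivalents of p = 0.05 and p = 0.10 for a single variable) stand in for the p-value criteria so that no statistics library is needed.

```python
def ols_rss(columns, y):
    """Residual sum of squares for OLS of y on an intercept plus `columns`."""
    n = len(y)
    cols = [[1.0] * n] + [list(c) for c in columns]
    p = len(cols)
    # Augmented normal equations [X'X | X'y].
    a = [[sum(ci[r] * cj[r] for r in range(n)) for cj in cols]
         + [sum(ci[r] * y[r] for r in range(n))] for ci in cols]
    # Gauss-Jordan elimination with partial pivoting.
    for i in range(p):
        pivot = max(range(i, p), key=lambda r: abs(a[r][i]))
        a[i], a[pivot] = a[pivot], a[i]
        for r in range(p):
            if r != i:
                factor = a[r][i] / a[i][i]
                a[r] = [vr - factor * vi for vr, vi in zip(a[r], a[i])]
    beta = [a[i][p] / a[i][i] for i in range(p)]
    fitted = [sum(beta[j] * cols[j][r] for j in range(p)) for r in range(n)]
    return sum((yr - fr) ** 2 for yr, fr in zip(y, fitted))


def partial_f(rss_reduced, rss_full, n, k_full):
    """Partial F for the single variable that separates two nested models."""
    return (rss_reduced - rss_full) / (rss_full / (n - k_full - 1))


def stepwise(data, y, f_enter=3.84, f_remove=2.71):
    """Return the names of the selected variables, in order of entry."""
    n, selected = len(y), []
    while True:
        changed = False
        # Forward step: add the outside variable with the largest partial F.
        rss_now = ols_rss([data[v] for v in selected], y)
        best, best_f = None, f_enter
        for v in data:
            if v not in selected:
                rss_new = ols_rss([data[u] for u in selected + [v]], y)
                f = partial_f(rss_now, rss_new, n, len(selected) + 1)
                if f > best_f:
                    best, best_f = v, f
        if best is not None:
            selected.append(best)
            changed = True
        # Backward step: drop the inside variable with the smallest partial F.
        rss_now = ols_rss([data[v] for v in selected], y)
        worst, worst_f = None, f_remove
        for v in selected:
            rss_without = ols_rss([data[u] for u in selected if u != v], y)
            f = partial_f(rss_without, rss_now, n, len(selected))
            if f < worst_f:
                worst, worst_f = v, f
        if worst is not None:
            selected.remove(worst)
            changed = True
        if not changed:
            return selected


# Made-up toy data: y depends on x1 and x2 but not on x3.
h1 = [1, 1, 1, 1, -1, -1, -1, -1]
h2 = [1, 1, -1, -1, 1, 1, -1, -1]
h3 = [1, -1, 1, -1, 1, -1, 1, -1]
noise = [0.5 * a * b + 0.3 * a * c for a, b, c in zip(h1, h2, h3)]
data = {"x1": h1, "x2": h2, "x3": h3}
y = [10 + 5 * a + 2 * b + e for a, b, e in zip(h1, h2, noise)]

print(stepwise(data, y))
```

On this toy data the strongest variable enters first, the second enters next, and the unrelated third variable never clears the entry threshold, which mirrors the order-of-entry logic described for Last Price, Tax and Interior.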

Let us check whether the model's regression assumptions are satisfied through residual analysis. From the normality histogram for residuals shown in the figure below, the normality assumption appears to be satisfied, since the standardized residuals seem to be normally distributed. The normal P-P graph confirms the same. Homoscedasticity can be seen from the residual scatter plot, where the residuals are scattered around the mean 0 in a random fashion with no observable pattern or heteroscedasticity. Finally, the independence assumption between the independent variables is inherently taken care of in the step-wise regression technique, which checks for multi-collinearity after each stage (as shown in Figure 1) with Pin = 0.05 and Pout = 0.10. Hence the algorithm automatically removes from the model variables that are correlated with each other and keeps only the most significant independent variables. The individual residual plots of residual error vs. each independent variable are shown in the Appendix.

Test of Model: Analysis of Results


Significance of model: In the Appendix, the ANOVA table shows that the F-value for Model 1 is 395.860 with a significant p-value of 0.000. Since the p-value < 0.05, we reject the null hypothesis (β1 = ... = β9 = 0), and hence there is at least one βi that is significant. We then look at the coefficients table to check that the individual coefficients are significantly different from zero. As the coefficients table for Model 1 shows, the p-values for the coefficients are less than 0.05 (the alpha value). Hence we reject the null hypothesis for each βi (i.e. βi = 0), and thus the coefficients are significant. Finally, we look at the Adjusted R² (which accounts for the increase in R² caused by an increase in the number of independent variables) as a goodness-of-fit measure. A high Adjusted R² value of 0.886 in this case (Appendix) suggests that 88.6% of the variation in Sale Price is explained by the regression model.
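The overall F statistic and the Adjusted R² quoted here can be recomputed from the final row of the Model 1 ANOVA table in the Appendix. The snippet below is a small check with illustrative variable names; because the sums of squares are printed to only four significant figures, the results agree with the SPSS output only up to rounding.

```python
# Final (ninth) step of the Model 1 ANOVA table in the Appendix.
ss_reg, ss_res, ss_tot = 3.264e12, 4.086e11, 3.673e12
df_reg, df_res, df_tot = 9, 446, 455

ms_reg = ss_reg / df_reg            # regression mean square
ms_res = ss_res / df_res            # residual mean square (MSE)

f_stat = ms_reg / ms_res
r_squared = 1 - ss_res / ss_tot
adj_r_squared = 1 - (ss_res / df_res) / (ss_tot / df_tot)

print("F =", round(f_stat, 2))
print("R^2 =", round(r_squared, 4))
print("Adjusted R^2 =", round(adj_r_squared, 4))
```

The adjustment divides each sum of squares by its degrees of freedom, which is why adding a weak predictor can lower the Adjusted R² even though the plain R² never decreases.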

Model 2:

In Model 1, we accounted for the areas/area codes of the condominiums by starting the step-wise regression analysis with the 15 area dummy variables. One could very well argue that condominiums outside Mid-Cambridge should not be considered in the analysis at all. Hence step-wise regression was also run with only the 111 data points from Mid-Cambridge condominiums. The candidate independent variables were Last Price, Bed, Bath, Rooms, Interior, Condo, Tax and RC, and the step-wise regression was carried out with Pin = 0.05 and Pout = 0.10. As we can see from the Appendix, Last Price and RC were the only independent variables with a significant impact (based on the step-wise partial F-test) on Selling Price. The model can be summarized as below:

\[
\text{Selling Price} = 0.96\,\text{Last Price} + 1935.903\,\text{RC} - 2181.178
\]

For the Ellery Street condominium, we have:

\[
\text{Selling Price} = 0.96(169000) + 1935.903(1) - 2181.178 = 161994.725
\]

Similar to Model 1, the 95% prediction interval for the selling price of the 236 Ellery Street condominium is given by:

\[
\begin{aligned}
& 161994.725 \pm t_{0.025,\,(111-3)}\left(4422.945^{2} + 1.956 \times 10^{7}\right)^{0.5} \\
={}& 161994.725 \pm 1.98217\left(4422.945^{2} + 1.956 \times 10^{7}\right)^{0.5} \\
={}& 161994.725 \pm 12398.064 \\
={}& \{149596.661,\ 174392.789\}
\end{aligned}
\]

The standard error and MSE are taken from the regression output table (Appendix).

Now, a 95% confidence interval for the selling price (conditional mean) of the 236 Ellery Street condominium would be given by:

\[
\begin{aligned}
& 161994.725 \pm t_{0.025,\,(111-3)}\,(698.994) \\
={}& 161994.725 \pm 1.98217\,(698.994) \\
={}& 161994.725 \pm 1385.525 \\
={}& \{160609.2,\ 163380.25\}
\end{aligned}
\]

The standard error of the mean predicted value is taken from the Residual Statistics table (Appendix).
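The Model 2 figures can be checked the same way. The helper `t_interval` below is a hypothetical convenience function, and the inputs are the values quoted in the report; the prediction-interval term again follows the report's own expression.

```python
import math


def t_interval(center, t_crit, std_err):
    """Return (lower, upper) for center +/- t_crit * std_err."""
    half_width = t_crit * std_err
    return center - half_width, center + half_width


# Model 2 point estimate for 236 Ellery Street (Last Price 169000, RC = 1).
y_hat = 0.96 * 169000 + 1935.903 * 1 - 2181.178

t_crit = 1.98217      # t[0.025, 108] as used in the report
s = 4422.945          # standard error of the estimate
mse = 1.956e7         # residual mean square
se_mean = 698.994     # standard error of the mean predicted value

prediction = t_interval(y_hat, t_crit, math.sqrt(s ** 2 + mse))
confidence = t_interval(y_hat, t_crit, se_mean)

print("point estimate:", round(y_hat, 3))
print("prediction interval:", tuple(round(v, 1) for v in prediction))
print("confidence interval:", tuple(round(v, 1) for v in confidence))
```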

As explained for Model 1, there is more uncertainty about the predicted value than about the average value of Y given the values of Xi. Based on the confidence interval, the recommendation for Affiar would be to not bid more than the upper limit of $163,380, since he can be confident at the 97.5% level (100% − 5%/2) that the mean final selling price of the condominium would be below this number. So $163,380 is the maximum that he should bid on the condominium. If he were to be more conservative in his bid, he can go by the prediction interval. Since the upper limit of the prediction interval, $174,393, is greater than the asking price of $169,000, his bid should be $169,000 in this case. The maximum he can afford to bid for the house at a 95% confidence level would be $174,393.

Coefficients

| Model | Term | B (unstandardized) | Std. Error | Beta (standardized) | t | Sig. |
|---|---|---|---|---|---|---|
| 1 | (Constant) | -544.824 | 1357.461 | | -.401 | .689 |
| 1 | LastPrice | .958 | .008 | .996 | 123.128 | .000 |
| 2 | (Constant) | -2181.178 | 1541.383 | | -1.415 | .160 |
| 2 | LastPrice | .960 | .008 | .998 | 124.529 | .000 |
| 2 | RC | 1935.903 | 909.479 | .017 | 2.129 | .036 |

a. Dependent Variable: SalePrice
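Each t value in the coefficients table is the unstandardized coefficient divided by its standard error. The short check below uses the final-step rows for the constant and RC; the dictionary name is illustrative. LastPrice is left out because its B and Std. Error are printed to only three decimals, which is too coarse to reproduce the reported t of 124.529.

```python
# (B, Std. Error, reported t) for the final step of Model 2.
final_step = {
    "(Constant)": (-2181.178, 1541.383, -1.415),
    "RC": (1935.903, 909.479, 2.129),
}

t_values = {}
for term, (b, std_err, t_reported) in final_step.items():
    t_values[term] = b / std_err
    print(f"{term}: t = {t_values[term]:.3f} (reported {t_reported})")
```

A coefficient is judged significant at the 5% level when the absolute value of this ratio exceeds the critical t value (1.98217 for 108 degrees of freedom, as used in the report).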

Let us check whether the model's regression assumptions are satisfied through residual analysis. From the normality histogram for residuals shown in the figure below, the normality assumption appears to be satisfied, since the standardized residuals seem to be normally distributed. The normal P-P graph confirms the same. Homoscedasticity can be seen from the residual scatter plot, where the residuals are scattered around the mean 0 in a random fashion with no observable pattern or heteroscedasticity. Finally, the independence assumption between the independent variables is inherently taken care of in the step-wise regression technique, which checks for multi-collinearity after each stage (as shown in Figure 1) with Pin = 0.05 and Pout = 0.10. Hence the algorithm automatically removes from the model variables that are correlated with each other and keeps only the most significant independent variables. The individual residual plots of residual error vs. each independent variable are shown in the Appendix. The step-wise regression method works the same way as explained for Model 1. Here only 2 iterations were required to arrive at the final model, as shown in the Appendix.

Test of Model: Analysis of Results


Significance of model: In the Appendix, the ANOVA table shows that the F-value for Model 2 is 7828 with a significant p-value of 0.000. Since the p-value < 0.05, we reject the null hypothesis (β1 = β2 = 0), and hence there is at least one βi that is significant. We then look at the coefficients table to check that the individual coefficients are significantly different from zero. As the coefficients table for Model 2 shows, the p-values for the coefficients of Last Price and RC (0.000 and 0.036) are less than 0.05 (the alpha value). Hence we reject the null hypothesis for each βi (i.e. βi = 0), and thus the coefficients are significant. Finally, we look at the Adjusted R² (which accounts for the increase in R² caused by an increase in the number of independent variables) as a goodness-of-fit measure. A high Adjusted R² value of 0.993 in this case (Appendix) suggests that 99.3% of the variation in Sale Price is explained by the regression model.

Other Models:
In addition to the above two best-fit models, a number of other regression models with different combinations of input independent variables were tried. For instance, areas were grouped by location (with the help of the map provided) to form a smaller number of dummy variables (e.g., grouping Agassiz, Harvard Square and Radcliffe). Multiple such combinations were formed to see how area could best be fitted into the model. Rooms was also tried as a proxy for Interior (because of their high correlation, as seen in the Appendix). Each model was assessed on its R² values, the significance of its coefficients and its residual plots, and the best two models have been presented in the case solution. In each model, the given price for the Ellery Street condominium has been assumed to be the Last Price, as stated before.

Conclusions and recommendations


Two regression models were presented to fit the given data in order to predict the sale price for the 236 Ellery Street condominium. The summary of the offer price that Affiar should be making on the condominium based on the two models is shown in the table below:

| Model | Mean Selling Price ($) | Prediction Interval ($) | Confidence Interval ($) | Recommended bid price ($) | Max. conservative bid price ($) |
|---|---|---|---|---|---|
| Model 1 | 156757.758 | {72630.188, 240885.328} | {148855.29, 164660.23} | 164,660 | 240,885 |
| Model 2 | 161994.725 | {149596.661, 174392.789} | {160609.2, 163380.25} | 163,380 | 174,393 |

Comparing the Adjusted R² values of the two models, we see that Model 2 explains 99.3% of the variation in Sale Price against Model 1's 88.6%. Hence one might be tempted to use Model 2. On a closer look, however, Last Price and RC are the only independent variables used in Model 2. In this case there is not a large difference between the recommended prices for Affiar under Model 1 and Model 2, but in reality a buyer can't base his or her offer on the seller's stated Last Price alone. A number of other factors, such as interior space, tax, apartment maintenance fee and area, need to be considered. With the given data, Model 1 makes a comprehensive attempt to form the best possible regression fit by using the maximum number of data points. Hence the recommendation would be to go by Model 1. In the specific case of the Ellery Street condominium, however, the predicted selling prices from the two models do not differ much, so it is left to Affiar to make an initial offer of either $164,660 or $163,380.

Appendix

Model 1

Variables Entered/Removed

| Model | Variables Entered | Variables Removed |
|---|---|---|
| 1 | Last Price | . |
| 2 | Tax | . |
| 3 | Interior | . |
| 4 | Condo | . |
| 5 | A12 | . |
| 6 | A5 | . |
| 7 | RC | . |
| 8 | A16 | . |
| 9 | A2 | . |

Method (every step): Stepwise (Criteria: Probability-of-F-to-enter <= .050, Probability-of-F-to-remove >= .100).

a. Dependent Variable: Sale Price

Model Summary

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate | R Square Change | F Change | df1 | df2 | Sig. F Change |
|---|---|---|---|---|---|---|---|---|---|
| 1 | .872 (a) | .760 | .759 | 44066.10445 | .760 | 1437.412 | 1 | 454 | .000 |
| 2 | .925 (b) | .856 | .855 | 34200.57311 | .096 | 300.700 | 1 | 453 | .000 |
| 3 | .930 (c) | .866 | .865 | 33044.26474 | .010 | 33.258 | 1 | 452 | .000 |
| 4 | .935 (d) | .875 | .873 | 31966.94753 | .009 | 31.979 | 1 | 451 | .000 |
| 5 | .938 (e) | .880 | .879 | 31308.34591 | .005 | 20.174 | 1 | 450 | .000 |
| 6 | .940 (f) | .884 | .882 | 30851.68238 | .004 | 14.420 | 1 | 449 | .000 |
| 7 | .941 (g) | .886 | .884 | 30574.87347 | .002 | 9.167 | 1 | 448 | .003 |
| 8 | .942 (h) | .887 | .885 | 30404.43434 | .002 | 6.037 | 1 | 447 | .014 |
| 9 | .943 (i) | .889 | .886 | 30268.70125 | .001 | 5.018 | 1 | 446 | .026 |

a. Predictors: (Constant), Last Price
b. Predictors: (Constant), Last Price, Tax
c. Predictors: (Constant), Last Price, Tax, Interior
d. Predictors: (Constant), Last Price, Tax, Interior, Condo
e. Predictors: (Constant), Last Price, Tax, Interior, Condo, A12
f. Predictors: (Constant), Last Price, Tax, Interior, Condo, A12, A5
g. Predictors: (Constant), Last Price, Tax, Interior, Condo, A12, A5, RC
h. Predictors: (Constant), Last Price, Tax, Interior, Condo, A12, A5, RC, A16
i. Predictors: (Constant), Last Price, Tax, Interior, Condo, A12, A5, RC, A16, A2
j. Dependent Variable: Sale Price

ANOVA

| Model | Source | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|---|
| 1 | Regression | 2.791E12 | 1 | 2.791E12 | 1.437E3 | .000 (a) |
| 1 | Residual | 8.816E11 | 454 | 1.942E9 | | |
| 1 | Total | 3.673E12 | 455 | | | |
| 2 | Regression | 3.143E12 | 2 | 1.571E12 | 1.343E3 | .000 (b) |
| 2 | Residual | 5.299E11 | 453 | 1.170E9 | | |
| 2 | Total | 3.673E12 | 455 | | | |
| 3 | Regression | 3.179E12 | 3 | 1.060E12 | 970.531 | .000 (c) |
| 3 | Residual | 4.935E11 | 452 | 1.092E9 | | |
| 3 | Total | 3.673E12 | 455 | | | |
| 4 | Regression | 3.212E12 | 4 | 8.030E11 | 785.781 | .000 (d) |
| 4 | Residual | 4.609E11 | 451 | 1.022E9 | | |
| 4 | Total | 3.673E12 | 455 | | | |
| 5 | Regression | 3.232E12 | 5 | 6.463E11 | 659.386 | .000 (e) |
| 5 | Residual | 4.411E11 | 450 | 9.802E8 | | |
| 5 | Total | 3.673E12 | 455 | | | |
| 6 | Regression | 3.245E12 | 6 | 5.409E11 | 568.279 | .000 (f) |
| 6 | Residual | 4.274E11 | 449 | 9.518E8 | | |
| 6 | Total | 3.673E12 | 455 | | | |
| 7 | Regression | 3.254E12 | 7 | 4.649E11 | 497.265 | .000 (g) |
| 7 | Residual | 4.188E11 | 448 | 9.348E8 | | |
| 7 | Total | 3.673E12 | 455 | | | |
| 8 | Regression | 3.260E12 | 8 | 4.074E11 | 440.754 | .000 (h) |
| 8 | Residual | 4.132E11 | 447 | 9.244E8 | | |
| 8 | Total | 3.673E12 | 455 | | | |
| 9 | Regression | 3.264E12 | 9 | 3.627E11 | 395.860 | .000 (i) |
| 9 | Residual | 4.086E11 | 446 | 9.162E8 | | |
| 9 | Total | 3.673E12 | 455 | | | |

a. Predictors: (Constant), LastPrice
b. Predictors: (Constant), LastPrice, Tax
c. Predictors: (Constant), LastPrice, Tax, Interior
d. Predictors: (Constant), LastPrice, Tax, Interior, Condo
e. Predictors: (Constant), LastPrice, Tax, Interior, Condo, A12
f. Predictors: (Constant), LastPrice, Tax, Interior, Condo, A12, A5
g. Predictors: (Constant), LastPrice, Tax, Interior, Condo, A12, A5, RC
h. Predictors: (Constant), LastPrice, Tax, Interior, Condo, A12, A5, RC, A16
i. Predictors: (Constant), LastPrice, Tax, Interior, Condo, A12, A5, RC, A16, A2
j. Dependent Variable: SalePrice

Coefficients

B and Std. Error are the unstandardized coefficients, Beta is the standardized coefficient, Lower and Upper are the bounds of the 95% confidence interval for B, and Tolerance and VIF are the collinearity statistics.

| Model | Term | B | Std. Error | Beta | t | Sig. | Lower | Upper | Tolerance | VIF |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | (Constant) | 38849.701 | 4052.438 | | 9.587 | .000 | 30885.838 | 46813.564 | | |
| 1 | LastPrice | .720 | .019 | .872 | 37.913 | .000 | .683 | .758 | 1.000 | 1.000 |
| 2 | (Constant) | 23233.629 | 3271.562 | | 7.102 | .000 | 16804.307 | 29662.951 | | |
| 2 | LastPrice | .416 | .023 | .504 | 18.154 | .000 | .371 | .461 | .414 | 2.416 |
| 2 | Tax | 47.547 | 2.742 | .481 | 17.341 | .000 | 42.158 | 52.935 | .414 | 2.416 |
| 3 | (Constant) | 2954.638 | 4728.282 | | .625 | .532 | -6337.506 | 12246.782 | | |
| 3 | LastPrice | .385 | .023 | .466 | 16.921 | .000 | .341 | .430 | .391 | 2.555 |
| 3 | Tax | 42.937 | 2.767 | .434 | 15.516 | .000 | 37.499 | 48.375 | .379 | 2.636 |
| 3 | Interior | 33.058 | 5.732 | .127 | 5.767 | .000 | 21.793 | 44.323 | .614 | 1.628 |
| 4 | (Constant) | -8782.305 | 5022.983 | | -1.748 | .081 | -18653.660 | 1089.051 | | |
| 4 | LastPrice | .351 | .023 | .424 | 15.336 | .000 | .306 | .396 | .363 | 2.753 |
| 4 | Tax | 33.071 | 3.195 | .335 | 10.350 | .000 | 26.792 | 39.351 | .266 | 3.755 |
| 4 | Interior | 43.555 | 5.848 | .167 | 7.448 | .000 | 32.062 | 55.047 | .552 | 1.810 |
| 4 | Condo | 123.044 | 21.759 | .148 | 5.655 | .000 | 80.284 | 165.805 | .405 | 2.467 |
| 5 | (Constant) | -6822.788 | 4938.803 | | -1.381 | .168 | -16528.769 | 2883.192 | | |
| 5 | LastPrice | .365 | .023 | .442 | 16.136 | .000 | .321 | .410 | .356 | 2.809 |
| 5 | Tax | 31.758 | 3.143 | .321 | 10.104 | .000 | 25.581 | 37.935 | .264 | 3.788 |
| 5 | Interior | 42.998 | 5.729 | .165 | 7.506 | .000 | 31.740 | 54.256 | .552 | 1.811 |
| 5 | Condo | 118.486 | 21.334 | .143 | 5.554 | .000 | 76.559 | 160.414 | .404 | 2.472 |
| 5 | A12 | -37401.549 | 8327.091 | -.074 | -4.492 | .000 | -53766.363 | -21036.736 | .974 | 1.026 |
| 6 | (Constant) | -4031.904 | 4921.946 | | -.819 | .413 | -13704.814 | 5641.006 | | |
| 6 | LastPrice | .348 | .023 | .421 | 15.299 | .000 | .303 | .393 | .342 | 2.923 |
| 6 | Tax | 32.865 | 3.111 | .332 | 10.564 | .000 | 26.752 | 38.979 | .262 | 3.821 |
| 6 | Interior | 43.420 | 5.646 | .167 | 7.690 | .000 | 32.324 | 54.516 | .552 | 1.812 |
| 6 | Condo | 99.625 | 21.602 | .120 | 4.612 | .000 | 57.171 | 142.078 | .383 | 2.610 |
| 6 | A12 | -35648.270 | 8218.611 | -.071 | -4.338 | .000 | -51799.991 | -19496.550 | .971 | 1.029 |
| 6 | A5 | 24339.409 | 6409.480 | .068 | 3.797 | .000 | 11743.104 | 36935.714 | .802 | 1.247 |
| 7 | (Constant) | -14313.181 | 5943.400 | | -2.408 | .016 | -25993.587 | -2632.776 | | |
| 7 | LastPrice | .342 | .023 | .414 | 15.101 | .000 | .297 | .386 | .339 | 2.948 |
| 7 | Tax | 34.463 | 3.128 | .349 | 11.018 | .000 | 28.316 | 40.610 | .254 | 3.933 |
| 7 | Interior | 44.305 | 5.603 | .170 | 7.907 | .000 | 33.293 | 55.317 | .550 | 1.817 |
| 7 | Condo | 104.991 | 21.481 | .126 | 4.888 | .000 | 62.774 | 147.207 | .380 | 2.628 |
| 7 | A12 | -28972.746 | 8438.023 | -.058 | -3.434 | .001 | -45555.768 | -12389.725 | .905 | 1.105 |
| 7 | A5 | 29784.464 | 6601.659 | .084 | 4.512 | .000 | 16810.400 | 42758.528 | .742 | 1.347 |
| 7 | RC | 10435.164 | 3446.591 | .056 | 3.028 | .003 | 3661.669 | 17208.658 | .740 | 1.352 |
| 8 | (Constant) | -14679.636 | 5912.150 | | -2.483 | .013 | -26298.698 | -3060.575 | | |
| 8 | LastPrice | .337 | .023 | .407 | 14.883 | .000 | .292 | .381 | .336 | 2.974 |
| 8 | Tax | 35.642 | 3.147 | .361 | 11.325 | .000 | 29.457 | 41.827 | .248 | 4.027 |
| 8 | Interior | 44.354 | 5.572 | .170 | 7.960 | .000 | 33.404 | 55.304 | .550 | 1.817 |
| 8 | Condo | 104.674 | 21.362 | .126 | 4.900 | .000 | 62.691 | 146.656 | .380 | 2.628 |
| 8 | A12 | -29037.901 | 8391.027 | -.058 | -3.461 | .001 | -45528.663 | -12547.139 | .905 | 1.105 |
| 8 | A5 | 29005.214 | 6572.515 | .081 | 4.413 | .000 | 16088.348 | 41922.080 | .741 | 1.350 |
| 8 | RC | 11485.488 | 3453.935 | .062 | 3.325 | .001 | 4697.521 | 18273.455 | .729 | 1.373 |
| 8 | A16 | -13478.468 | 5485.758 | -.040 | -2.457 | .014 | -24259.547 | -2697.390 | .951 | 1.052 |
| 9 | (Constant) | -15967.736 | 5913.780 | | -2.700 | .007 | -27590.071 | -4345.402 | | |
| 9 | LastPrice | .333 | .023 | .403 | 14.763 | .000 | .289 | .377 | .335 | 2.988 |
| 9 | Tax | 35.947 | 3.136 | .364 | 11.462 | .000 | 29.783 | 42.110 | .248 | 4.035 |
| 9 | Interior | 44.967 | 5.554 | .173 | 8.097 | .000 | 34.052 | 55.882 | .549 | 1.821 |
| 9 | Condo | 105.108 | 21.268 | .127 | 4.942 | .000 | 63.311 | 146.906 | .380 | 2.629 |
| 9 | A12 | -27984.595 | 8366.791 | -.056 | -3.345 | .001 | -44427.826 | -11541.364 | .902 | 1.108 |
| 9 | A5 | 29804.817 | 6552.903 | .084 | 4.548 | .000 | 16926.416 | 42683.218 | .738 | 1.354 |
| 9 | RC | 10992.327 | 3445.556 | .059 | 3.190 | .002 | 4220.785 | 17763.869 | .726 | 1.378 |
| 9 | A16 | -12447.291 | 5480.634 | -.037 | -2.271 | .024 | -23218.366 | -1676.216 | .944 | 1.059 |
| 9 | A2 | 12290.704 | 5486.742 | .036 | 2.240 | .026 | 1507.625 | 23073.784 | .967 | 1.034 |

a. Dependent Variable: SalePrice

Residuals Statistics

| Statistic | Minimum | Maximum | Mean | Std. Deviation | N |
|---|---|---|---|---|---|
| Predicted Value | 2.1894E4 | 7.3736E5 | 1.7108E5 | 84699.37571 | 456 |
| Std. Predicted Value | -1.761 | 6.686 | .000 | 1.000 | 456 |
| Standard Error of Predicted Value | 1971.030 | 2.458E4 | 4.021E3 | 1982.252 | 456 |
| Adjusted Predicted Value | 1.6813E4 | 1.1794E6 | 1.7253E5 | 95574.81320 | 456 |
| Residual | -3.59573E5 | 1.37644E5 | .00000 | 29967.84529 | 456 |
| Std. Residual | -11.879 | 4.547 | .000 | .990 | 456 |
| Stud. Residual | -20.352 | 4.861 | -.017 | 1.268 | 456 |
| Deleted Residual | -1.05539E6 | 1.57295E5 | -1.45182E3 | 55783.52632 | 456 |
| Stud. Deleted Residual | -76.135 | 4.990 | -.139 | 3.664 | 456 |
| Mahal. Distance | .932 | 298.983 | 8.980 | 16.348 | 456 |
| Cook's Distance | .000 | 80.153 | .179 | 3.753 | 456 |
| Centered Leverage Value | .002 | .657 | .020 | .036 | 456 |

a. Dependent Variable: SalePrice

Interior Vs Rooms Regression results showing correlation

SUMMARY OUTPUT

Regression Statistics

| Statistic | Value |
|---|---|
| Multiple R | 0.775952808 |
| R Square | 0.60210276 |
| Adjusted R Square | 0.601226335 |
| Standard Error | 217.7411745 |
| Observations | 456 |

ANOVA

| Source | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 1 | 32571418.05 | 32571418 | 686.9981 | 6.7719E-93 |
| Residual | 454 | 21524693.45 | 47411.22 | | |
| Total | 455 | 54096111.51 | | | |

| Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
|---|---|---|---|---|---|---|
| Intercept | -76.7538578 | 42.08789622 | -1.82366 | 0.068861 | -159.4651166 | 5.957400971 |
| Rooms | 235.8872688 | 8.999672999 | 26.21065 | 6.77E-93 | 218.2010847 | 253.5734529 |
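The strength of the Interior/Rooms relationship can be read off this output with a few lines of arithmetic. The sketch below uses illustrative variable names; the variance inflation factor it prints is the value that would apply if Rooms were the only other predictor alongside Interior, which is an assumption of this example rather than a figure from the report.

```python
import math

r_squared = 0.60210276     # R Square of Interior regressed on Rooms
slope = 235.8872688        # coefficient of Rooms
t_rooms = 26.21065         # t Stat of Rooms

# In a one-predictor regression the correlation is the signed square root of R^2.
correlation = math.copysign(math.sqrt(r_squared), slope)

# With a single predictor, the overall F equals the squared t statistic.
f_from_t = t_rooms ** 2

# VIF implied by this R^2 (assumes Rooms is the only companion predictor).
vif = 1 / (1 - r_squared)

print("correlation:", round(correlation, 4))
print("F from t:", round(f_from_t, 2))
print("implied VIF:", round(vif, 3))
```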

[Figure: Normal Probability Plot (Interior vs. Sample Percentile)]

[Figure: Rooms Residual Plot (Residuals vs. Rooms)]