The Boston data set records the median house value (medv) for 506 census tracts in Boston. There are 12 predictors for
medv, some of which are:
CRIM - per capita crime rate by town
INDUS - proportion of non-retail business acres per town
NOX - nitric oxides concentration (parts per 10 million)
RM - average number of rooms per dwelling
For each of the four predictors above, fit a simple linear regression model to predict the response using Excel.
i) For each model, paste the line fit plot and write down the simple linear regression model. Is there a statistically
significant association between the predictor and the response?
Regression on CRIM
MEDV = 24.0331 - 0.4152 × CRIM
p-value < 0.05, which implies that CRIM has a statistically significant association with MEDV.
Regression on INDUS
MEDV = 29.7549 - 0.6485 × INDUS
p-value < 0.05, which implies that INDUS has a statistically significant association with MEDV.
Regression on NOX
Neal Pania
MEDV = 41.3459 - 33.9161 × NOX
p-value < 0.05, which implies that NOX has a statistically significant association with MEDV.
Regression on RM
MEDV = -34.6706 + 9.1021 × RM
p-value < 0.05, which implies that RM has a statistically significant association with MEDV.
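The fits above were produced in Excel. As a rough cross-check of how the slope, intercept, and significance test are obtained, here is a minimal Python sketch. It is not the method used for the answers above. The function name `simple_ols` is hypothetical, the data are synthetic stand-ins (not the Boston data), and the p-value uses a normal approximation to the t distribution, which should be reasonable for a sample of about 506 rows.

```python
import math
import random
from statistics import NormalDist


def simple_ols(x, y):
    """Fit y = b0 + b1 * x by least squares and test H0: b1 = 0."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    b1 = sxy / sxx                # slope
    b0 = y_bar - b1 * x_bar       # intercept

    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    rss = sum(e ** 2 for e in residuals)
    se_b1 = math.sqrt(rss / (n - 2) / sxx)   # standard error of the slope
    t_stat = b1 / se_b1

    # ASSUMPTION: normal approximation to the t distribution (large n).
    p_approx = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    return {"intercept": b0, "slope": b1, "se_slope": se_b1,
            "t": t_stat, "p_approx": p_approx}


# Synthetic stand-in data, NOT the Boston data set. The coefficients echo
# the RM model above; the noise level of 6.0 is an arbitrary choice.
random.seed(0)
rm = [random.uniform(4, 9) for _ in range(506)]
medv = [-34.67 + 9.10 * r + random.gauss(0, 6.0) for r in rm]

fit = simple_ols(rm, medv)
print(f"MEDV = {fit['intercept']:.4f} + {fit['slope']:.4f} x RM")
print("significant at 0.05:", fit["p_approx"] < 0.05)
```

With the real data, the slope and intercept should match the values Excel reports, and a p-value below 0.05 for the slope corresponds to the conclusion stated for each model.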
ii) For each model, paste the residual plot. Comment on whether or not the observed plot is completely uniformly
random.
Residual on NOX Regression
The residuals for NOX appear mostly random, with no overall trend. The largest residuals at the top of the plot
show an increasing trend, but these data points are few in number. The distribution of residuals is also
asymmetric: the positive residuals span a wider range than the negative residuals.
Residual on RM Regression
The residuals for RM appear mostly random, with no overall trend. The few data points with larger residual
values show a negative trend, but these points are so few that the majority of the data still appears
random.
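The visual checks described above (trend in the residuals, and whether positive and negative residuals span similar ranges) can also be summarized numerically. The following is a minimal sketch under stated assumptions: the helper names `fit_line` and `residual_summary` are hypothetical, the data are synthetic stand-ins rather than the Boston data, and splitting the predictor range into three bins is an arbitrary choice. Least-squares residuals always average to zero overall, so the sketch looks at the mean residual within each bin; bin means far from zero would hint at a trend the straight line misses.

```python
import random


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


def residual_summary(x, y, n_bins=3):
    """Summarize residual spread and the mean residual per predictor bin."""
    b0, b1 = fit_line(x, y)
    # Sort by predictor value so consecutive chunks cover low/mid/high x.
    pairs = sorted((xi, yi - (b0 + b1 * xi)) for xi, yi in zip(x, y))
    res = [e for _, e in pairs]

    size = len(res) // n_bins
    bin_means = []
    for k in range(n_bins):
        # The last bin absorbs any leftover points.
        chunk = res[k * size:] if k == n_bins - 1 else res[k * size:(k + 1) * size]
        bin_means.append(sum(chunk) / len(chunk))

    return {
        "mean": sum(res) / len(res),
        "largest_positive": max(res),
        "largest_negative": min(res),
        "bin_means": bin_means,
    }


# Synthetic stand-in data, NOT the Boston data set. The predictor range and
# the noise level of 8.0 are arbitrary choices for illustration.
random.seed(1)
nox = [random.uniform(0.38, 0.87) for _ in range(506)]
medv = [41.35 - 33.92 * v + random.gauss(0, 8.0) for v in nox]

summary = residual_summary(nox, medv)
print("largest positive residual:", round(summary["largest_positive"], 2))
print("largest negative residual:", round(summary["largest_negative"], 2))
print("mean residual by bin:", [round(m, 2) for m in summary["bin_means"]])
```

A largest positive residual much bigger in magnitude than the largest negative one would correspond to the asymmetry noted for NOX. A summary like this supplements the residual plot and does not replace it.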