
Project

This project analyzes the Boston Housing dataset to identify key factors influencing housing prices using statistical techniques. Key findings include significant correlations between the number of rooms and housing costs, as well as the impact of proximity to the Charles River and highway accessibility on median home values. The regression model explains 74% of the variance in housing prices, indicating strong predictive power.

Housing Prices in Boston
Group - 10

Akhil Augustine
Akhil Dev
Ardhra
Santhosh
Naveen Shaji
Thomas Kappen
Introduction
• This project explores the key factors that influence housing prices by analyzing the Boston Housing dataset using statistical and analytical techniques such as descriptive statistics, hypothesis testing and regression analysis.

• Applying these statistical techniques and identifying patterns in the data supports better decision making in urban planning, housing development and real estate investment. The key factors identified can help make houses more affordable and inform citizens about local market trends.
Dataset Overview
Boston Housing Dataset - collected from US census data

The purpose of the dataset is to analyze how various social, environmental and economic factors affect housing prices.

The dataset consists of 14 columns (attributes) and 506 rows.
Research Questions
Objective : Determine the key factors that influence the housing prices in Boston.

Research Questions

• Is there a significant difference in house prices based on proximity to the Charles River?
• Does the number of rooms affect housing prices?
• How does highway accessibility affect house prices?

- Implement a regression model to understand which features help predict housing prices.

Target Variable (dependent) y = medv (Median Home Values in $1000s)

Feature Variables (independent) x = crim, zn, chas, nox, rm, dis, rad, tax, indus, age, ptratio, b, lstat
Descriptive Statistics
With a mean crime rate of 3.61 and a maximum of 88.98, the crime rate is extremely skewed: a small number of high-crime locations substantially inflate the average.

The average number of rooms is 6.28, which indicates that most homes are mid-sized, but the range shows significant variation that can affect housing prices.

The large standard deviation and broad range (187 to 711) of the property tax (tax) indicate significant variation in municipal levies across zones.
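The effect of a few extreme values on the mean can be sketched in a few lines of Python. The sample below is hypothetical and is not taken from the Boston dataset; `describe` is an illustrative helper name, not part of the project.

```python
import statistics

def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        "std": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }

# Hypothetical crime rates (NOT from the dataset): one extreme town.
crime = [0.02, 0.05, 0.1, 0.3, 0.5, 1.2, 4.0, 88.0]
summary = describe(crime)

# A mean far above the median is a sign of right skew.
print(summary["mean"], summary["median"])
```

When the mean sits far above the median, as here, a handful of large values is pulling the average up, which is the pattern described for crim.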
Hypothesis Test 1
Objective: Determine whether the average number of rooms (rm) is greater than 6.
Assume that the significance level is 5%.
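A minimal sketch of this kind of test, assuming a one-sample, right-tailed t-test with H₀: μ = 6 against H₁: μ > 6. The sample values and the helper name `one_sample_t` are hypothetical; the critical value is the standard one-tailed table value for 9 degrees of freedom, matching the 10 sample values used here rather than the full dataset.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for H0: mean == mu0."""
    n = len(sample)
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1)
    t_stat = (mean - mu0) / (sd / math.sqrt(n))
    return t_stat, n - 1

# Hypothetical room counts, NOT taken from the Boston dataset.
rooms = [6.5, 6.4, 7.1, 6.9, 7.1, 6.4, 6.0, 6.1, 5.6, 6.0]
t_stat, df = one_sample_t(rooms, mu0=6.0)

T_CRITICAL = 1.833  # one-tailed critical t for alpha = 0.05, df = 9
reject_h0 = t_stat > T_CRITICAL  # right-tailed test: reject for large t
print(round(t_stat, 3), df, reject_h0)
```

For the real data, the same function would be applied to the rm column and compared against the critical value for 505 degrees of freedom.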
Hypothesis Test 2
Determine whether there is a statistically significant difference in median home values between houses that are near the Charles River (chas=1) and those that are not (chas=0). Assume a significance level of 5%.

H₀: μ₁ = μ₂
H₁: μ₁ ≠ μ₂

From the table,
|t-stat| > t critical, so we reject H₀.
There is a significant difference in median home values between the two groups.
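The slide does not say which two-sample t-test was used. As a sketch, assuming Welch's test (unequal variances), the statistic and its approximate degrees of freedom can be computed as follows. The medv values and the helper name `welch_t` are hypothetical.

```python
import math
import statistics

def welch_t(group_a, group_b):
    """Return (t statistic, approximate df) for Welch's two-sample t-test."""
    n_a, n_b = len(group_a), len(group_b)
    # Squared standard error contributed by each group
    se2_a = statistics.variance(group_a) / n_a
    se2_b = statistics.variance(group_b) / n_b
    diff = statistics.fmean(group_a) - statistics.fmean(group_b)
    t_stat = diff / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t_stat, df

# Hypothetical medv values (in $1000s), NOT from the dataset.
near_river = [28.0, 30.0, 32.0, 34.0, 36.0]   # chas = 1
away_from_river = [20.0, 21.0, 22.0, 23.0, 24.0]  # chas = 0
t_stat, df = welch_t(near_river, away_from_river)

# Two-tailed critical t for alpha = 0.05, using df rounded down to 5.
T_CRITICAL = 2.571
reject_h0 = abs(t_stat) > T_CRITICAL
print(round(t_stat, 3), round(df, 2), reject_h0)
```

Rounding the degrees of freedom down gives a slightly larger critical value, so the decision errs on the conservative side.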
Hypothesis Test 3
Determine whether the median value of homes (medv) differs between houses with low highway accessibility (rad<=4) and those with high accessibility (rad>=4).

H₀: μ₁ = μ₂
H₁: μ₁ ≠ μ₂

From the table,
|t-stat| > t critical, so we reject H₀.

There is strong evidence that homes with low access to highways have higher median values.
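The grouping step for this test can be sketched as below. As written on the slide, the two conditions overlap at rad = 4, so this sketch assumes the high-accessibility group is rad > 4; that choice, the sample rows and the helper name `split_by_rad` are all assumptions.

```python
import statistics

def split_by_rad(rows, threshold=4):
    """Split (rad, medv) pairs into low- and high-accessibility medv lists."""
    low = [medv for rad, medv in rows if rad <= threshold]
    high = [medv for rad, medv in rows if rad > threshold]  # assumed: strictly above
    return low, high

# Hypothetical (rad, medv) pairs, NOT from the dataset.
rows = [(1, 24.0), (2, 21.6), (3, 33.4), (4, 27.1),
        (5, 18.9), (8, 20.4), (24, 13.8), (24, 10.2)]
low, high = split_by_rad(rows)

mean_low = statistics.fmean(low)
mean_high = statistics.fmean(high)
print(mean_low, mean_high)
```

The two lists would then go into a two-sample t-test; comparing the group means shows the direction of the difference once H₀ is rejected.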
Correlation Matrix
• Homes with more rooms typically have much higher prices, according to the strong positive correlation (rm vs medv = +0.695).
• There is a strong negative correlation (lstat vs medv = -0.7376), which shows that the median home value is generally lower in areas with a larger lower-status population.

• Towns with higher student-teacher ratios typically have lower home prices, which may indicate worse educational quality, according to a moderately negative correlation (ptratio vs medv = -0.508).

• There is no significant correlation between chas and other variables.
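Each entry of a correlation matrix is a Pearson correlation coefficient between two columns. A minimal sketch of that calculation, using hypothetical values rather than the dataset (`pearson_r` is an illustrative name):

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical rm and medv values, NOT from the dataset.
rm = [5.0, 6.0, 6.5, 7.0, 8.0]
medv = [15.0, 21.0, 22.0, 30.0, 45.0]
r = pearson_r(rm, medv)
print(round(r, 3))
```

Values near +1 or -1 indicate a strong linear relationship and values near 0 a weak one, which is how the rm, lstat and ptratio figures above are read.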


Regression Analysis
y = 36.34 - 0.11*crim + 0.04*zn + 2.71*chas - 17.37*nox + 3.80*rm - 1.49*dis + 0.29*rad - 0.01*tax - 0.94*ptratio + 0.009*b - 0.52*lstat
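The fitted equation can be turned into a small prediction function. The coefficients below are the rounded values from the equation above, so predictions are approximate; the example town and the name `predict_medv` are hypothetical.

```python
# Rounded coefficients copied from the regression equation on this slide.
INTERCEPT = 36.34
COEFFICIENTS = {
    "crim": -0.11, "zn": 0.04, "chas": 2.71, "nox": -17.37,
    "rm": 3.80, "dis": -1.49, "rad": 0.29, "tax": -0.01,
    "ptratio": -0.94, "b": 0.009, "lstat": -0.52,
}

def predict_medv(features):
    """Predict medv (in $1000s) from a dict holding every feature in the equation."""
    return INTERCEPT + sum(
        coef * features[name] for name, coef in COEFFICIENTS.items()
    )

# Hypothetical town, NOT a row from the dataset.
town = {
    "crim": 0.1, "zn": 0.0, "chas": 0, "nox": 0.5, "rm": 6.0,
    "dis": 4.0, "rad": 4, "tax": 300, "ptratio": 18.0,
    "b": 390.0, "lstat": 10.0,
}
print(round(predict_medv(town), 3))
```

Each coefficient is the change in predicted medv for a one-unit increase in that feature with the others held fixed; for example, one extra room adds about 3.80, or roughly $3,800.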

Coefficient of Determination (R square) = 0.7405

This means that 74% of the variance in housing prices is explained by this model.
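R square compares the model's squared errors with the total variation around the mean. A minimal sketch with hypothetical actual and predicted values (not the project's results):

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Hypothetical medv values (in $1000s), NOT from the dataset.
actual = [20.0, 25.0, 30.0, 35.0]
predicted = [22.0, 24.0, 29.0, 37.0]
r2 = r_squared(actual, predicted)
print(round(r2, 2))
```

A value of 1 means the predictions match the data exactly, and a value of 0 means the model does no better than always predicting the mean.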

Standard Error of Estimate = 4.736

Since medv is in $1000s, this corresponds to a typical prediction error of roughly $4,700, which indicates that the predictions of housing prices are reasonably accurate.

Testing Validity of Model: Significance F = 6E-137

This shows that the overall regression is significant (p < 0.05).

Testing coefficients
Independent variables with p-value < 0.05 are significant.
(rm, rad and lstat are the most significant variables.)
GRAPHS
