Report: Housing Price Prediction Model for D. M. Pan National Real Estate Company
[Your Name]
Introduction
This assignment works through the steps of a real-world linear regression problem and
summarizes the findings from the analysis results. For the assignment, I act
as the analyst hired by the D. M. Pan National Real Estate Company to design a model to predict
housing prices for homes sold in 2019. The firm's CEO will use the information to help the company's
real estate agents judge how well square footage works as a benchmark for listing prices.
The square footage of houses is a significant predictor of their listing price – the smaller
the square footage, the lower the listing price, and the larger the square footage, the higher the
listing price.
Linear regression will be used to test the above hypothesis. According to Jan and Shieh
(2019), linear regression is used for various reasons. The most important are
determining the association between two variables and determining the value of the dependent
variable at a specific value of the independent variable. Given the hypothesis, I would
expect the scatterplot to rise from bottom left to top right, showing a positive correlation.
The variables of interest in a study, which are measured or observed, are called dependent (response)
variables. Other variables which impact the response and can be set or measured by the
Data Collection
Median Housing Price Model for D. M. Pan National Real Estate Company 3
Square footage is the predictor variable, while the listing price is the response variable.
The sample was obtained using the Sampling function in the Data Analysis ToolPak in Excel. A
sample of 50 data points was obtained for the analysis. The scatterplot for the two variables is
shown below.
[Figure: scatterplot of listing price against square feet; horizontal axis "Square Feet", 1,000 to 2,600]
A close look at the graph shows an increasing trend, which appears linear. This indicates
that a linear regression model is appropriate for the data. The trend of the data points above shows a
positive, moderate association between square footage and listing price.
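The sampling step described in this section can also be sketched outside Excel. The snippet below is only an illustration: the `population` records are synthetic placeholders rather than the company's data, and the helper name `draw_sample` is hypothetical. It draws without replacement, which may differ from how the Excel tool selects rows.

```python
import random

def draw_sample(records, n=50, seed=0):
    """Return n records chosen at random, without replacement."""
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    return rng.sample(records, n)

# Hypothetical stand-in population of (square_feet, listing_price) pairs.
# These values are synthetic and exist only to make the sketch runnable.
population = [(1000 + 2 * i, 150_000 + 200 * i) for i in range(800)]

sample = draw_sample(population)
square_feet = [sqft for sqft, _ in sample]      # predictor variable
listing_price = [price for _, price in sample]  # response variable
```

Fixing the seed keeps the sample reproducible, which helps when the scatterplot and the later regression need to be regenerated from the same 50 points.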
Data Analysis
The histograms for the variables are included in Figures 2 and 3 below.
Generally, the shape of the histograms shows that the variables come from an
approximately normally distributed population. The summary statistics are included below:
The mean square footage and listing price are 1,887.6 sq ft and $276,766, respectively. The
standard deviations for square footage and listing price are 317.21 sq ft and $58,961.16, respectively. The
data points for both variables are widely spread, as reflected in their high sample variances. The histograms also
indicate that the variables come from an approximately normally distributed population. There
are no outliers.
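As a rough sketch of how summary statistics like these are computed, the snippet below uses Python's standard `statistics` module. The five square-footage values are hypothetical and are not taken from the report's sample; `stdev` is the sample standard deviation (dividing by n - 1), which is assumed to match what the spreadsheet reports.

```python
import statistics

# Hypothetical square-footage values, for illustration only.
square_feet = [1500, 1750, 1900, 2100, 2250]

mean_sqft = statistics.mean(square_feet)     # arithmetic mean
sd_sqft = statistics.stdev(square_feet)      # sample standard deviation (n - 1)
var_sqft = statistics.variance(square_feet)  # sample variance, the square of sd_sqft

print(f"mean = {mean_sqft:.1f} sq ft, sd = {sd_sqft:.2f} sq ft")
```

The same three calls applied to the listing-price column would give the corresponding figures in dollars.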
The national population's mean square footage and listing price are 2,111 sq ft and $342,365.
Their standard deviations are 921 sq ft and $125,914, respectively. The histogram for the national
population is positively skewed since it has a long right tail. The national means and standard
deviations are higher than those of the selected sample. The selected sample
represents the national housing market sales. In most cases, a sample of 30 points is considered
[Figure: scatterplot of listing price against square feet with line of best fit; R² = 0.35090706803066]
A regression model can be developed for the variables. The data points show a linear
trend. There is a positive association between square footage and listing price, since
the line of best fit has a positive slope. With R² = 0.3509, the points are not tightly
clustered around the line, so the association is moderate rather than strong (Kumari & Yadav,
2018). The steepness of the slope depends on the units of the variables and does not by
itself measure the strength of the association.
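To make the line of best fit concrete, here is a minimal least-squares sketch in plain Python. The function name `fit_line` and the five data points are hypothetical; the report's own line was produced in Excel from the 50-point sample, so the numbers here will not match it. The function returns the slope, the intercept, and the correlation coefficient r, which is the quantity that measures strength.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept, r)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)                     # spread of x
    syy = sum((y - mean_y) ** 2 for y in ys)                     # spread of y
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)                               # Pearson correlation
    return slope, intercept, r

# Hypothetical data, for illustration only.
sqft = [1200, 1500, 1800, 2100, 2400]
price = [210_000, 230_000, 290_000, 280_000, 340_000]

slope, intercept, r = fit_line(sqft, price)
print(f"y = {intercept:,.0f} + {slope:.2f}x, r = {r:.3f}")
```

Rescaling the prices (for example, to thousands of dollars) would change the slope but leave r unchanged, which is why r rather than the slope is used to judge strength.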
The value of R can be obtained from that of R². The value of R² is 0.3509, as shown in
R = √0.3509
= 0.5924
y = 68,965 + 110.09x
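The arithmetic above can be checked with a few lines of Python. This is a small sketch using only the two figures reported in the text (R² = 0.3509 and a slope of 110.09); the variable names are illustrative. Because a square root is non-negative, the sign of R is taken from the sign of the slope.

```python
import math

r_squared = 0.3509  # coefficient of determination reported in the text
slope = 110.09      # slope of the fitted line reported in the text

# R has the same sign as the slope; its magnitude is the square root of R².
r = math.copysign(math.sqrt(r_squared), slope)

print(f"R = {r:.4f}")                                    # R = 0.5924
print(f"variation explained = {r_squared * 100:.2f}%")   # variation explained = 35.09%
```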
There is a positive relationship between square footage and listing price. The slope is
110.09, while the constant is 68,965. Each additional square foot is associated with an increase of
$110.09 in the listing price when other factors are constant. The intercept ($68,965) shows the listing
According to Schmidt and Finan (2018), a regression equation normally quantifies the
direction and strength of the association between two numerical variables (square footage and
listing price). The value of R² quantifies the strength of the association. It shows the
percentage of variation in Y explained by X. The value of R² is 0.3509. This implies that square
The regression model obtained can determine the value of the listing price given that of
the square footage. I will use the equation to predict the price at which I should list my house, given that
y = 68,965 + 110.09(1,930)
y = 68,965 + 212,473.70
y = $281,438.70
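The same prediction can be written as a small function so it can be reused for other square footages. This is a sketch that hard-codes the intercept and slope reported above; the function name `predict_listing_price` is hypothetical.

```python
INTERCEPT = 68_965  # constant term of the fitted equation, in dollars
SLOPE = 110.09      # dollars per additional square foot

def predict_listing_price(square_feet):
    """Predicted listing price from y = 68,965 + 110.09x."""
    return INTERCEPT + SLOPE * square_feet

price = predict_listing_price(1930)
print(f"${price:,.2f}")  # $281,438.70
```

Predictions are most trustworthy for square footages similar to those in the sample used to fit the line.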
Conclusions
When operating, businesses accumulate large amounts of data. These data may be related to sales,
client information, and profit. Insights are often needed so that collected data can be
used to enhance business decisions. Linear regression is a statistical method businesses use to
find insight into their data and enhance their decisions (Maulud & Abdulazeez, 2020). In this
predictor of house listing prices. It has been determined that the two variables have a positive
linear relationship. As the square footage of a house increases, the listing price also increases.
Similarly, as square footage decreases, the listing price also decreases. The strength of these
variables' association is medium. The hypothesis that square footage is a significant predictor of
listing price is supported. The analysis shows that square footage is associated with the prices of the houses
in the studied market. The business can use the regression equation to estimate the
listing prices of the houses. The business needs to note that other factors, apart from square
footage, influence the listing prices of houses. Square footage explains only 35.09% of the variation in
listing prices. Although the business should use square footage to estimate the listing prices of
the houses, there are other factors to consider which are not explained by the model (such as house
location, accessibility to towns and roads, furnishings, and house design).
References
Bartlett, P. L., Long, P. M., Lugosi, G., & Tsigler, A. (2020). Benign overfitting in linear
Jan, S. L., & Shieh, G. (2019). Sample size calculations for model validation in linear regression
Kumari, K., & Yadav, S. (2018). Linear regression analysis study. Journal of the Practice of
machine learning. Journal of Applied Science and Technology Trends, 1(4), 140-147.
Schmidt, A. F., & Finan, C. (2018). Linear regression and the normality assumption. Journal of