On
Project course
BACHELOR OF ENGINEERING
(Co-Supervisor)
Signature:
Jay Gautam
20BCS4962
DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING
CHANDIGARH UNIVERSITY, GHARUAN
June 2022
ABSTRACT
House price forecasting is an important topic in real estate. The literature attempts
to discover useful models for house buyers and sellers. A high discrepancy is
revealed between house prices in the most expensive and most affordable
competitive approach.
TABLE OF CONTENTS
Acknowledgements
1. Introduction
1.1 Goal
1.2 Need of the application
2. Dataset
2.2 Steps in Preparing Data for Model
2.3 Data Exploration
2.4 Data Visualization
2.5 Data Selection
2.6 Data Transformation
3. System Analysis
3.1 ER Diagram
3.2 Data Flow Diagram
3.3 Class Diagram
4. Design
4.1 Design Goals
4.2 Architectural Design
5. Language and Libraries
5.1 HTML
5.2 CSS
5.3 JavaScript
5.4 JSX
5.5 Python
5.6 Libraries used
6. Modules
6.1 Regression Model
6.2 Random Forest Regression Model
6.3 XGBoost Regressor Model
7. Result and Discussion
8. Conclusion
Acknowledgements
for his constant guidance and help throughout the project. I would also like to thank
Finally, I would like to thank my family and my friends for all the support and
encouragement.
1. INTRODUCTION
1.1 Goal
Identify the important home price attributes which feed the model's predictive power.
Having lived in India for so many years, if there is one thing that I had been taking
for granted, it is that housing and rental prices continue to rise. Since the housing
crisis of 2008, housing prices have recovered remarkably well, especially in major
housing markets. However, in the 4th quarter of 2016, I was surprised to read that
Bombay housing prices had fallen the most in the last 4 years. In fact, median resale
prices for condos and co-ops fell 6.3%, marking the first time there was a decline
since Q1 of 2017. The decline has been partly attributed to political uncertainty
domestically and abroad and the 2014 election. This model is intended to maintain
transparency among customers and to make price comparison easy. If a customer finds
the price of a house on a given website higher than the price predicted by the model, he can
2. DATASET
Data exploration is the first step in data analysis and typically involves summarizing
the main characteristics of a data set, including its size, accuracy, initial patterns in the
data and other attributes. It is commonly conducted by data analysts using visual
analytics tools, but it can also be done in more advanced statistical software or with Python.
Before it can conduct analysis on data collected by multiple data sources and stored in
data warehouses, an organization must know how many cases are in a data set, what
variables are included, how many missing values there are and what general hypotheses
the data is likely to support. An initial exploration of the data set can help answer these
questions by familiarizing analysts with the data with which they are working.
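As a minimal sketch of the questions listed above (how many cases, which variables, how many missing values), the following uses pandas on a small made-up table. The column names and values are hypothetical and are not taken from the project's data set.

```python
import numpy as np
import pandas as pd

# Hypothetical toy records for illustration only; not the project's data.
df = pd.DataFrame({
    "area_sqft": [850, 1200, np.nan, 1500, 950],
    "bedrooms": [2, 3, 3, 4, 2],
    "price_lakh": [42.0, 65.5, 58.0, np.nan, 47.5],
})

# How many cases and variables are in the data set?
n_rows, n_cols = df.shape

# How many missing values does each variable have?
missing_per_column = df.isna().sum()

# Basic summary statistics (count, mean, std, quartiles) per numeric column.
summary = df.describe()

print(f"{n_rows} cases, {n_cols} variables")
print(df.dtypes)
print(missing_per_column)
print(summary)
```

On a real data set the same calls would typically follow a `pd.read_csv(...)` step, and the missing-value counts would guide the later cleaning and selection steps.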
visual elements like charts, graphs, and maps, data visualization tools provide an
accessible way to see and understand trends, outliers, and patterns in data. In the world
of Big Data, data visualization tools and technologies are essential to analyse massive
Data selection is defined as the process of determining the appropriate data type
and source, as well as suitable instruments to collect data. Data selection precedes the
actual practice of data collection. This definition distinguishes data selection from
selective data reporting (selectively excluding data that is not supportive of a research
hypothesis) and interactive/active data selection (using collected data for monitoring
The primary objective of data selection is the determination of appropriate data type,
This can be valuable both for making patterns in the data more interpretable and for
It is hard to discern a pattern in the upper panel whereas the strong relationship is
shown clearly in the lower panel. The comparison of the means of log-transformed data
the anti-log of the arithmetic mean of log-transformed values is the geometric mean.
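The statement about the geometric mean can be checked numerically. The short sketch below uses only the Python standard library and three illustrative values chosen for this example (they are not from the project's data).

```python
import math
import statistics

# Illustrative, strongly skewed values (assumed for this example).
values = [1.0, 10.0, 100.0]

# Log-transform, take the arithmetic mean, then apply the anti-log.
log_values = [math.log(v) for v in values]
mean_of_logs = statistics.fmean(log_values)
back_transformed = math.exp(mean_of_logs)

# Compare with the geometric and arithmetic means of the raw values.
geometric = statistics.geometric_mean(values)
arithmetic = statistics.fmean(values)

print("anti-log of mean of logs:", back_transformed)
print("geometric mean:          ", geometric)
print("arithmetic mean:         ", arithmetic)
```

The first two printed numbers agree up to floating-point rounding, while the arithmetic mean is pulled upwards by the largest value. This is why means of log-transformed data should be reported as geometric means after back-transformation.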
3. SYSTEM ANALYSIS
3.1 ER Diagram
4. Design
The design of the web application involves the design of the forms for listing the area,
search for location, display the complete specification for the location, and design a
Design of an interactive application that enables the user to filter the products based on
different parameters.
Design of an application that has features such as drag and drop.
Design of an application that decreases data transfers between the client and the server.
In this context diagram, the information provided to and received from the online house
the application. The closed boxes represent the set of sources and sinks of information.
5.1 HTML
HTML code ensures the proper formatting of text and images for your Internet
browser. Without HTML, a browser would not know how to display text as elements or
load images or other elements. HTML also provides a basic structure of the page, upon
which Cascading Style Sheets are overlaid to change its appearance. One could think of
HTML as the bones (structure) of a web page, and CSS as its skin (appearance).
5.2 CSS
CSS is among the core languages of the open web and is standardized across Web
CSS specification was done synchronously, which allowed versioning of the latest
recommendations. You might have heard about CSS1, CSS2.1, CSS3. However, CSS4
well-known for the development of web pages; many non-browser environments also
5.4 JSX
template using JSX in React, but it is not a simple template language instead it comes
regular JavaScript. Instead of separating the markup and logic in separate files, React
uses components for this purpose. We will learn about components in detail in further
articles.
5.5 Python
IPython is a powerful interactive shell that features easy editing and recording of a
The Software Carpentry Course teaches basic skills for scientific computing, running
Pandas    Scrapy
NumPy     Scikit-learn
6. Modules used
independent variables.
It is mostly used for finding out the relationship between variables and forecasting.
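A minimal sketch of fitting such a model with scikit-learn is given below. The two features (area and circle rate), the coefficients and the data are synthetic assumptions made for illustration; they are not the project's data or fitted values.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Synthetic independent variables (assumed): area in sq ft and circle rate.
X = rng.uniform(low=[500.0, 20.0], high=[2500.0, 80.0], size=(100, 2))

# Synthetic, noise-free dependent variable built from known coefficients.
y = 0.03 * X[:, 0] + 0.5 * X[:, 1] + 4.0

# Ordinary least squares: estimate one coefficient per feature plus an intercept.
model = LinearRegression().fit(X, y)

# Forecast the price of a new, unseen house.
predicted = model.predict(np.array([[1000.0, 50.0]]))

print("coefficients:", model.coef_)
print("intercept:   ", model.intercept_)
print("prediction:  ", predicted[0])
```

The fitted coefficients describe the relationship between each independent variable and the price, and `predict` performs the forecasting step.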
Figure: Real vs. Predicted (Linear Regression)
classification tasks with the use of multiple decision trees and a technique called
Bagging, in the Random Forest method, involves training each decision tree on a
The basic idea behind this is to combine multiple decision trees in determining the final
Figure: Real vs. Predicted (Random Forest Regression)
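The bagging idea described in this section can be sketched by hand: each decision tree is trained on its own bootstrap sample, and the final prediction is the average over all trees. The data below is synthetic and the number of trees is an arbitrary choice for illustration. In practice scikit-learn's `RandomForestRegressor` packages these steps and can also randomise the features considered at each split (its `max_features` parameter).

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(42)

# Synthetic stand-in data (not the project's data set).
X = rng.uniform(0.0, 10.0, size=(200, 2))
y = 5.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(0.0, 1.0, size=200)

n_trees = 25  # arbitrary choice for this sketch
trees = []
for _ in range(n_trees):
    # Bootstrap sample: draw row indices with replacement.
    idx = rng.integers(0, len(X), size=len(X))
    tree = DecisionTreeRegressor(random_state=0)
    tree.fit(X[idx], y[idx])
    trees.append(tree)

# Final output: the average of the individual tree predictions.
bagged_prediction = np.mean([t.predict(X) for t in trees], axis=0)

# Training R^2 of the ensemble, computed by hand.
ss_res = np.sum((y - bagged_prediction) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2_train = 1.0 - ss_res / ss_tot
print(f"training R^2 of the bagged ensemble: {r2_train:.3f}")
```

Averaging many trees that each saw a slightly different sample reduces the variance of a single deep tree, which is the motivation for combining them.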
The XGBoost library implements the gradient boosting decision tree algorithm.
Boosting is an ensemble technique where new models are added to correct the errors
Figure: Real vs. Predicted (XGBoost Regressor)
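To illustrate the boosting principle, the sketch below builds a plain gradient boosting loop for squared error: each new shallow tree is fitted to the residuals of the current ensemble and added with a small learning rate. This is a simplified illustration on synthetic data, not the XGBoost implementation itself, which adds regularisation and other refinements (available through its `XGBRegressor` class). The learning rate, tree depth and number of rounds are assumed values.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(7)

# Synthetic stand-in data (not the project's data set).
X = rng.uniform(0.0, 10.0, size=(200, 2))
y = 5.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(0.0, 1.0, size=200)

learning_rate = 0.1  # assumed value
n_rounds = 50        # assumed value

# Start from a constant model: the mean of the target.
prediction = np.full(len(y), y.mean())
mse_history = [float(np.mean((y - prediction) ** 2))]

for _ in range(n_rounds):
    # Errors made by the ensemble built so far.
    residuals = y - prediction
    # The new model is trained to predict those errors.
    tree = DecisionTreeRegressor(max_depth=2, random_state=0)
    tree.fit(X, residuals)
    # Add a damped correction to the running prediction.
    prediction += learning_rate * tree.predict(X)
    mse_history.append(float(np.mean((y - prediction) ** 2)))

print(f"training MSE before boosting: {mse_history[0]:.2f}")
print(f"training MSE after boosting:  {mse_history[-1]:.2f}")
```

The training error shrinks round by round because every added tree targets what the previous trees got wrong, in contrast to bagging, where trees are trained independently.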
Linear Regression displayed the best performance for this dataset and can be used for
deployment.
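The report does not state here which metric was used for the comparison. As one possible approach, the sketch below evaluates two models on a held-out test split using R² and RMSE, both of which are assumptions. The data is synthetic and linear by construction, so the printed scores say nothing about the project's actual results.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)

# Synthetic stand-in data (not the project's data set).
X = rng.uniform(0.0, 10.0, size=(300, 2))
y = 5.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(0.0, 1.0, size=300)

# Hold out 25% of the rows so models are scored on data they have not seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "linear_regression": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    rmse = float(np.sqrt(mean_squared_error(y_test, y_pred)))
    scores[name] = {"r2": float(r2_score(y_test, y_pred)), "rmse": rmse}
    print(name, scores[name])
```

Scoring every candidate on the same held-out split keeps the comparison fair; a third entry for an XGBoost model could be added to the `models` dictionary in the same way.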
Random Forest Regressor and XGBoost Regressor are far behind, so can’t be
The model is deployed as a Flask web application written in Python, with a front end
built in HTML and CSS.
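A minimal sketch of what such a Flask deployment might look like is shown below. The route name, the JSON field names and the `predict_price` helper are hypothetical; the helper stands in for a call to the trained model, and the project's actual application serves HTML pages rather than this JSON endpoint.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def predict_price(area_sqft: float, circle_rate: float) -> float:
    """Hypothetical stand-in for the trained model's predict call."""
    return 0.03 * area_sqft + 0.5 * circle_rate + 4.0


@app.route("/predict", methods=["POST"])
def predict():
    # Read the JSON body; fall back to an empty dict if it is missing or invalid.
    payload = request.get_json(silent=True) or {}
    try:
        area = float(payload["area_sqft"])
        rate = float(payload["circle_rate"])
    except (KeyError, TypeError, ValueError):
        # Reject requests with missing or non-numeric fields.
        return jsonify(error="area_sqft and circle_rate must be numbers"), 400
    return jsonify(predicted_price=predict_price(area, rate))
```

During development the app could be started with the `flask run` command, and an HTML form styled with CSS could post the same two fields to this route.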
8. Conclusion
Our aim has been achieved, as we have successfully met all the parameters set out in
our aim. Circle rate is the most effective attribute in predicting the house price,
and Linear Regression is the most effective model for our