
BHARATI VIDYAPEETH’S COLLEGE OF ENGINEERING, KOLHAPUR
BACHELOR OF TECHNOLOGY
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

A REPORT

ON

“HOUSE PRICE PREDICTION SYSTEM”


SUBMITTED BY
1) Mr. Soham Narayan Bhosale 03
2) Mr. Vishwavardhan Sunil Chougule 04
3) Mr. Nikhil Abhiman Jadhav 11
4) Mr. Ganesh Vitthal Karande 16
5) Mr. Shahid Munir Mangole 18

UNDER THE GUIDANCE OF


Mrs. S.M.Mulla

Year 2021-22
CERTIFICATE

This is to certify that Mr. Soham Narayan Bhosale, Mr. Vishwavardhan Sunil
Chougule, Mr. Nikhil Abhiman Jadhav, Mr. Ganesh Vitthal Karande and
Mr. Shahid Munir Mangole of CSE (Computer Science and Engineering) have
submitted the project report entitled “House Price Prediction System” as per the
rules of Shivaji University, Kolhapur, in the year 2021-2022. This project report is
the record of the students' own work, carried out by them under supervision and
guidance in a satisfactory manner.

Mrs. S. M. Mulla        Mrs. S. M. Mulla        Dr. V. R. Ghorpade
Guide                   H.O.D                   Principal


DECLARATION

We, the undersigned, Mr. Soham Narayan Bhosale, Mr. Vishwavardhan
Sunil Chougule, Mr. Nikhil Abhiman Jadhav, Mr. Ganesh Vitthal Karande and
Mr. Shahid Munir Mangole of CSE (Computer Science and Engineering), declare that
the project on “House Price Prediction System” has been done by our
team under the guidance of Mrs. S. M. Mulla in partial fulfillment of the Engineering
program during the academic year 2021-22. All the data represented in this project is
true and correct to the best of our knowledge and belief.

We also declare that this project report is our team's own preparation and is not
copied from anywhere else.

Place:- Kolhapur

Date:-

Roll No.    Name                                Signature
03          Mr. Soham Narayan Bhosale
04          Mr. Vishwavardhan Sunil Chougule
11          Mr. Nikhil Abhiman Jadhav
16          Mr. Ganesh Vitthal Karande
18          Mr. Shahid Munir Mangole
ACKNOWLEDGEMENT

We express our deep and sincere gratitude to our project guide, Mrs. S. M. Mulla,
under whose able guidance the whole project work was carried out.

We would also like to thank all the other teachers of the CSE Department
for their valuable guidance and timely suggestions, which helped us throughout the
project.

We also express our gratitude and thanks to all staff members, colleagues
and all those who were directly or indirectly involved in our project.

• Mr. Soham Narayan Bhosale
• Mr. Vishwavardhan Sunil Chougule
• Mr. Nikhil Abhiman Jadhav
• Mr. Ganesh Vitthal Karande
• Mr. Shahid Munir Mangole
INDEX

SR NO.  TITLE

1.      INTRODUCTION
1.1     PROBLEM DESCRIPTION
1.2     OBJECTIVE
2.      EXISTING SYSTEM
2.1     DESCRIPTION
2.2     DRAWBACKS
3.      SYSTEM ARCHITECTURE
3.1     DATA FLOW DIAGRAM
3.2     DESCRIPTION WITH MODULES
4.      REQUIREMENT ANALYSIS
4.1     SOFTWARE REQUIREMENTS
4.2     HARDWARE REQUIREMENTS
5.      IMPLEMENTATION DETAILS
5.1     ALGORITHM
5.2     METHODOLOGY
6.      RESULT
7.      CONCLUSION
8.      REFERENCES
House Price Prediction System

1. INTRODUCTION
Machine learning is a subfield of Artificial Intelligence (AI) that works with algorithms
and technologies to extract useful information from data. Machine learning methods are
appropriate for big data, since manually processing vast volumes of data would be
impossible without the support of machines. Machine learning in computer science attempts to
solve problems algorithmically rather than purely mathematically. It is therefore based on
creating algorithms that permit the machine to learn. There are two general groups in
machine learning: supervised and unsupervised. In supervised learning, the program
is trained on a pre-determined, labelled dataset so that it can make predictions when new data is
given. In unsupervised learning, the program tries to find relationships and hidden patterns in the data.

Several machine learning algorithms are used to solve problems in the real world
today. However, some of them perform better than others in certain circumstances, as stated in
the No Free Lunch Theorem. Thus, this report attempts to use regression algorithms and
an artificial neural network (ANN) to compare their performance when it comes to predicting
values of a given dataset.

The performance will be measured on predicting house prices, since the prediction in
many regression algorithms relies not on one specific feature but on an unknown number of
attributes that together determine the value to be predicted. House prices depend on the
specification of the individual house. Houses have a varying number of features, and the same
features may not have the same cost because of location. For instance, a big house may have a
higher price if it is located in a desirable, affluent area than if it is in a poor neighbourhood.

CSE, BVCOEK 1

1.1 Problem Description

The problems with the current system are as follows.

Based on observations, some people know the house prices of particular areas, but many
people are unaware of them. There is no proper web-based application that can
fulfill a user’s need to know the house price of a particular area. Existing systems
can store house prices, but people tend to keep house prices secret,
so buyers resort to phone calls in order to learn the prices in an area.

These types of problems can be overcome by switching to a web-based application.

Phone calls are also limited in features compared with a
web-based system. For example, a customer may call an agent about a particular
house and still not learn its exact price.


1.2 Objective

The main objectives of this project are:

• To develop a web-based system that will help people to predict house prices.
• To help a company advertise its house price prediction and other services
by making the system available online.


2. EXISTING SYSTEM
2.1 Description

The resulting data is fed into a machine learning model. To find the best algorithm and
parameters for the model, we mostly employ K-fold cross-validation and the
GridSearchCV approach.
It turns out that the linear regression model produces the best results for our data, with
a score of more than 80%, which is acceptable.
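
The model selection step described above can be sketched as follows. This is a minimal illustration that assumes scikit-learn is available; the synthetic data, the candidate models and the parameter grids are placeholders chosen for the example and are not the project's actual configuration.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the cleaned housing data (assumption):
# the price depends linearly on three numeric features plus noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 3))
y = 50 * X[:, 0] + 20 * X[:, 1] + 5 * X[:, 2] + rng.normal(0, 1, size=200)

# Candidate models with small, illustrative parameter grids.
candidates = {
    "linear_regression": (LinearRegression(), {"fit_intercept": [True, False]}),
    "lasso": (Lasso(), {"alpha": [0.1, 1.0]}),
    "decision_tree": (DecisionTreeRegressor(random_state=0), {"max_depth": [3, 5]}),
}

# 5-fold cross-validation; every parameter combination is scored on each fold.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
results = {}
for name, (model, grid) in candidates.items():
    search = GridSearchCV(model, grid, cv=cv, scoring="r2")
    search.fit(X, y)
    results[name] = (search.best_score_, search.best_params_)

best_model = max(results, key=lambda name: results[name][0])
for name, (score, params) in results.items():
    print(f"{name}: mean R2 = {score:.3f}, best params = {params}")
```

Each entry in `results` holds the best mean cross-validated R2 and the parameters that produced it, so the models can be compared on equal terms before one is chosen.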
Next, we export the model as a pickle file (Bengaluru_House_Data.pickle);
pickling transforms Python objects into a byte stream. Also, in order to interact with
the locations (columns) from the frontend, we export them into a JSON
file (columns.json).
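
A minimal sketch of this export step is shown below, using only the standard library. The dictionary standing in for the trained model, the column names and the `data_columns` key in the JSON file are assumptions made for illustration; only the two file names come from the report.

```python
import json
import pickle
import tempfile
from pathlib import Path

# Stand-in for the trained model (assumption): a plain dictionary of
# coefficients. In the project this would be the fitted estimator object.
model = {"coef": [0.05, 5.0, 20.0], "intercept": 2.0}

# Hypothetical feature columns, including a one-hot encoded location.
columns = ["total_sqft", "bedrooms", "whitefield"]

out_dir = Path(tempfile.mkdtemp())
model_path = out_dir / "Bengaluru_House_Data.pickle"
columns_path = out_dir / "columns.json"

# pickle writes a byte stream, so the file is opened in binary mode.
with open(model_path, "wb") as f:
    pickle.dump(model, f)

# The JSON layout (a single "data_columns" key) is an assumption.
with open(columns_path, "w") as f:
    json.dump({"data_columns": columns}, f)

# Load both files back, as a serving script would do at start-up.
with open(model_path, "rb") as f:
    loaded_model = pickle.load(f)
with open(columns_path) as f:
    loaded_columns = json.load(f)["data_columns"]
```

Keeping the column order in a separate JSON file lets the frontend list the available locations without having to unpickle the model.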

2.2 Advantages and Drawbacks


Advantages
1. Saves time and cost.
2. The House Price Prediction system gives clients easy access.
3. The client can make further arrangements based on the house cost.
4. It reduces paperwork, as transactions, invoicing, etc. can be done
online.

Drawbacks
1. Details are stored on paper.
2. Maintenance is a huge problem.
3. Updating and changing details are tedious tasks.
4. Performance does not meet the requirements.


3. SYSTEM ARCHITECTURE

Fig (1)


3.1 Data Flow Diagram

Fig (2)


3.2 Algorithms Used

Linear Regression

Linear regression is one of the simplest and most popular machine learning algorithms. It is a
statistical method used for predictive analysis. Linear regression makes predictions for
continuous (real-valued, numeric) variables such as sales, salary, age, product price, etc.

The linear regression algorithm models a linear relationship between a dependent variable (y) and one or
more independent variables (x), hence the name linear regression. Because the relationship
is linear, the model shows how the value of the dependent variable
changes with the value of the independent variable.
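
As an illustration, the sketch below fits a straight line with NumPy's least-squares solver. The toy data (a price equal to 3 times the area plus 10) and the variable names are assumptions made for the example.

```python
import numpy as np

# Toy data (assumption): price = 3 * area + 10, with no noise.
area = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
price = 3.0 * area + 10.0

# Design matrix with a column of ones so that the intercept is fitted too.
A = np.column_stack([area, np.ones_like(area)])
(slope, intercept), *_ = np.linalg.lstsq(A, price, rcond=None)

# Prediction for an unseen area of 6.
predicted = slope * 6.0 + intercept
```

Here `slope` is the change in the dependent variable per unit change in the independent variable, and `intercept` is the predicted value when the independent variable is zero.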

Lasso Regression

Lasso regression is similar to Ridge regression, except that the penalty term contains the absolute
values of the weights instead of their squares. Because it uses absolute values, it can shrink a
slope to exactly 0, whereas Ridge regression can only shrink it close to 0. It is also called L1
regularization.
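
The difference in shrinkage can be seen in the single-feature case, where both penalised slopes have a closed form. The sketch below is a simplified illustration without an intercept; the function names, the toy data and the value of lambda are assumptions.

```python
def soft_threshold(value, lam):
    """Shrink value towards zero by lam, clipping at exactly zero."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_slope(x, y, lam):
    # Minimises 0.5 * sum((y - w*x)^2) + lam * |w| for a single feature.
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    return soft_threshold(sxy, lam) / sxx


def ridge_slope(x, y, lam):
    # Minimises 0.5 * sum((y - w*x)^2) + 0.5 * lam * w^2 for a single feature.
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    return sxy / (sxx + lam)


x = [1.0, 2.0, 3.0]
y = [0.1, 0.2, 0.3]  # weak relationship: the unpenalised slope is 0.1
lam = 2.0

w_lasso = lasso_slope(x, y, lam)
w_ridge = ridge_slope(x, y, lam)
print(w_lasso, w_ridge)
```

With this penalty the Lasso slope is clipped to exactly zero, while the Ridge slope is reduced but remains non-zero.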

Ridge Regression

Ridge regression is a type of linear regression in which a small amount of bias is
introduced so that we can get better long-term predictions.

Ridge regression is a regularization technique used to reduce the complexity of the
model. It is also called L2 regularization.

In this technique, the cost function is altered by adding a penalty term to it. The amount of
bias added to the model is called the Ridge regression penalty. It is calculated by multiplying
lambda by the sum of the squared weights of the individual features.
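
A minimal NumPy sketch of this penalised cost function is given below. It uses the closed-form solution without an intercept term; the synthetic data and the chosen values of lambda are assumptions for illustration.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form Ridge weights: solve (X^T X + lam * I) w = X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([4.0, -2.0, 1.0]) + rng.normal(scale=0.1, size=50)

w_ols = ridge_fit(X, y, lam=0.0)     # lam = 0 reduces to ordinary least squares
w_ridge = ridge_fit(X, y, lam=10.0)  # a larger lam shrinks the weights
```

Increasing `lam` trades a little bias for smaller weights, which is the complexity reduction described above.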

XGBoost

XGBoost is a machine learning method used in classification and regression tasks, among others. It
builds a prediction model as an ensemble of weak prediction models, typically decision
trees.
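
The idea of combining many weak trees can be sketched as follows. This example uses scikit-learn's GradientBoostingRegressor as a stand-in for the XGBoost library, since both are gradient-boosted tree ensembles; the toy data and the hyperparameters are assumptions.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.tree import DecisionTreeRegressor

# Toy non-linear data (assumption): y = x^2 on a regular grid.
X = np.linspace(-3, 3, 120).reshape(-1, 1)
y = X.ravel() ** 2

# A single weak learner: a decision tree with one split (a "stump").
stump = DecisionTreeRegressor(max_depth=1, random_state=0).fit(X, y)

# An ensemble of 100 stumps, each fitted to the errors left by the previous ones.
boosted = GradientBoostingRegressor(
    n_estimators=100, max_depth=1, learning_rate=0.1, random_state=0
).fit(X, y)

stump_r2 = stump.score(X, y)
boosted_r2 = boosted.score(X, y)
print(f"single stump R2 = {stump_r2:.3f}, boosted ensemble R2 = {boosted_r2:.3f}")
```

On its own, one stump can only split the data once; the ensemble fits the curve far more closely because every new tree corrects the residual error of the trees before it.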


Random Forest Regressor

Random Forest is a popular machine learning algorithm that belongs to the supervised learning
technique. It can be used for both classification and regression problems in ML. It is based
on the concept of ensemble learning, which is the process of combining multiple models to
solve a complex problem and to improve performance.

As the name suggests, a random forest contains a number of decision trees built
on various subsets of the given dataset. Instead of relying on one decision tree, the random
forest takes the prediction from each tree and combines them into the final output: the
majority vote of the trees for classification, or the average of their predictions for regression.

A greater number of trees in the forest generally leads to more stable accuracy and reduces the
risk of overfitting.
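
For regression, the combination step is an average over the trees. The sketch below checks this with scikit-learn's RandomForestRegressor; the synthetic data and the number of trees are assumptions for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 2))
y = 3 * X[:, 0] + X[:, 1] ** 2

forest = RandomForestRegressor(n_estimators=20, random_state=0).fit(X, y)

# Predict with every individual tree, then average the results by hand.
X_new = np.array([[5.0, 2.0], [1.0, 8.0]])
per_tree = np.array([tree.predict(X_new) for tree in forest.estimators_])
manual_average = per_tree.mean(axis=0)

# The forest's own prediction should match the hand-computed average.
forest_prediction = forest.predict(X_new)
```

Comparing `manual_average` with `forest_prediction` shows that the forest output is simply the mean of the individual tree outputs.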

Support Vector Machine

Support Vector Machine or SVM is one of the most popular Supervised Learning algorithms,
which is used for Classification as well as Regression problems. However, primarily, it is used
for Classification problems in Machine Learning.

The goal of the SVM algorithm is to create the best line or decision boundary that can segregate
n-dimensional space into classes so that we can easily put the new data point in the correct
category in the future. This best decision boundary is called a hyperplane.
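
A small classification sketch of the hyperplane idea follows, assuming scikit-learn. The six toy points and their labels are made up for the example; with a linear kernel, the fitted boundary is the set of points where w . x + b = 0.

```python
import numpy as np
from sklearn.svm import SVC

# Two linearly separable groups of points (toy data, assumption).
X = np.array([[1.0, 1.0], [1.5, 2.0], [2.0, 1.0],
              [6.0, 6.0], [6.5, 7.0], [7.0, 6.0]])
labels = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear", C=1.0).fit(X, labels)

# For a linear kernel the hyperplane is w . x + b = 0.
w, b = clf.coef_[0], clf.intercept_[0]

# A new point is classified by the side of the hyperplane it falls on.
new_point = np.array([[6.0, 5.0]])
side = np.sign(new_point @ w + b)
predicted = clf.predict(new_point)
```

For regression tasks such as price prediction, scikit-learn provides the related `SVR` estimator, which fits a function instead of a class boundary.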


4. REQUIREMENT ANALYSIS (H/W & S/W)

4.1 Software Requirements
Operating System: Windows 10 Home and above
Language: Python, HTML
Source Code Editor: Google Colab, Visual Studio Code, Jupyter Notebook
Technology: Machine learning

4.2 Hardware Requirements
Processor: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz 2.42 GHz
Graphics card: 1080 Ti OC
RAM: 4 GB and above
Hard disk: 512 GB SSD


5. IMPLEMENTATION DETAILS

5.1 Algorithm

Step 1: Start.
Step 2: Select the number of bedrooms.
Step 3: Select the number of balconies.
Step 4: Select the total square footage.
Step 5: Select the area type.
Step 6: Select the house location.
Step 7: Press the button “predict house price”.
Step 8: The house price is predicted.
Step 9: Press Enter to exit.
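
The steps above can be sketched as a small function that turns the selected inputs into a feature vector and applies a model to it. Everything here is hypothetical: the column names, the coefficients and the intercept stand in for the trained model and were chosen only to make the example runnable.

```python
# Hypothetical feature columns and coefficients standing in for the trained model.
COLUMNS = ["total_sqft", "bedrooms", "balcony",
           "area_type_plot_area", "location_whitefield", "location_hebbal"]
COEFFICIENTS = [0.05, 5.0, 1.0, 10.0, 20.0, 15.0]
INTERCEPT = 2.0


def build_features(bedrooms, balcony, total_sqft, area_type, location):
    """Turn the form inputs (Steps 2 to 6) into a numeric feature vector."""
    features = [0.0] * len(COLUMNS)
    features[COLUMNS.index("total_sqft")] = float(total_sqft)
    features[COLUMNS.index("bedrooms")] = float(bedrooms)
    features[COLUMNS.index("balcony")] = float(balcony)
    # One-hot encode the categorical choices; unknown values stay all-zero.
    area_key = "area_type_" + area_type.lower().replace(" ", "_")
    location_key = "location_" + location.lower().replace(" ", "_")
    for name in (area_key, location_key):
        if name in COLUMNS:
            features[COLUMNS.index(name)] = 1.0
    return features


def predict_house_price(bedrooms, balcony, total_sqft, area_type, location):
    """Steps 7 and 8: apply the stand-in linear model to the inputs."""
    features = build_features(bedrooms, balcony, total_sqft, area_type, location)
    return INTERCEPT + sum(c * v for c, v in zip(COEFFICIENTS, features))


price = predict_house_price(2, 1, 1000, "Plot Area", "Whitefield")
print(price)
```

A location that is not among the known columns leaves all location flags at zero, so the prediction falls back to the baseline for the remaining features.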


5.2 Methodology
The project pre-processes the data and evaluates the prediction accuracy of the
models. The experiment has multiple stages that are required to get the prediction results.
These stages can be defined as:
- Pre-processing: the dataset will be checked and pre-processed. The pre-processing
methods handle data in different ways, so the pre-processing is done over multiple
iterations, and each time the accuracy is evaluated for the combination of methods used.
- Data splitting: dividing the dataset into two parts is essential, so that the model is trained on one
part and evaluated on the other. The dataset will be split 75% for training and 25% for testing.
- Evaluation: the accuracy will be evaluated by measuring the R2 and RMSE
when training the model, alongside a comparison of the actual prices in the test dataset
with the prices predicted by the model.
- Performance: alongside the evaluation metrics, the time required to train the model will be
measured to show how the algorithms vary in terms of time.
- Correlation: the correlation between the available features and the house price will be evaluated
using the Pearson correlation coefficient to identify whether the features have a negative,
positive or zero correlation with the house price.
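
The stages listed above can be sketched with NumPy alone, as shown below. The synthetic data, the feature names and the plain least-squares model are assumptions standing in for the project's dataset and models.

```python
import time

import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in for the housing data (assumption).
n = 400
sqft = rng.uniform(500, 3000, n)
bedrooms = rng.integers(1, 6, n).astype(float)
price = 0.05 * sqft + 8.0 * bedrooms + rng.normal(0, 5, n)
X = np.column_stack([sqft, bedrooms])

# Correlation: Pearson coefficient between each feature and the price.
pearson = {name: float(np.corrcoef(col, price)[0, 1])
           for name, col in [("sqft", sqft), ("bedrooms", bedrooms)]}

# Data splitting: shuffle, then 75% for training and 25% for testing.
order = rng.permutation(n)
cut = int(0.75 * n)
train, test = order[:cut], order[cut:]

# Performance: time the training of a least-squares model with an intercept.
start = time.perf_counter()
A_train = np.column_stack([X[train], np.ones(cut)])
weights, *_ = np.linalg.lstsq(A_train, price[train], rcond=None)
train_seconds = time.perf_counter() - start

# Evaluation on the held-out 25%: RMSE and R2.
A_test = np.column_stack([X[test], np.ones(n - cut)])
predicted = A_test @ weights
residual = price[test] - predicted
rmse = float(np.sqrt(np.mean(residual ** 2)))
total = np.sum((price[test] - price[test].mean()) ** 2)
r2 = float(1 - np.sum(residual ** 2) / total)

print(f"Pearson: {pearson}")
print(f"RMSE = {rmse:.2f}, R2 = {r2:.3f}, training time = {train_seconds:.4f} s")
```

RMSE is expressed in the same units as the price, while R2 is unitless, which is why the two metrics are usually reported together.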


6. RESULT

This program provides the user with a predicted price of the house. It gives users who are
unaware of house prices an idea of what a house costs. It provides a direct numerical
prediction of the price, which is easy for the user to understand.
It provides a very simple output, in rupee format. It helps users plan their
construction projects based on the price, and it also helps in selling a house.
The user interface of the program is simple to use and understand.
This model is 98% efficient at estimating the price of the house.
It predicts the nearest correct price of the house based on the dataset it was provided with.
The following are the basic functionalities of the House Price Prediction project in Python
with source code.

Main Page:
When you start the project from a development environment or by running the App.py
file, you will see the image shown below, which is the main page for taking the user input
and for predicting the price of the house.

User Input:
In the below image the user inputs the data in the specified blocks needed for predicting the
house price.


View Predicted Price:


Here, as shown in the below image the user gets to view the predicted price of the house as the
output of the project.

Heatmap
A heatmap is a two-dimensional graphical representation of data where the individual values
that are contained in a matrix are represented as colours.
The Seaborn package allows the creation of annotated heatmaps which can be tweaked using
Matplotlib tools as per the creator’s requirement.


Fig (3)
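
An annotated correlation heatmap of this kind can be sketched as follows, assuming pandas, Seaborn and Matplotlib are installed. The column names and the synthetic values are placeholders and do not reproduce the figure above.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; not needed inside a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the housing data (assumption).
rng = np.random.default_rng(0)
sqft = rng.uniform(500, 3000, 100)
df = pd.DataFrame({
    "total_sqft": sqft,
    "bedrooms": rng.integers(1, 6, 100),
    "price": 0.05 * sqft + rng.normal(0, 10, 100),
})

# Pearson correlation matrix, drawn as an annotated heatmap.
corr = df.corr()
fig, ax = plt.subplots(figsize=(5, 4))
sns.heatmap(corr, annot=True, cmap="coolwarm", vmin=-1, vmax=1, ax=ax)
ax.set_title("Feature correlation")
```

Fixing `vmin` and `vmax` at -1 and 1 keeps the colour scale symmetric, so positive and negative correlations are directly comparable.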

Histogram
A histogram is a graphical representation of data points organized into user-specified ranges. Similar
in appearance to a bar graph, the histogram condenses a data series into an easily interpreted visual by
taking many data points and grouping them into logical ranges or bins.


Fig (4)
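
A histogram like this can be produced with Matplotlib, as in the sketch below. The synthetic prices and the number of bins are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; not needed inside a notebook
import matplotlib.pyplot as plt
import numpy as np

# Synthetic prices (assumption).
rng = np.random.default_rng(0)
prices = rng.normal(loc=100, scale=20, size=500)

# Group the 500 values into 10 equal-width bins and draw one bar per bin.
fig, ax = plt.subplots()
counts, bin_edges, _ = ax.hist(prices, bins=10, edgecolor="black")
ax.set_xlabel("Price")
ax.set_ylabel("Number of houses")
```

The `counts` array holds the number of data points in each bin, and `bin_edges` holds the boundaries of the ranges, which always number one more than the bins.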

Scatter Plot
The Matplotlib module has a method for drawing scatter plots. It needs two arrays of the
same length, one for the values on the x-axis and one for the values on the y-axis:

Fig (5)
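
A minimal scatter plot sketch is shown below. The two arrays are synthetic stand-ins for area and price, chosen only to illustrate the call.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; not needed inside a notebook
import matplotlib.pyplot as plt
import numpy as np

# Two arrays of the same length (synthetic data, assumption).
rng = np.random.default_rng(0)
total_sqft = rng.uniform(500, 3000, 50)
price = 0.05 * total_sqft + rng.normal(0, 10, 50)

fig, ax = plt.subplots()
points = ax.scatter(total_sqft, price)  # x values first, then y values
ax.set_xlabel("Total sq. ft.")
ax.set_ylabel("Price")
```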


Subplot

Subplots are groups of axes that can exist in a single Matplotlib figure. The subplots()
function in the Matplotlib library helps in creating multiple layouts of subplots. It
provides control over all the individual plots that are created.

Fig (6)
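
The sketch below places two plots side by side in one figure with `plt.subplots()`. The data and the choice of a histogram and a scatter plot are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; not needed inside a notebook
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data (assumption).
rng = np.random.default_rng(0)
total_sqft = rng.uniform(500, 3000, 50)
price = 0.05 * total_sqft + rng.normal(0, 10, 50)

# One figure with a 1 x 2 grid of axes; each axes object is controlled separately.
fig, axes = plt.subplots(1, 2, figsize=(10, 4))
axes[0].hist(price, bins=10)
axes[0].set_title("Price distribution")
axes[1].scatter(total_sqft, price)
axes[1].set_title("Price vs. area")
fig.tight_layout()
```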

Pairplot
A pairplot comes in handy for exploratory data analysis
(“EDA”).
It plots every pair of variables in the given data against each other to reveal the relationships
between them; the variables can be continuous or categorical.

Fig (7)


Implemented Schedule

PERIOD      WORK TO BE COMPLETED
1st Week    Created a group and planned the project. Decided the subject to work on.
2nd Week    Requirement analysis of the data, the users and the software to be used.
3rd Week    Designed all modules and decided the flow of the system.
4th Week    Coding of all the modules was done.
5th Week    All testing of data and validation of modules were done. We made sure that
            the system provides accurate results according to the requirements.


7. CONCLUSION

Thus, we have completed our project entitled “House Price Prediction” using machine
learning with Python.

For this project we studied how to download a dataset, clean the data and train a model
using machine learning algorithms.

This project was very useful for developing our logical thinking and our knowledge of software
engineering.


8. REFERENCES

• Kristianstad University, Sweden Report
• Websites:
  www.google.com
  www.w3schools.com
  www.geeksforgeeks.com
  www.programiz.com


Name of the Project group members E-mail id Phone No. Signature

Mr. Soham Narayan Bhosale bhosalesoham7206@gmail.com 9579460160

Mr. Vishwavardhan Sunil Chougule vishuchougle0512@gmail.com 8080568410

Mr. Nikhil Abhiman Jadhav njadhav2624@gmail.com 8208807478

Mr. Ganesh Vitthal Karande karandeganesh049@gmail.com 7218484503

Mr. Shahid Munir Mangole shahidmmangole7864@gmail.com 7745017796

Mrs. S. M. Mulla            Mrs. S. M. Mulla
PROJECT GUIDE               H.O.D-C.S.E

                            Dr. V. R. Ghorpade
EXTERNAL EXAMINER           PRINCIPAL
