Professional Documents
Culture Documents
Project Synopsis
PREDICTION MCSP-232 (MCA_NEW)
Table of Contents
1. Project Title --------------------------------------------------------------------------------------------- 2
2. Introduction ------------------------------------------------------------------------------------------- 2
2.1. Objectives ------------------------------------------------------------------------------------ 3
3. Project category ------------------------------------------------------------------------------------ 4
4. Tools/Platform ------------------------------------------------------------------------------------ 4
4.1. Software platform --------------------------------------------------------------------------- 4
4.2. Hardware platform --------------------------------------------------------------------------- 4
4.3. Tools ---------------------------------------------------------------------------------------------- 4
5. Problem Statement ---------------------------------------------------------------------------------- 5
5.1. Project Description -------------------------------------------------------------------------- 5
6. Scope of Work ----------------------------------------------------------------------------------------- 7
6.1. Advantages of project ----------------------------------------------------------------------- 7
7. Analysis --------------------------------------------------------------------------------------------------- 7
7.1. DFD ----------------------------------------------------------------------------------------------- 8
7.2. Architectural Diagram ----------------------------------------------------------------------- 9
8. Overall Architecture -------------------------------------------------------------------------------- 10
i. Project Pipeline -------------------------------------------------------------------------- 10
ii. Data Cleaning ----------------------------------------------------------------------------- 10
iii. EDA ------------------------------------------------------------------------------------------ 11
iv. Feature Engineering -------------------------------------------------------------------- 12
v. Machine Learning ------------------------------------------------------------------- 12
vi. Model Selection ---------------------------------------------------------------------------- 13
vii. Training -------------------------------------------------------------------------------- 15
viii. Evaluation ------------------------------------------------------------------------------- 16
9. Limitations of the project --------------------------------------------------------------------------- 18
10. Further Enhancement -------------------------------------------------------------------------------- 19
10.1. Conclusion -------------------------------------------------------------------------------------- 19
11. Bibliography --------------------------------------------------------------------------------------------- 20
Page 1 of 20
IGNOU Version1.1 Project Synopsis
1. PROJECT TITLE
This project titled as “HOUSE PRICE PREDICTION” is a machine learning
based project that can be used to predict the most accurate price of a
property in a very efficient, secured and fastest manner.
2. INTRODUCTION
One of the basic requirements of livelihood in the recent world is to buy a
house of your own.
The price of house may depend on various sectors. Real estate agents and
manhole are involved in selling a house want a tag price on the house
which would be the real worth of buying the house. The prediction of the
price of the house is often very hard for the inexperienced.
House is a fundamental need to someone and them costs range from place
to place primarily based totally at the facilities to be had like parking area,
location, etc. The residence price is a factor that issues a lot of citizens
either wealthy or white-collar magnificence as possible by no means decide
or gauge the value of a residence primarily based totally on place or places
of work accessible. Purchasing a residence is the best and unique desire of
a own circle of relatives because this expends the whole thing of their
funding budget and every now and then cover them under loan. It is a hard
challenge for expecting the correct value of residence price. This proposed
version could make viable who are expecting the precise costs of house.
This model of Linear regression in machine learning takes the internal
factor of house valuation dependencies like area, no of bedrooms, locality
etc. and external factors like air pollution and crime rates.
Here in this project, we are going to machine learning to build a predictive
model for estimation of the house price for real estate customers. In this
project we are going to build the Machine learning model by using python
programming and other python tools like NumPy, pandas, matplotlib etc.
We are also using scikit-learn library in our approach of this project. For
going further into this project, we prepare dataset which consists of features
like no of bedrooms, area of house, locality, etc. After preparing the dataset
we will use 80 percentage of data for training the ml model and 20
percentage of data for testing the ml model.
Page 2 of 20
IGNOU Version1.1 Project Synopsis
Buying a house is one of the most important decisions for an individual. The
price of a house is dependent on wide factors which are its features such
as no of bedrooms, area of construction, locality of the house etc. and
many external factors such as air pollution and crime rate. These all factors
make the prediction of price of the house more complex. This house price
prediction is beneficial for many real estates. So, an easier and accurate
method is needed for the prediction of house price. This project is proposed
to predict the price of a house for real estate customers based on their
preferences like no of rooms, no of bedrooms, area of the house, locality
etc. This machine learning model to predict the house price will be very
helpful for the customers, because it is being the major issue for the people,
as they can never estimates the price of the house on the basis of locality
or other features available.
2.1. OBJECTIVES
This project provides us an overview on how to predict house prices using
various machine learning models with the help of different Python libraries.
This proposed model considers as the most accurate model used for
calculating the house price and provides the most accurate prediction. This
provides a brief introduction which will be needed to predict the house
price. This project consists of what and how the house price model works
with the assistance of machine learning technique using scikit - learn and
which data sets will be using in our proposed model.
Predicting the price of a house helps to determine the selling price of house
in a particular region and it help people to find correct time to buy a house.
In this task on house price prediction using machine learning, our task is to
use data to create a machine learning model to predict houses in the given
region. By using real world data entities, we are going to predict the price of
the house in that area.
Page 3 of 20
IGNOU Version1.1 Project Synopsis
3. PROJECT CATEGORY
Machine Learning (ML), is a type of artificial intelligence (AI) that allows
software applications to become more accurate at predicting outcomes
without being explicitly programmed to do so.
ML project is journey. This journey typically involves an agile process of
data discovery, feasibility study, building a minimum viable model (MVM)
and finally deploying that model into
prediction.
b) Hardware Platforms:-
i) Minimum 7th Generation Intel® Core i7 Processors.
ii) 1CB RAM.
iii) 512GB disk space.
iv) A minimum of 256 GB storage.
v) High Definition or FHD display.
c) Tools: -
(i) Language : - I used Python for the project. Jupyter notebooks
are popular among data scientist because they are easy to follow
and show your working steps.
Page 4 of 20
IGNOU Version1.1 Project Synopsis
(ii) Libraries: - There are frameworks in python to handle commonly
required tasks. I implore any budding data scientists to familiarise
themselves with these libraries :
a) Pandas : - for handling structured data
b) Scikit Learn : - For machine learning
c) Numpy :- For linear algebra and mathematics
d) Seaborn and Matplotlib :- For data visualisation
5. PROBLEM STATEMENT
The prices of real estate properties are sophisticatedly linked with our
economy. Despite this, we do not have accurate measures of house
prices based on the vast amount of data available.
Proper and justified prices of properties can bring in a lot of
transparency and trust back to real estate industry which is very
important for most customers especially in India.
Training
Deploying
Page 5 of 20
IGNOU Version1.1 Project Synopsis
Domain problem to ML problem: -
1. Domain Problem: In this project we are going to predict the price of
a house for real estate customers based on their preferences like
no of bedrooms, area of the house, locality etc.
2. ML problem: This problem can be solved using supervised
learning in ML. It is treated as a Regression problem as the target
is continuously valued.
Data collection:-In order to proceed further into this project, first of all
we should collect the data, after collecting a dataset, then we should
pre-process the data and then we do the exploratory data analysis on
the given data set.
6. SCOPE OF WORK
The scope of Machine Learning is not limited to the investment sector. Rather,
it is expanding across all fields such as banking and finance, information
technology, media & entertainment, gaming, and the automotive industry. As
the Machine Learning scope is very high, there are some areas where
researchers are working toward revolutionizing the world for the future.
7. ANALYSIS
The demand for commercial housing is generally divided into self-occupied
demand, investment demand, and speculative demand. Self-occupation
demand is self-occupation; investment demand is to buy commercial
housing and rent it out to obtain rental income; speculative demand buys
commercial housing in anticipation of rising house prices and sells it after
the price rises; and the purpose is to earn the price difference. The demand
for self-occupation and investment is the supporting force of the commercial
housing market. In particular, the demand for self-occupation has been
encouraged and supported by policies. In addition to driving up housing
prices, speculative demand squeezes out part of the demand for owner-
occupiers and blows up the housing market bubble, which is harmful to the
commercial housing market. This article combines the generalized linear
regression model to build a real estate price prediction model. Through the
simulation and comparison experiments, it can be seen that the housing
price forecasting system based on the generalized regression model
proposed in this article has a high housing price forecasting accuracy.
Page 7 of 20
IGNOU Version1.1 Project Synopsis
START
DATASET
MODEL DEPLOYMENT
Page 8 of 20
IGNOU Version1.1 Project Synopsis
Phase I:
Data in real time processes Data from Batch Processes
Phase II:
Data
Store
Phase III:
Data Feature
Transformation Model Engine
Generation
Phase IV:
Performance
Model Output
Tuning
Page 9 of 20
IGNOU Version1.1 Project Synopsis
8. OVERALL ARCHITECTURE:-
i. Project Pipeline :- Machine learning project follows the same
process : -
Data ingestion
Data cleaning
Exploratory data analysis
Feature engineering
Machine Learning
The pipeline is not linear and we have to jump back and forth
between stages.
ii. Data Cleaning: -
The data we collect are from different sources such as internet,
industry papers, etc. It is quite possible that we get the data in a
raw form. This raw data consists of several errors, unnecessary
data etc. and we cannot give this kind of data directly to our ML
Model as this will affect its accuracy. So we have to clean the data
to make it accessible to ML model. There are many kinds of
mistakes/errors in a data. Some of these are: -
Duplicates
Nan’s etc.
Page 10 of 20
IGNOU Version1.1 Project Synopsis
Page 11 of 20
IGNOU Version1.1 Project Synopsis
v. Machine Learning: -
We follow a standard development cycle for machine
learning. Basically, we have to go through many iterations of
the cycle before we are able to get my model working to a
high standard.
Page 12 of 20
IGNOU Version1.1 Project Synopsis
Select Machine
Evaluate Results
Learning Model
Training : Tune
Hyperparameters Set Hyper
with grid search parameter ranges
and cross for grid search
validation
Set number of
cross folds for
cross validations
Page 13 of 20
IGNOU Version1.1 Project Synopsis
Page 14 of 20
IGNOU Version1.1 Project Synopsis
vii. Training: -
In machine learning, “training” refers to the process of teaching our
model using examples from our training data set. In the training stage,
we’ll tune our model hyperparameters.
Hyperparameters: - Hyper parameters helps us adjust the
complexity of our model. There are some best practices on
what hyperparameters one should tune for each of the
models. Some of hyperparameters are: -
o max_depth — The maximum number of nodes for a
given decision tree.
Page 15 of 20
IGNOU Version1.1 Project Synopsis
viii. Evaluation: -
This is the last step in the process. This is the decisive step of the
whole process. We can use data visualisation to see the results of
each of our candidate models. If we are not happy with our end
results, we have to revisit our process at any of the stages from
data cleaning to machine learning. Our performance metric will
be the negative root mean squared error (NRMSE).
Page 16 of 20
IGNOU Version1.1 Project Synopsis
Page 17 of 20
IGNOU Version1.1 Project Synopsis
Page 18 of 20
IGNOU Version1.1 Project Synopsis
10.1. CONCLUSION
Buying your own house is what every human wish for. Using this
proposed model, we want people to buy houses and real estate at
their rightful prices and want to ensure that they don't get tricked
by sketchy agents who just are after their money. Additionally, this
model will also help Big companies by giving accurate predictions
for them to set the pricing and save them from a lot of hassle and
save a lot of precious time and money. Correct real estate prices
are the essence of the market and we want to ensure that by using
this model. The system is apt enough in training itself and in
predicting the prices from the raw data provided to it. After going
through several research papers and numerous blogs and articles,
a set of algorithms were selected which were suitable in applying
on both the datasets of the model. Predicting the prices of different
houses with various features and was able to handle large sums of
data. The system is quite user-friendly and time-saving.
Page 19 of 20
IGNOU Version1.1 Project Synopsis
11. BIBLIOGRAPHY
a. Python for Data Analysis book by O’Rielly
b. Hands on ML with scikit learn,Keras & Tensorflow by
O’Rielly
c. Machine Learning Yearning book by Andrew NG
d. Jain, N.., Kalra, P., &Mehrotra, D. (2020). Analysis of
Factoring Affecting Infant Mortality Rates Using
Decisions Tree . In Soft Computing: Theories and
Applications (pp. 639-656).Springers, Singapore.
e. Malhotra, R., & Sharma, A. (2018). Analyze Machine
Learning Techniques for Fault Prediction Using Web
Applications. Journal of Information Processing Systems
Page 20 of 20