
Exploratory Data Analysis (EDA)

Project Title:

Location-Based Flat Rate Data Analysis in Bangalore: Uncovering Real Estate Insights

Abstract
The real estate market is a dynamic and complex industry where property prices vary significantly based on
location. This analysis aims to explore the relationship between property types (flats, apartments, and villas)
and their prices in different locations. By analyzing historical property data, we can gain insights into how
location affects property prices and help potential buyers, sellers, and investors make informed decisions.

Introduction
In the real estate market, property prices are influenced by a multitude of factors, with location being a
primary determinant. This analysis focuses on examining the correlation between property types and their
prices across different regions using data sourced from the Makaan website.

Importing Required Libraries to Perform Web Scraping and Data Analysis
In [2]: from bs4 import BeautifulSoup
from scipy import stats
import re
import numpy as np
import pandas as pd
import requests
import matplotlib.pyplot as plt
import seaborn as sns

Understand Website Structure : MAKAAN


In [8]: url='https://www.makaan.com/bangalore-residential-property/buy-property-in-bangalore-cit

In [9]: page=requests.get(url)

In [10]: page
Out[10]: <Response [200]>

In [11]: page.status_code

Out[11]: 200

In [12]: soup=BeautifulSoup(page.text, "html.parser")  # name the parser explicitly

Perform Web Scraping


In [13]: #Location = ['Koramangala' ,'Bellandur' ,'Indira Nagar','Elecronic City' ,'HSR Layout' ,
Name_of_the_flat = []
Location = []
Type_of_House=[]
Type_of_BHK = []
Area_in_sq_ft = []
Construction_Status = []
Facing = []
#Nearby = []
New_or_Resale = []
Property_Price = []
pins=[50317 ,50270 ,50162 ,60730 ,50387 ,51641 ,50159 ,50160 ,50174 ,51512 ,50181]
for j in pins:
    # Pages 1 to 10 for each locality id
    for i in range(1,11):
        url=f"https://www.makaan.com/bangalore-property/malleswaram-flats-for-sale-{j}?p
        page=requests.get(url)
        soup=BeautifulSoup(page.text, "html.parser")
        # Each listing card on the page; every loop below appends exactly one
        # value per card (np.nan when the field is missing) so the lists stay aligned.
        container=soup.find_all("li",class_="cardholder")
        for card in container:
            a=card.find("td",class_="price")
            if a:
                Property_Price.append(a.text)
            else:
                Property_Price.append(np.nan)
        for card in container:
            card.find("div",class_="title-line-wrap")
            reg=re.findall(r"(\d\sBHK)",card.text)
            if reg:
                Type_of_BHK.append(reg[0])
            else:
                Type_of_BHK.append(np.nan)
        for card in container:
            card.find("div",class_='title-line-wrap')
            regex=re.findall("((?:Koramangala|Bellandur|Indira Nagar|Electronics City|HS
            if regex:
                Location.append(regex[0])
            else:
                Location.append(np.nan)

        for card in container:
            a=card.find('td',class_='size')
            if a:
                Area_in_sq_ft.append(a.text)
            else:
                Area_in_sq_ft.append(np.nan)

        for card in container:
            a=card.find('td',class_='val')
            if a:
                Construction_Status.append(a.text)
            else:
                Construction_Status.append(np.nan)

        for card in container:
            card.find("ul",class_="listing-details")
            reg=re.findall(r"(\w+\sfacing)",card.text)
            if reg:
                Facing.append(reg[0])
            else:
                Facing.append(np.nan)

        for card in container:
            a=card.find('li',class_='keypoint')
            if a:
                New_or_Resale.append(a.text)
            else:
                New_or_Resale.append(np.nan)

        for card in container:
            card.find("div",class_="title-line-wrap")
            reg=re.findall("((?:Apartment|Villa|Independent|Flat))",card.text)
            if reg:
                Type_of_House.append(reg[0])
            else:
                Type_of_House.append(np.nan)
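
The regular-expression part of the scraping cell can be tried without any network access. The sketch below is only an illustration: the sample card text is made up, the helper names `first_match` and `parse_card` are hypothetical, and missing fields are returned as `None` here rather than `np.nan`.

```python
import re

# Hypothetical card text; real listing cards may be laid out differently.
SAMPLE_CARD = "3 BHK Apartment in Koramangala 1945 sq ft Ready to move East facing"

def first_match(pattern, text):
    """Return the first regex match in text, or None when nothing matches."""
    found = re.search(pattern, text)
    return found.group(0) if found else None

def parse_card(text):
    """Pull the three regex-based fields out of one card's text."""
    return {
        "Type of BHK": first_match(r"\d\sBHK", text),
        "Facing": first_match(r"\w+\sfacing", text),
        "Type of House": first_match(r"Apartment|Villa|Independent|Flat", text),
    }

record = parse_card(SAMPLE_CARD)
print(record)
```

Returning a placeholder for every missing field is what keeps one entry per card in each list, which the length check that follows relies on.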

checking the lengths for each variable


In [14]: print(len(Facing))
print(len(New_or_Resale))
print(len(Type_of_House))
print(len(Type_of_House))
print(len(Construction_Status))
print(len(Area_in_sq_ft))
print(len(Location))
print(len(Property_Price))
print(len(Type_of_BHK))

1658
1658
1658
1658
1658
1658
1658
1658
1658
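
The same check can be made to fail loudly instead of being read off by eye. This is a small sketch with toy stand-ins for the scraped lists; `check_equal_lengths` is a hypothetical helper, not part of the notebook.

```python
def check_equal_lengths(columns):
    """Raise ValueError if the named lists do not all share one length."""
    lengths = {name: len(values) for name, values in columns.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"column lengths differ: {lengths}")
    return next(iter(lengths.values()), 0)

# Toy stand-ins for the scraped lists (hypothetical values).
scraped = {
    "Facing": ["East facing", "North facing"],
    "Location": ["Koramangala", "Ulsoor"],
    "Property_Price": ["2 Cr", "50 L"],
}
n_rows = check_equal_lengths(scraped)
print(n_rows)
```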

Create a DataFrame from the scraped data

In [15]: Flats_Data = {
'Type of House' : Type_of_House ,
'Type of BHK' : Type_of_BHK ,
'Location' : Location ,
'Area in sq.ft' : Area_in_sq_ft,
'Construction Status' : Construction_Status,
'Facing' : Facing ,
'New_or_Resale' : New_or_Resale ,
'PropertyPrice_in_lakhs' : Property_Price ,
}
df= pd.DataFrame(Flats_Data)

In [74]: df

Out[74]:
     Type of House  Type of BHK  Location          Area in sq.ft  Construction Status  Facing            New_or_Resale  PropertyPrice_in_lakhs  Pr…
197  Apartment      2 BHK        Electronics City  550            Ready to move        North facing      1 years ago    15
383  Apartment      1 BHK        Ulsoor            618            Ready to move        East facing       1 years ago    40
288  Apartment      2 BHK        Electronics City  625            Ready to move        East facing       2 years ago    18
187  Apartment      1 BHK        Electronics City  630            Ready to move        East facing       1 years ago    21
284  Apartment      1 BHK        Electronics City  640            Under Construction   North facing      1 years ago    22
...  ...            ...          ...               ...            ...                  ...               ...            ...
182  Independent    0 BHK        Electronics City  9000           Ready to move        East facing       9 years ago    155
44   Independent    5 BHK        Koramangala       9000           Ready to move        NorthEast facing  6 years ago    1500
128  Independent    9 BHK        Indira Nagar      9600           Ready to move        East facing       9 years ago    850
59   Independent    0 BHK        Koramangala       10000          Ready to move        NorthEast facing  9 years ago    1000
60   Independent    0 BHK        Koramangala       15000          Ready to move        NorthEast facing  9 years ago    7000

427 rows × 10 columns

In [17]: len(df)

Out[17]: 480

Export the DataFrame into CSV or EXCEL file


In [19]: df.to_csv("REALESTATE_PROJECT.csv" ,index=None )

In [20]: df=pd.read_csv("REALESTATE_PROJECT.csv")

DATA Cleaning
In [21]: df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 480 entries, 0 to 479
Data columns (total 8 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Type of House 480 non-null object
1 Type of BHK 480 non-null object
2 Location 480 non-null object
3 Area in sq.ft 480 non-null int64
4 Construction Status 480 non-null object
5 Facing 480 non-null object
6 New_or_Resale 480 non-null object
7 PropertyPrice_in_lakhs 480 non-null object
dtypes: int64(1), object(7)
memory usage: 30.1+ KB

In [22]: names_of_flats = [
'Sai Towers', 'Bhrundavan Gardens', 'Hanuman Enclaves', 'Prestige Park Grove',
'Prestige Eden Park', 'Prestige Meridian Park', 'Prestige Avalon Park',
'Prestige Aston Park', 'The Prestige City', 'Mahindra Eden',
'Brigade Komarla Heights', 'Prestige Green Gables'
]

In [23]: flat_names_repeated = [names_of_flats[i % len(names_of_flats)] for i in range(517)]

Renaming The Facing Values


In [25]: df["Facing"].replace({"east facing":"East facing","south facing":"South facing","west fa
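
A rule-based alternative to an explicit mapping is sketched below. It is an assumption, not the notebook's method: it upper-cases only the first character of each label, which would also cover mixed-case values such as `northEast facing` that still appear in later outputs. The helper name `normalise_facing` is hypothetical.

```python
def normalise_facing(label):
    """Upper-case only the first character, leaving the rest untouched."""
    label = label.strip()
    return label[:1].upper() + label[1:]

for raw in ["east facing", "northEast facing", "South facing"]:
    print(raw, "->", normalise_facing(raw))
```

On a DataFrame column, a function like this could be applied with `df["Facing"].map(normalise_facing)`.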

Identifying the Null Values


In [26]: df.isnull().sum()

Out[26]:
Type of House             0
Type of BHK               0
Location                  0
Area in sq.ft             0
Construction Status       0
Facing                    0
New_or_Resale             0
PropertyPrice_in_lakhs    0
dtype: int64

In [27]: df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 480 entries, 0 to 479
Data columns (total 8 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Type of House 480 non-null object
1 Type of BHK 480 non-null object
2 Location 480 non-null object
3 Area in sq.ft 480 non-null int64
4 Construction Status 480 non-null object
5 Facing 480 non-null object
6 New_or_Resale 480 non-null object
7 PropertyPrice_in_lakhs 480 non-null object
dtypes: int64(1), object(7)
memory usage: 30.1+ KB

Replacing the Null Values with the Sum of Values in the Column
In [28]: type(df.Location.isnull().sum())

Out[28]: numpy.int64

Converting the Data Types

Categorical data to object type, numerical data to int type

In [29]: type(df['Type of House'].isnull().sum())

Out[29]: numpy.int64

In [30]: type(df['Type of BHK'].isnull().sum())

Out[30]: numpy.int64

In [31]: type(df['Area in sq.ft'].isnull().sum())

Out[31]: numpy.int64

In [32]: type(df['Construction Status'].isnull().sum())

Out[32]: numpy.int64

In [33]: type(df['Facing'].isnull().sum())

Out[33]: numpy.int64

In [34]: type(df['New_or_Resale'].isnull().sum())

Out[34]: numpy.int64

In [35]: type(df['PropertyPrice_in_lakhs'].isnull().sum())

Out[35]: numpy.int64

Data Inspection
In [36]: df=df[(df['Facing']!='is facing')]
df.reset_index(drop=True ,inplace =True)


In [38]: df

Out[38]:
     Type of House  Type of BHK  Location     Area in sq.ft  Construction Status  Facing            New_or_Resale           PropertyPrice_in_lakhs
0    Independent    3 BHK        Koramangala  2000           Ready to move        East facing       3 Bathrooms             2.75 Cr
1    Villa          0 BHK        Koramangala  6000           Ready to move        South facing      9 Bathrooms             4 Cr
2    Apartment      3 BHK        Koramangala  1945           Ready to move        East facing       3 Bathrooms             2 Cr
3    Apartment      4 BHK        Koramangala  4000           Ready to move        NorthEast facing  4 Bathrooms             10.5 Cr
4    Independent    4 BHK        Koramangala  2800           Ready to move        East facing       4 Bathrooms             2.6 Cr
...  ...            ...          ...          ...            ...                  ...               ...                     ...
422  Apartment      3 BHK        Devanahalli  1150           Under Construction   NorthEast facing  2 Bathrooms             50 L
423  Villa          4 BHK        Devanahalli  4400           Under Construction   East facing       6 Bathrooms             5.8 Cr
424  Apartment      2 BHK        Devanahalli  1050           Under Construction   East facing       2 Bathrooms             62 L
425  Villa          4 BHK        Devanahalli  2400           Under Construction   East facing       Possession by Mar 2024  1.4 Cr
426  Apartment      3 BHK        Devanahalli  1275           Ready to move        northEast facing  4 - 5 years old         1.46 Cr

427 rows × 8 columns

In [39]: df["PropertyPrice_in_lakhs"]

Out[39]:
0      2.75 Cr
1         4 Cr
2         2 Cr
3      10.5 Cr
4       2.6 Cr
        ...
422       50 L
423     5.8 Cr
424       62 L
425     1.4 Cr
426    1.46 Cr
Name: PropertyPrice_in_lakhs, Length: 427, dtype: object

In [40]: df["PropertyPrice_in_lakhs"]= df["PropertyPrice_in_lakhs"].replace({"L":"*1","Cr":"*1e2"
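
The cell above is cut off in this export, so the sketch below is not a reconstruction of it. It only illustrates the unit conversion itself: 1 crore equals 100 lakh, so `2.75 Cr` becomes 275 lakhs and `50 L` stays 50. The helper name `price_to_lakhs` is hypothetical, and it returns floats, whereas the notebook's later output shows an integer column.

```python
import re

# Multipliers that express each unit in lakhs (1 crore = 100 lakh).
UNIT_TO_LAKHS = {"L": 1.0, "Cr": 100.0}

def price_to_lakhs(text):
    """Convert a string such as '2.75 Cr' or '50 L' to a number of lakhs."""
    match = re.fullmatch(r"\s*([\d.]+)\s*(L|Cr)\s*", text)
    if match is None:
        raise ValueError(f"unrecognised price: {text!r}")
    amount, unit = match.groups()
    # Round to 2 decimals to hide binary floating-point noise.
    return round(float(amount) * UNIT_TO_LAKHS[unit], 2)

for raw in ["2.75 Cr", "4 Cr", "50 L"]:
    print(raw, "->", price_to_lakhs(raw))
```

Raising on unrecognised input, rather than returning a default, makes unexpected price formats visible during cleaning.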

In [41]: PriceRange=[]
for i in df["PropertyPrice_in_lakhs"]:
    # Half-open bins, so prices of exactly 150 or 350 lakhs
    # are not sent to "Very High" by the final else branch.
    if i<150:
        PriceRange.append("Low")
    elif i<350:
        PriceRange.append("Medium")
    elif i<500:
        PriceRange.append("High")
    else:
        PriceRange.append("Very High")
df["PriceRange"]=PriceRange
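
The thresholds of 150, 350 and 500 lakhs can also be written as a lookup over sorted bin edges. The sketch below is one possible formulation, assuming half-open bins in which a price equal to a threshold belongs to the higher band; `EDGES`, `LABELS` and `price_range` are hypothetical names.

```python
from bisect import bisect_right

# Thresholds in lakhs, taken from the cell above.
EDGES = [150, 350, 500]
LABELS = ["Low", "Medium", "High", "Very High"]

def price_range(price_in_lakhs):
    """Label a price using half-open bins [lower, upper)."""
    return LABELS[bisect_right(EDGES, price_in_lakhs)]

for price in [50, 150, 275, 350, 1050]:
    print(price, "->", price_range(price))
```

Keeping the edges and labels in two lists means a band can be added or moved without touching the branching logic.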

In [42]: df

Out[42]:
     Type of House  Type of BHK  Location     Area in sq.ft  Construction Status  Facing            New_or_Resale           PropertyPrice_in_lakhs  Pric…
0    Independent    3 BHK        Koramangala  2000           Ready to move        East facing       3 Bathrooms             275
1    Villa          0 BHK        Koramangala  6000           Ready to move        South facing      9 Bathrooms             400
2    Apartment      3 BHK        Koramangala  1945           Ready to move        East facing       3 Bathrooms             200
3    Apartment      4 BHK        Koramangala  4000           Ready to move        NorthEast facing  4 Bathrooms             1050                    V…
4    Independent    4 BHK        Koramangala  2800           Ready to move        East facing       4 Bathrooms             260
...  ...            ...          ...          ...            ...                  ...               ...                     ...
422  Apartment      3 BHK        Devanahalli  1150           Under Construction   NorthEast facing  2 Bathrooms             50
423  Villa          4 BHK        Devanahalli  4400           Under Construction   East facing       6 Bathrooms             580                     V…
424  Apartment      2 BHK        Devanahalli  1050           Under Construction   East facing       2 Bathrooms             62
425  Villa          4 BHK        Devanahalli  2400           Under Construction   East facing       Possession by Mar 2024  140
426  Apartment      3 BHK        Devanahalli  1275           Ready to move        northEast facing  4 - 5 years old         146

427 rows × 9 columns

In [43]: df.to_csv("project_cleaned_data.csv")


In [45]: df.New_or_Resale

Out[45]:
0                 3 Bathrooms
1                 9 Bathrooms
2                 3 Bathrooms
3                 4 Bathrooms
4                 4 Bathrooms
                ...
422               2 Bathrooms
423               6 Bathrooms
424               2 Bathrooms
425    Possession by Mar 2024
426           4 - 5 years old
Name: New_or_Resale, Length: 427, dtype: object

In [46]: df['New_or_Resale'] = df['New_or_Resale'].str.replace('Bathrooms', 'years ago')

In [47]: df

Out[47]:
     Type of House  Type of BHK  Location     Area in sq.ft  Construction Status  Facing            New_or_Resale           PropertyPrice_in_lakhs  Pric…
0    Independent    3 BHK        Koramangala  2000           Ready to move        East facing       3 years ago             275
1    Villa          0 BHK        Koramangala  6000           Ready to move        South facing      9 years ago             400
2    Apartment      3 BHK        Koramangala  1945           Ready to move        East facing       3 years ago             200
3    Apartment      4 BHK        Koramangala  4000           Ready to move        NorthEast facing  4 years ago             1050                    V…
4    Independent    4 BHK        Koramangala  2800           Ready to move        East facing       4 years ago             260
...  ...            ...          ...          ...            ...                  ...               ...                     ...
422  Apartment      3 BHK        Devanahalli  1150           Under Construction   NorthEast facing  2 years ago             50
423  Villa          4 BHK        Devanahalli  4400           Under Construction   East facing       6 years ago             580                     V…
424  Apartment      2 BHK        Devanahalli  1050           Under Construction   East facing       2 years ago             62
425  Villa          4 BHK        Devanahalli  2400           Under Construction   East facing       Possession by Mar 2024  140
426  Apartment      3 BHK        Devanahalli  1275           Ready to move        northEast facing  4 - 5 years old         146

427 rows × 9 columns

Summary Statistics
In [76]: # Calculate mean
mean_price = df['PropertyPrice_in_lakhs'].mean()

# Calculate median
median_price = df['PropertyPrice_in_lakhs'].median()

# Calculate mode (if it exists, can be multiple modes)


mode_price = df['PropertyPrice_in_lakhs'].mode()

# Calculate standard deviation


std_deviation_price = df['PropertyPrice_in_lakhs'].std()

# Calculate percentiles (e.g., 25th, 50th, and 75th percentiles)


percentiles = df['PropertyPrice_in_lakhs'].quantile([0.25, 0.50, 0.75])

print(f"Mean: {mean_price}")
print(f"Median: {median_price}")
print(f"Mode: {mode_price}")
print(f"Standard Deviation: {std_deviation_price}")
print(f"25th Percentile: {percentiles[0.25]}")
print(f"50th Percentile (Median): {percentiles[0.50]}")
print(f"75th Percentile: {percentiles[0.75]}")

Mean: 202.423887587822
Median: 92.0
Mode: 0 40
Name: PropertyPrice_in_lakhs, dtype: int32
Standard Deviation: 400.46512944016393
25th Percentile: 41.5
50th Percentile (Median): 92.0
75th Percentile: 246.0

In [77]: # Calculate mean


mean_area = df['Area in sq.ft'].mean()

# Calculate median
median_area = df['Area in sq.ft'].median()

# Calculate mode (if it exists, can be multiple modes)


mode_area = df['Area in sq.ft'].mode()

# Calculate standard deviation


std_deviation_area = df['Area in sq.ft'].std()

# Calculate percentiles (e.g., 25th, 50th, and 75th percentiles)


percentiles = df['Area in sq.ft'].quantile([0.25, 0.50, 0.75])

print(f"Mean Area in sq.ft: {mean_area}")


print(f"Median Area in sq.ft: {median_area}")
print(f"Mode Area in sq.ft: {mode_area}")
print(f"Standard Deviation Area in sq.ft: {std_deviation_area}")
print(f"25th Percentile Area in sq.ft: {percentiles[0.25]}")
print(f"50th Percentile (Median) Area in sq.ft: {percentiles[0.50]}")
print(f"75th Percentile Area in sq.ft: {percentiles[0.75]}")

Mean Area in sq.ft: 1993.3395784543325


Median Area in sq.ft: 1387.0
Mode Area in sq.ft: 0 1000
Name: Area in sq.ft, dtype: int64
Standard Deviation Area in sq.ft: 1624.976334794183
25th Percentile Area in sq.ft: 1100.0
50th Percentile (Median) Area in sq.ft: 1387.0
75th Percentile Area in sq.ft: 2200.0

In [78]: # Create a frequency table for the 'Location' column


location_frequency = df['Location'].value_counts().reset_index()
location_frequency.columns = ['Location', 'Frequency']

# Display the frequency table


print(location_frequency)

Location Frequency
0 Electronics City 141
1 Koramangala 68
2 Indira Nagar 56
3 HSR Layout 50
4 Ulsoor 46
5 Bellandur 40
6 Sarjapur 19
7 Devanahalli 7

Identifying Outliers for the Column Area in sq.ft


In [48]: zscore=stats.zscore(df['Area in sq.ft'])
# z-score rule: more than 3 standard deviations from the mean
outlier = (zscore > 3) | (zscore < -3)
Q1=df['Area in sq.ft'].quantile(0.25)
Q3=df['Area in sq.ft'].quantile(0.75)
IQR = Q3 - Q1
# IQR rule: the upper fence is Q3 + 1.5*IQR (not IQR + 1.5*IQR)
outliers = (df['Area in sq.ft'] < (Q1 - 1.5*IQR)) | (df['Area in sq.ft'] > (Q3 + 1.5*IQR))
plt.boxplot(df['Area in sq.ft'])
plt.show()
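
The IQR rule can be checked on a handful of numbers. The sketch below uses only the standard library and a made-up sample of areas; `iqr_fences` is a hypothetical helper. The `method="inclusive"` option interpolates linearly between data points, which should agree with the default behaviour of pandas `quantile`.

```python
from statistics import quantiles

def iqr_fences(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Hypothetical sample of areas in sq.ft, not taken from the dataset.
areas = [550, 1000, 1100, 1387, 2200, 4000, 15000]
lower, upper = iqr_fences(areas)
outliers = [a for a in areas if a < lower or a > upper]
print(lower, upper, outliers)
```

For this sample Q1 is 1050 and Q3 is 3100, so the IQR is 2050 and the fences are -2025 and 6175; only the 15000 sq.ft entry falls outside them.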

Data Visualization:
In [49]: df.columns

Out[49]: Index(['Type of House', 'Type of BHK', 'Location', 'Area in sq.ft',
       'Construction Status', 'Facing', 'New_or_Resale',
       'PropertyPrice_in_lakhs', 'PriceRange'],
      dtype='object')

Univariate Analysis

Location
In [50]: plt.figure(figsize=(10,7))
a=sns.countplot(x='Location',data=df ,order=df['Location'].value_counts().index)
plt.xticks(rotation=90)
for p in a.patches:
    a.annotate(f'{p.get_height()}', (p.get_x() + p.get_width() / 2., p.get_height()),
               ha='center', va='bottom', fontsize=8, color='black')
plt.tight_layout()
plt.show()

Type Of House
In [79]: plt.figure(figsize=(8, 5))
a=sns.countplot(x='Type of House',data=df ,order=df['Type of House'].value_counts().inde
plt.xticks(rotation=90)
for p in a.patches:
    a.annotate(f'{p.get_height()}', (p.get_x() + p.get_width() / 2., p.get_height()),
               ha='center', va='bottom', fontsize=8, color='black')
plt.tight_layout()
plt.show()

Price Range
In [53]: plt.figure(figsize=(8, 5))
a=sns.countplot(x='PriceRange',data=df ,order=df['PriceRange'].value_counts().index)
plt.xticks(rotation=90)
for p in a.patches:
    a.annotate(f'{p.get_height()}', (p.get_x() + p.get_width() / 2., p.get_height()),
               ha='center', va='bottom', fontsize=8, color='black')
plt.tight_layout()
plt.show()

Tree Map Plot For Location


In [54]: import plotly.express as px
plt.figure(figsize=(10,8))
fig = px.treemap(df, path=['Location'])
fig.update_layout(title="Treemap Plot for Location")
fig.show()


[Figure: Treemap Plot for Location; visible tiles include Electronics City, Koramangala, HSR Layout, Indira Nagar and Bellandur]

<Figure size 720x576 with 0 Axes>

Density Plot for Area in sq.ft


In [85]: sns.set_style("whitegrid")
plt.figure(figsize=(8, 6))
sns.kdeplot(data=df['Area in sq.ft'], fill=True, color="green")  # fill replaces the deprecated shade
plt.title("Density Plot for Area in sq.ft")
plt.xlabel("Area in sq.ft")
plt.ylabel("Density")
plt.show()

Density Plot for PropertyPrice_in_lakhs
In [87]: sns.set_style("whitegrid")
plt.figure(figsize=(8, 6))
sns.kdeplot(data=df['PropertyPrice_in_lakhs'], fill=True, color="black")
plt.title("Density Plot for PropertyPrice_in_lakhs")
plt.xlabel("PropertyPrice_in_lakhs")
plt.ylabel("Density")
plt.show()

Bivariate Analysis

Box Plot for Location Vs PropertyPrice_in_lakhs


In [50]: fig = plt.subplots(figsize=(10,8))
sns.boxplot(x='Location', y='PropertyPrice_in_lakhs', data=df)
plt.xticks(rotation=90)
plt.show()

Line Plot for Location Vs PropertyPrice_in_lakhs


In [51]: fig = plt.subplots(figsize=(13,5))
sns.lineplot(x='Location', y='PropertyPrice_in_lakhs', data=df)
plt.grid()

Bar Plot of Minimum and Maximum PropertyPrice_in_lakhs by Type of House (Koramangala)
In [107… plt.figure(figsize=(20,20))
df2=df[(df["Location"]=="Koramangala")]
ax=df2.groupby("Type of House")["PropertyPrice_in_lakhs"].agg([min,max]).plot(kind="bar"
plt.xticks(rotation=90)
plt.xlabel("Type of House")
plt.ylabel('Price')
for p in ax.patches:
    ax.annotate(f'{p.get_height()}', (p.get_x() + p.get_width() / 2., p.get_height()),
                ha='center', va='bottom', fontsize=6, color='black')
plt.tight_layout()
plt.show()

<Figure size 1440x1440 with 0 Axes>

Scatter Plot for PropertyPrice_in_lakhs vs Area in sq.ft
In [108… plt.figure(figsize=(10,8))
sns.scatterplot(x="PropertyPrice_in_lakhs",y="Area in sq.ft",data=df , alpha=0.7)
plt.grid()
plt.show()

Grouped Count Plot for Construction Status and Facing
In [89]: sns.set(style="whitegrid")
plt.figure(figsize=(12, 10))
sns.countplot(data=df, x="Construction Status", hue="Facing")
plt.title("Grouped Count Plot for Construction Status and Facing")
plt.xlabel("Construction Status")
plt.ylabel("Count")
plt.legend(title="Facing")
plt.show()

Sunburst Plot for Construction Status and Facing
In [57]: grouped_data = df.groupby(['Construction Status', 'Facing']).size().reset_index(name='Co

fig = px.sunburst(grouped_data, path=['Construction Status', 'Facing'], values='Count')


fig.update_layout(title="Sunburst Plot for Construction Status and Facing")
fig.show()

[Figure: Sunburst Plot for Construction Status and Facing; inner ring: Ready to move, Under Construction; outer ring: facing directions]

Stacked Bar Plot for Type of House and Type of BHK

In [60]: grouped_data = df.groupby(['Type of House', 'Type of BHK']).size().unstack(fill_value=0)

# Create a stacked bar plot


ax = grouped_data.plot(kind='bar', stacked=True, figsize=(8, 6))
plt.title("Stacked Bar Plot for Type of House and Type of BHK")
plt.xlabel("Type of House")
plt.ylabel("Count")
plt.legend(title="Type of BHK")
plt.xticks(rotation=0) # Rotate x-axis labels if needed
plt.show()

Waffle Plot for Location based on Area in sq.ft
In [61]: total_area = df["Area in sq.ft"].sum()
df["Proportion"] = df["Area in sq.ft"] / total_area

# Create a waffle plot


fig = plt.figure(figsize=(8, 6))

for i, row in df.iterrows():
    num_cells = int(row["Proportion"] * 100)
    plt.barh(row["Location"], num_cells, color='skyblue', edgecolor='black')

plt.title("Waffle Plot for Location based on Area in sq.ft")


plt.xlabel("Number of Cells (Each Cell Represents 1%)")
plt.ylabel("Location")
plt.show()
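
The loop above draws one bar per row, so bars for the same location are drawn on top of each other, and each row's share of the total area is usually well below 1%. A per-location share may be closer to what a waffle chart is meant to show. The sketch below is one way to compute it, assuming the area is first summed per location; `area_share_percent` and the sample rows are hypothetical.

```python
from collections import defaultdict

def area_share_percent(rows):
    """Sum area per location and express each total as a percentage."""
    totals = defaultdict(float)
    for location, area in rows:
        totals[location] += area
    grand_total = sum(totals.values())
    return {loc: 100 * area / grand_total for loc, area in totals.items()}

# Hypothetical (location, area in sq.ft) pairs, not taken from the dataset.
rows = [
    ("Koramangala", 2000),
    ("Koramangala", 6000),
    ("Devanahalli", 1150),
    ("Ulsoor", 850),
]
shares = area_share_percent(rows)
print(shares)
```

With pandas, the same aggregation could be written as `df.groupby("Location")["Area in sq.ft"].sum()` divided by the column total.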

Property Price Streamgraph
In [66]: fig = px.area(df, x='Location', y='PropertyPrice_in_lakhs', color='Area in sq.ft', title
fig.update_layout(xaxis_title='Location', yaxis_title='Property Price in lakhs')
fig.show()

[Figure: Property Price Streamgraph; y-axis: Property Price in lakhs]

Streamgraph Plot: Property Price vs. Area by Location
In [69]: plt.figure(figsize=(10, 6))
plt.stackplot(df['Location'], df['Area in sq.ft'], df['PropertyPrice_in_lakhs'], labels=
plt.legend(loc='upper left')
plt.title('Streamgraph Plot: Property Price vs. Area by Location')
plt.xlabel('Location')
plt.ylabel('Value')
plt.xticks(rotation=90)
plt.show()

Multivariate Data Analysis

Pair Plot: Property Price, Area, and Price Range


In [70]: sns.pairplot(df, vars=['PropertyPrice_in_lakhs', 'Area in sq.ft'], hue='PriceRange')
plt.suptitle('Pair Plot: Property Price, Area, and Price Range', y=1.02)
plt.show()

Pairplot: Pairwise Relationships
In [91]: sns.pairplot(df, hue='PriceRange', diag_kind='kde')
plt.suptitle("Pairplot: Pairwise Relationships")
plt.show()

In [96]: sns.pairplot(df, hue='Location', diag_kind='kde')


plt.suptitle("Pairplot: Pairwise Relationships")
plt.show()

Heatmap
In [97]: fig = plt.figure(figsize=(12,10))
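# Correlate numeric columns only; newer pandas versions raise an error on text columns
sns.heatmap(df.select_dtypes("number").corr(),cmap="gnuplot2",annot=True)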

Out[97]: <AxesSubplot:>
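
Each cell of the heatmap is a Pearson correlation coefficient, which is the default method of `DataFrame.corr`. As a minimal sketch of what that number measures, the function below computes it by hand for two short, made-up lists; `pearson_r` is a hypothetical helper and the values are not from the dataset.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical values: price rises in exact proportion to area.
area = [1000, 2000, 3000, 4000]
price = [50, 100, 150, 200]
print(pearson_r(area, price))
```

A value near 1 means the two columns rise together along a straight line, a value near -1 means one falls as the other rises, and a value near 0 means no linear relationship.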

Conclusion
In conclusion, our "Location-Based Flat Rate Data Analysis in Bangalore" project is a testament to the
power of data-driven insights in the real estate arena. By comprehending the intricate relationship between
location and property prices, we empower stakeholders to make well-informed decisions, negotiate
effectively, and embark on their real estate journeys with confidence.

Understanding the price based on location is not merely a matter of estimation; it is a strategic advantage
that enables buyers and investors to navigate the Bangalore real estate market with precision and clarity.
Our project illuminates this path, providing valuable guidance in a realm where every rupee counts and
every location choice matters.
