Project Title: Location-Based Flat Rate Data Analysis in Bangalore
Abstract
The real estate market is a dynamic and complex industry where property prices vary significantly based on
location. This analysis aims to explore the relationship between property types (flats, apartments, and villas)
and their prices in different locations. By analyzing historical property data, we can gain insights into how
location affects property prices and help potential buyers, sellers, and investors make informed decisions.
Introduction
In the real estate market, property prices are influenced by a multitude of factors, with location being a
primary determinant. This analysis focuses on examining the correlation between property types and their
prices across different regions using data sourced from the Makaan website.
In [9]: page=requests.get(url)
In [10]: page
Out[10]: <Response [200]>
In [11]: page.status_code
Out[11]: 200
In [12]: soup=BeautifulSoup(page.text)
# Area: text of the 'size' cell, NaN when the listing has none
for i in container:
    a=i.find('td',class_='size')
    if a:
        Area_in_sq_ft.append(a.text)
    else:
        Area_in_sq_ft.append(np.nan)
# Construction status: text of the 'val' cell
for i in container:
    a=i.find('td',class_='val')
    if a:
        Construction_Status.append(a.text)
    else:
        Construction_Status.append(np.nan)
# Facing: first "<word> facing" phrase anywhere in the listing text
for i in container:
    reg=re.findall(r"(\w+\sfacing)",i.text)
    if reg:
        Facing.append(reg[0])
    else:
        Facing.append(np.nan)
# New or resale: text of the first 'keypoint' list item
for i in container:
    a=i.find('li',class_='keypoint')
    if a:
        New_or_Resale.append(a.text)
    else:
        New_or_Resale.append(np.nan)
# House type: first matching keyword anywhere in the listing text
for i in container:
    reg=re.findall(r"(Apartment|Villa|Independent|Flat)",i.text)
    if reg:
        Type_of_House.append(reg[0])
    else:
        Type_of_House.append(np.nan)
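The regex-based loops all follow one pattern: take the first match, or record a missing value when there is none. The sketch below isolates that pattern with the standard library only. The helper name `first_match_or_nan` and the two listing texts are hypothetical stand-ins for `i.text`, not part of the original notebook.

```python
import math
import re

def first_match_or_nan(pattern, text):
    """Return the first regex match in text, or NaN when nothing matches."""
    matches = re.findall(pattern, text)
    return matches[0] if matches else math.nan

# Hypothetical listing texts standing in for the scraped containers.
listing_texts = [
    "3 BHK Apartment in Koramangala East facing Ready to move",
    "4 BHK Villa in Devanahalli Under Construction",
]

facing = [first_match_or_nan(r"(\w+\sfacing)", t) for t in listing_texts]
house_type = [
    first_match_or_nan(r"(Apartment|Villa|Independent|Flat)", t)
    for t in listing_texts
]

print(facing)
print(house_type)
```

Because every list receives exactly one entry per listing, the columns stay the same length and can be combined into a DataFrame. Note that `\w+\sfacing` also matches phrases such as "is facing", which is why the notebook filters that value out later.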
1658
1658
1658
1658
1658
1658
1658
1658
1658
In [15]: Flats_Data = {
    'Type of House' : Type_of_House,
    'Type of BHK' : Type_of_BHK,
    'Location' : Location,
    'Area in sq.ft' : Area_in_sq_ft,
    'Construction Status' : Construction_Status,
    'Facing' : Facing,
    'New_or_Resale' : New_or_Resale,
    'PropertyPrice_in_lakhs' : Property_Price,
}
df= pd.DataFrame(Flats_Data)
In [74]: df
Out[74]:
     Type of House  Type of BHK      Location  Area in sq.ft  Construction Status            Facing  New_or_Resale  PropertyPrice_in_lakhs  Pr
383      Apartment        1 BHK        Ulsoor            618        Ready to move       East facing    1 years ago                      40
44     Independent        5 BHK   Koramangala           9000        Ready to move  NorthEast facing    6 years ago                    1500
128    Independent        9 BHK  Indira Nagar           9600        Ready to move       East facing    9 years ago                     850
59     Independent        0 BHK   Koramangala          10000        Ready to move  NorthEast facing    9 years ago                    1000
60     Independent        0 BHK   Koramangala          15000        Ready to move  NorthEast facing    9 years ago                    7000
In [17]: len(df)
Out[17]: 480
In [20]: df=pd.read_csv("REALESTATE_PROJECT.csv")
Data Cleaning
In [21]: df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 480 entries, 0 to 479
Data columns (total 8 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Type of House 480 non-null object
1 Type of BHK 480 non-null object
2 Location 480 non-null object
3 Area in sq.ft 480 non-null int64
4 Construction Status 480 non-null object
5 Facing 480 non-null object
6 New_or_Resale 480 non-null object
7 PropertyPrice_in_lakhs 480 non-null object
dtypes: int64(1), object(7)
memory usage: 30.1+ KB
In [22]: names_of_flats = [
'Sai Towers', 'Bhrundavan Gardens', 'Hanuman Enclaves', 'Prestige Park Grove',
'Prestige Eden Park', 'Prestige Meridian Park', 'Prestige Avalon Park',
'Prestige Aston Park', 'The Prestige City', 'Mahindra Eden',
'Brigade Komarla Heights', 'Prestige Green Gables'
]
In [ ]:
Out[26]: Type of House             0
         Type of BHK               0
         Location                  0
         Area in sq.ft             0
         Construction Status       0
         Facing                    0
         New_or_Resale             0
         PropertyPrice_in_lakhs    0
         dtype: int64
In [27]: df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 480 entries, 0 to 479
Data columns (total 8 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Type of House 480 non-null object
1 Type of BHK 480 non-null object
2 Location 480 non-null object
3 Area in sq.ft 480 non-null int64
4 Construction Status 480 non-null object
5 Facing 480 non-null object
6 New_or_Resale 480 non-null object
7 PropertyPrice_in_lakhs 480 non-null object
dtypes: int64(1), object(7)
memory usage: 30.1+ KB
Out[28]: numpy.int64
Out[29]: numpy.int64
Out[30]: numpy.int64
Out[31]: numpy.int64
Out[32]: numpy.int64
In [33]: type(df['Facing'].isnull().sum())
Out[33]: numpy.int64
In [34]: type(df['New_or_Resale'].isnull().sum())
Out[34]: numpy.int64
In [35]: type(df['PropertyPrice_in_lakhs'].isnull().sum())
Out[35]: numpy.int64
Data Inspection
In [36]: df=df[(df['Facing']!='is facing')]
df.reset_index(drop=True ,inplace =True)
In [ ]:
In [38]: df
Out[38]:
     Type of House  Type of BHK     Location  Area in sq.ft  Construction Status            Facing    New_or_Resale  PropertyPrice_in_lakhs
0      Independent        3 BHK  Koramangala           2000        Ready to move       East facing      3 Bathrooms                 2.75 Cr
1            Villa        0 BHK  Koramangala           6000        Ready to move      South facing      9 Bathrooms                    4 Cr
2        Apartment        3 BHK  Koramangala           1945        Ready to move       East facing      3 Bathrooms                    2 Cr
3        Apartment        4 BHK  Koramangala           4000        Ready to move  NorthEast facing      4 Bathrooms                 10.5 Cr
4      Independent        4 BHK  Koramangala           2800        Ready to move       East facing      4 Bathrooms                  2.6 Cr
...            ...          ...          ...            ...                  ...               ...              ...                     ...
422      Apartment        3 BHK  Devanahalli           1150   Under Construction  NorthEast facing      2 Bathrooms                    50 L
423          Villa        4 BHK  Devanahalli           4400   Under Construction       East facing      6 Bathrooms                  5.8 Cr
424      Apartment        2 BHK  Devanahalli           1050   Under Construction       East facing      2 Bathrooms                    62 L
426      Apartment        3 BHK  Devanahalli           1275        Ready to move  northEast facing  4 - 5 years old                 1.46 Cr
In [39]: df["PropertyPrice_in_lakhs"]
Out[39]: 0      2.75 Cr
         1         4 Cr
         2         2 Cr
         3      10.5 Cr
         4       2.6 Cr
                 ...
         422       50 L
         423     5.8 Cr
         424       62 L
         425     1.4 Cr
         426    1.46 Cr
         Name: PropertyPrice_in_lakhs, Length: 427, dtype: object
In [40]: df["PropertyPrice_in_lakhs"]= df["PropertyPrice_in_lakhs"].replace({"L":"*1","Cr":"*1e2"
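The cell above is cut off in this export, so its exact method is not recoverable. As a rough illustration of the same conversion, the sketch below turns price strings into lakhs using the fact that 1 crore equals 100 lakhs. The function name `price_to_lakhs` is hypothetical, and the sample strings are copied from the price column shown earlier.

```python
def price_to_lakhs(price_text):
    """Convert a price string such as '2.75 Cr' or '50 L' into lakhs."""
    # 1 crore = 100 lakhs, so "Cr" scales by 100 and "L" by 1.
    multipliers = {"L": 1, "Cr": 100}
    amount, unit = price_text.split()
    # Round to two decimals to hide floating-point noise.
    return round(float(amount) * multipliers[unit], 2)

sample_prices = ["2.75 Cr", "4 Cr", "10.5 Cr", "50 L", "1.46 Cr"]
converted = [price_to_lakhs(p) for p in sample_prices]
print(converted)
```

An unknown unit raises a `KeyError` here instead of passing through silently, which makes malformed price strings easy to spot during cleaning.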
In [41]: PriceRange=[]
# Bucket each price (in lakhs). Chained upper bounds leave no gaps, so a
# price of exactly 150 or 350 no longer falls through to "Very High".
for i in df["PropertyPrice_in_lakhs"]:
    if i<150:
        PriceRange.append("Low")
    elif i<350:
        PriceRange.append("Medium")
    elif i<500:
        PriceRange.append("High")
    else:
        PriceRange.append("Very High")
df["PriceRange"]=PriceRange
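The same bucketing can be written as a lookup against a sorted list of limits. This is a minimal sketch using the standard-library `bisect` module; it assumes that a price sitting exactly on a limit (150, 350 or 500 lakhs) belongs to the higher bucket. `BOUNDS`, `LABELS` and `price_range` are illustrative names.

```python
from bisect import bisect_right

BOUNDS = [150, 350, 500]  # bucket limits in lakhs, as used in the loop
LABELS = ["Low", "Medium", "High", "Very High"]

def price_range(price_in_lakhs):
    """Map a price in lakhs to its bucket label."""
    # bisect_right counts how many limits are <= the price,
    # which is exactly the index of the matching label.
    return LABELS[bisect_right(BOUNDS, price_in_lakhs)]

for price in (50, 150, 275, 400, 1050):
    print(price, price_range(price))
```

Keeping the limits in one list makes it easy to add or move a bucket without touching the branching logic.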
In [42]: df
Out[42]:
     Type of House  Type of BHK     Location  Area in sq.ft  Construction Status            Facing    New_or_Resale  PropertyPrice_in_lakhs  Pric
0      Independent        3 BHK  Koramangala           2000        Ready to move       East facing      3 Bathrooms                     275
1            Villa        0 BHK  Koramangala           6000        Ready to move      South facing      9 Bathrooms                     400
2        Apartment        3 BHK  Koramangala           1945        Ready to move       East facing      3 Bathrooms                     200
3        Apartment        4 BHK  Koramangala           4000        Ready to move  NorthEast facing      4 Bathrooms                    1050  V
4      Independent        4 BHK  Koramangala           2800        Ready to move       East facing      4 Bathrooms                     260
...            ...          ...          ...            ...                  ...               ...              ...                     ...
422      Apartment        3 BHK  Devanahalli           1150   Under Construction  NorthEast facing      2 Bathrooms                      50
423          Villa        4 BHK  Devanahalli           4400   Under Construction       East facing      6 Bathrooms                     580  V
424      Apartment        2 BHK  Devanahalli           1050   Under Construction       East facing      2 Bathrooms                      62
426      Apartment        3 BHK  Devanahalli           1275        Ready to move  northEast facing  4 - 5 years old                     146
In [43]: df.to_csv("project_cleaned_data.csv")
In [44]: df
Out[44]:
     Type of House  Type of BHK     Location  Area in sq.ft  Construction Status            Facing    New_or_Resale  PropertyPrice_in_lakhs  Pric
0      Independent        3 BHK  Koramangala           2000        Ready to move       East facing      3 Bathrooms                     275
1            Villa        0 BHK  Koramangala           6000        Ready to move      South facing      9 Bathrooms                     400
2        Apartment        3 BHK  Koramangala           1945        Ready to move       East facing      3 Bathrooms                     200
3        Apartment        4 BHK  Koramangala           4000        Ready to move  NorthEast facing      4 Bathrooms                    1050  V
4      Independent        4 BHK  Koramangala           2800        Ready to move       East facing      4 Bathrooms                     260
...            ...          ...          ...            ...                  ...               ...              ...                     ...
422      Apartment        3 BHK  Devanahalli           1150   Under Construction  NorthEast facing      2 Bathrooms                      50
423          Villa        4 BHK  Devanahalli           4400   Under Construction       East facing      6 Bathrooms                     580  V
424      Apartment        2 BHK  Devanahalli           1050   Under Construction       East facing      2 Bathrooms                      62
426      Apartment        3 BHK  Devanahalli           1275        Ready to move  northEast facing  4 - 5 years old                     146
In [45]: df.New_or_Resale
Out[45]: 0                 3 Bathrooms
         1                 9 Bathrooms
         2                 3 Bathrooms
         3                 4 Bathrooms
         4                 4 Bathrooms
                         ...
         422               2 Bathrooms
         423               6 Bathrooms
         424               2 Bathrooms
         425    Possession by Mar 2024
         426           4 - 5 years old
         Name: New_or_Resale, Length: 427, dtype: object
In [47]: df
Out[47]:
     Type of House  Type of BHK     Location  Area in sq.ft  Construction Status            Facing    New_or_Resale  PropertyPrice_in_lakhs  Pric
0      Independent        3 BHK  Koramangala           2000        Ready to move       East facing      3 years ago                     275
1            Villa        0 BHK  Koramangala           6000        Ready to move      South facing      9 years ago                     400
2        Apartment        3 BHK  Koramangala           1945        Ready to move       East facing      3 years ago                     200
3        Apartment        4 BHK  Koramangala           4000        Ready to move  NorthEast facing      4 years ago                    1050  V
4      Independent        4 BHK  Koramangala           2800        Ready to move       East facing      4 years ago                     260
...            ...          ...          ...            ...                  ...               ...              ...                     ...
422      Apartment        3 BHK  Devanahalli           1150   Under Construction  NorthEast facing      2 years ago                      50
423          Villa        4 BHK  Devanahalli           4400   Under Construction       East facing      6 years ago                     580  V
424      Apartment        2 BHK  Devanahalli           1050   Under Construction       East facing      2 years ago                      62
426      Apartment        3 BHK  Devanahalli           1275        Ready to move  northEast facing  4 - 5 years old                     146
Summary Statistics
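In [76]: # Calculate mean
mean_price = df['PropertyPrice_in_lakhs'].mean()
# Calculate median
median_price = df['PropertyPrice_in_lakhs'].median()
# Calculate mode (returned as a Series, since several values can tie)
mode_price = df['PropertyPrice_in_lakhs'].mode()
# Calculate standard deviation
std_deviation_price = df['PropertyPrice_in_lakhs'].std()
# Calculate the 25th, 50th and 75th percentiles
percentiles = df['PropertyPrice_in_lakhs'].quantile([0.25, 0.50, 0.75])
print(f"Mean: {mean_price}")
print(f"Median: {median_price}")
print(f"Mode: {mode_price}")
print(f"Standard Deviation: {std_deviation_price}")
print(f"25th Percentile: {percentiles[0.25]}")
print(f"50th Percentile (Median): {percentiles[0.50]}")
print(f"75th Percentile: {percentiles[0.75]}")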
Mean: 202.423887587822
Median: 92.0
Mode: 0 40
Name: PropertyPrice_in_lakhs, dtype: int32
Standard Deviation: 400.46512944016393
25th Percentile: 41.5
50th Percentile (Median): 92.0
75th Percentile: 246.0
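The mean (about 202.4 lakhs) is more than twice the median (92.0 lakhs), which suggests a right-skewed price distribution in which a few expensive listings pull the average up. The sketch below reproduces that effect on a small scale with the standard-library `statistics` module, using only the nine prices visible in the displayed rows, so its numbers are not the full-dataset statistics.

```python
import statistics

# Prices in lakhs copied from the nine rows displayed in the notebook output.
prices = [275, 400, 200, 1050, 260, 50, 580, 62, 146]

mean_price = statistics.mean(prices)
median_price = statistics.median(prices)
std_dev = statistics.stdev(prices)  # sample standard deviation
# method="inclusive" interpolates between data points, like pandas' default.
q1, q2, q3 = statistics.quantiles(prices, n=4, method="inclusive")

print(f"Mean: {mean_price:.2f}")
print(f"Median: {median_price}")
print(f"Standard Deviation: {std_dev:.2f}")
print(f"Quartiles: {q1}, {q2}, {q3}")
```

Even in this small sample the single 1050-lakh listing lifts the mean well above the median, which is why the median is usually the safer summary of a typical price.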
# Calculate median
median_area = df['Area in sq.ft'].median()
Location Frequency
0 Electronics City 141
1 Koramangala 68
2 Indira Nagar 56
3 HSR Layout 50
4 Ulsoor 46
5 Bellandur 40
6 Sarjapur 19
7 Devanahalli 7
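To read the frequency table as shares, each count can be divided by the total number of listings. The sketch below does this with the counts copied from the table; the variable names are illustrative.

```python
# Listing counts per location, copied from the frequency table.
location_counts = {
    "Electronics City": 141,
    "Koramangala": 68,
    "Indira Nagar": 56,
    "HSR Layout": 50,
    "Ulsoor": 46,
    "Bellandur": 40,
    "Sarjapur": 19,
    "Devanahalli": 7,
}

total = sum(location_counts.values())
# Percentage of all listings that each location contributes.
shares = {loc: round(100 * n / total, 1) for loc, n in location_counts.items()}

print(f"Total listings: {total}")
for loc, pct in shares.items():
    print(f"{loc}: {pct}%")
```

The counts add up to 427, the length of the cleaned DataFrame. Electronics City alone contributes roughly a third of the listings, so location-level comparisons for Sarjapur or Devanahalli rest on far fewer observations.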
Data Visualization:
In [49]: df.columns
Univariate Analysis
Location
In [50]: plt.figure(figsize=(10,7))
a=sns.countplot(x='Location',data=df ,order=df['Location'].value_counts().index)
plt.xticks(rotation=90)
# Write each bar's count just above the bar
for p in a.patches:
    a.annotate(f'{p.get_height()}', (p.get_x() + p.get_width() / 2., p.get_height()),
               ha='center', va='bottom', fontsize=8, color='black')
plt.tight_layout()
plt.show()
Type Of House
In [79]: plt.figure(figsize=(8, 5))
a=sns.countplot(x='Type of House',data=df ,order=df['Type of House'].value_counts().inde
plt.xticks(rotation=90)
# Write each bar's count just above the bar
for p in a.patches:
    a.annotate(f'{p.get_height()}', (p.get_x() + p.get_width() / 2., p.get_height()),
               ha='center', va='bottom', fontsize=8, color='black')
plt.tight_layout()
plt.show()
Price Range
In [53]: plt.figure(figsize=(8, 5))
a=sns.countplot(x='PriceRange',data=df ,order=df['PriceRange'].value_counts().index)
plt.xticks(rotation=90)
# Write each bar's count just above the bar
for p in a.patches:
    a.annotate(f'{p.get_height()}', (p.get_x() + p.get_width() / 2., p.get_height()),
               ha='center', va='bottom', fontsize=8, color='black')
plt.tight_layout()
plt.show()
Density Plot for Area in sq.ft
In [87]: sns.set_style("whitegrid")
plt.figure(figsize=(8, 6))
# Plot the area column so the data matches the title and axis label;
# fill=True replaces the deprecated shade=True
sns.kdeplot(data=df['Area in sq.ft'], fill=True, color="Black")
plt.title("Density Plot for Area in sq.ft")
plt.xlabel("Area in sq.ft")
plt.ylabel("Density")
plt.show()
Bivariate Analysis
Count Plot for Type of House Vs PropertyPrice_in_lakhs
In [107… plt.figure(figsize=(20,20))
df2=df[(df["Location"]=="Koramangala")]
ax=df2.groupby("Type of House")["PropertyPrice_in_lakhs"].agg([min,max]).plot(kind="bar"
plt.xticks(rotation=90)
plt.xlabel("Type of House")
plt.ylabel('Price')
# Write each bar's value just above the bar
for p in ax.patches:
    ax.annotate(f'{p.get_height()}', (p.get_x() + p.get_width() / 2., p.get_height()),
                ha='center', va='bottom', fontsize=6, color='black')
plt.tight_layout()
plt.show()
Sunburst Plot for Construction Status and Facing
In [57]: grouped_data = df.groupby(['Construction Status', 'Facing']).size().reset_index(name='Co
[Figure: Sunburst Plot for Construction Status and Facing]
Stacked Bar Plot for Type of House and Type of BHK
Waffle Plot for Location based on Area in sq.ft
In [61]: total_area = df["Area in sq.ft"].sum()
df["Proportion"] = df["Area in sq.ft"] / total_area
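The cell above computes a proportion per listing. A waffle chart by location needs the area summed per location and then turned into whole tiles. The sketch below shows one way to do that with the standard library, assuming a grid of 100 tiles and using only the area values visible in the displayed rows, so the result is illustrative and not the full-dataset split.

```python
from collections import defaultdict
from math import floor

# (Location, Area in sq.ft) pairs copied from the rows displayed in the notebook.
rows = [
    ("Koramangala", 2000), ("Koramangala", 6000), ("Koramangala", 1945),
    ("Koramangala", 4000), ("Koramangala", 2800),
    ("Devanahalli", 1150), ("Devanahalli", 4400), ("Devanahalli", 1050),
    ("Devanahalli", 1275),
]

# Sum the area per location.
area_by_location = defaultdict(int)
for location, area in rows:
    area_by_location[location] += area

total_area = sum(area_by_location.values())
TILES = 100  # assumed waffle grid of 10 x 10 tiles

# Largest-remainder rounding so the tile counts add up to exactly TILES.
exact = {loc: TILES * a / total_area for loc, a in area_by_location.items()}
tiles = {loc: floor(x) for loc, x in exact.items()}
leftover = TILES - sum(tiles.values())
by_remainder = sorted(exact, key=lambda loc: exact[loc] - tiles[loc], reverse=True)
for loc in by_remainder[:leftover]:
    tiles[loc] += 1

print(dict(area_by_location))
print(tiles)
```

Plain rounding of each share can leave the grid one tile short or one tile over; handing the leftover tiles to the largest remainders avoids that.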
Property Price Streamgraph
In [66]: fig = px.area(df, x='Location', y='PropertyPrice_in_lakhs', color='Area in sq.ft', title
fig.update_layout(xaxis_title='Location', yaxis_title='Property Price in lakhs')
fig.show()
[Figure: Property Price Streamgraph. x-axis: Location; y-axis: Property Price in lakhs]
Multivariate Data Analysis
Pairplot: Pairwise Relationships
In [91]: sns.pairplot(df, hue='PriceRange', diag_kind='kde')
plt.suptitle("Pairplot: Pairwise Relationships")
plt.show()
Heatmap
In [97]: fig = plt.figure(figsize=(12,10))
# Correlate the numeric columns only; text columns have no correlation
sns.heatmap(df.select_dtypes(include="number").corr(),cmap="gnuplot2",annot=True)
Out[97]: <AxesSubplot:>
Conclusion
In conclusion, our "Location-Based Flat Rate Data Analysis in Bangalore" project shows how strongly
property prices in the collected listings depend on location. Understanding the relationship between
location and price helps buyers, sellers, and investors make well-informed decisions and negotiate
effectively.
Knowing how prices vary by location is more than a matter of estimation: it lets buyers and investors
compare areas of the Bangalore real estate market on the basis of data, in a market where every rupee
counts and every location choice matters.