Assignment 2.
ipynb - Colaboratory
import numpy as np
import pandas as pd
import scipy  # explicit import; a wildcard import would pull unknown names into the namespace
df = pd.read_csv("/content/AB_NYC_2019.csv")
df
https://colab.research.google.com/drive/1ZkFDzqf7m4NMzlLg0_cB3_z1JNQ2YN93#printMode=true 1/9
DSVSML assignment 2.ipynb - Colaboratory
       id        name                                               host_id   host_name      neighbourhood_group  neighbourhood       ...
0      2539      Clean & quiet apt home by the park                 2787      John           Brooklyn             Kensington          ...
1      2595      Skylit Midtown Castle                              2845      Jennifer       Manhattan            Midtown             ...
2      3647      THE VILLAGE OF HARLEM....NEW YORK !                4632      Elisabeth      Manhattan            Harlem              ...
3      3831      Cozy Entire Floor of Brownstone                    4869      LisaRoxanne    Brooklyn             Clinton Hill        ...
4      5022      Entire Apt: Spacious Studio/Loft by central park   7192      Laura          Manhattan            East Harlem         ...
5      5099      Large Cozy 1 BR Apartment In Midtown East          7322      Chris          Manhattan            Murray Hill         ...
6      5121      BlissArtsSpace!                                    7356      Garon          Brooklyn             Bedford-Stuyvesant  ...
7      5178      Large Furnished Room Near B'way                    8967      Shunichi       Manhattan            Hell's Kitchen      ...
8      5203      Cozy Clean Guest Room - Family Apt                 7490      MaryEllen      Manhattan            Upper West Side     ...
9      5238      Cute & Cozy Lower East Side 1 bdrm                 7549      Ben            Manhattan            Chinatown           ...
...    ...       ...                                                ...       ...            ...                  ...                 ...
48890  36484665  Charming one bedroom - newly renovated rowhouse    8232441   Sabrina        Brooklyn             B… Stuy…            ...
48891  36485057  Affordable room in Bushwick/East Williamsburg      6570630   Marisol        Brooklyn             Bu…                 ...
48892  36485431  Sunny Studio at Historical Neighborhood            23492952  Ilgar & Aysel  Manhattan            …                   ...
48893  36485609  43rd St. Time Square-cozy single bed               30985759  Taz            Manhattan            Hell's K…           ...
48894  36487245  Trendy duplex in the very heart of Hell's Kitchen  68119814  Christophe     Manhattan            Hell's K…           ...

48895 rows × 16 columns
df.isna().sum()  # retrieve the total number of null values in each column
id 0
name 16
host_id 0
host_name 21
neighbourhood_group 0
neighbourhood 0
latitude 0
longitude 0
room_type 0
price 0
minimum_nights 0
number_of_reviews 0
last_review 10052
reviews_per_month 10052
calculated_host_listings_count 0
availability_365 0
dtype: int64
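The counts above show `last_review` and `reviews_per_month` missing in exactly the same number of rows (10052). One plausible explanation, not verified anywhere in this notebook, is that those listings simply have no reviews. The following is a minimal sketch of how that could be checked and how `reviews_per_month` could be filled with 0 instead of dropping the rows. The small `sample` frame and the helper `fill_review_gaps` are invented for illustration and are not part of the original assignment.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the listings table (all values invented).
sample = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "number_of_reviews": [9, 0, 45, 0],
    "last_review": ["2018-10-19", np.nan, "2019-05-21", np.nan],
    "reviews_per_month": [0.21, np.nan, 0.38, np.nan],
})


def fill_review_gaps(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a copy where reviews_per_month is 0 for listings with no reviews."""
    out = frame.copy()
    # Only fill rows whose missing value is explained by a zero review count.
    mask = (out["number_of_reviews"] == 0) & out["reviews_per_month"].isna()
    out.loc[mask, "reviews_per_month"] = 0.0
    return out


# Check that every missing reviews_per_month coincides with zero reviews.
missing = sample["reviews_per_month"].isna()
all_explained = bool((sample.loc[missing, "number_of_reviews"] == 0).all())

filled = fill_review_gaps(sample)
print(all_explained)
print(filled["reviews_per_month"].isna().sum())
```

On the real data the same check would be run on `df`; if `all_explained` came out false, filling with 0 would not be justified for the unexplained rows.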
import seaborn as sns
sns.heatmap(df.isnull(), cbar=False)
<matplotlib.axes._subplots.AxesSubplot at 0x7fe41c763750>
df.shape
(48895, 16)
df['host_id'].max()
274321313
top_host_id = df['host_id'].value_counts().head(10)
top_host_id
219517861 327
107434423 232
30283594 121
137358866 103
12243051 96
16098958 96
61391963 91
22541573 87
200380610 65
7503643 52
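The ranking above identifies hosts only by numeric ID. As a sketch, the same count could be produced with the host name attached by grouping on both columns. The `listings` frame and the helper `top_hosts` below are hypothetical stand-ins; on the real data the call would be made on `df`.

```python
import pandas as pd

# Hypothetical sample; the notebook would use df loaded from AB_NYC_2019.csv.
listings = pd.DataFrame({
    "host_id": [10, 10, 10, 20, 20, 30],
    "host_name": ["Ann", "Ann", "Ann", "Bo", "Bo", "Cy"],
})


def top_hosts(frame: pd.DataFrame, n: int = 10) -> pd.DataFrame:
    """Count listings per (host_id, host_name) and keep the n largest.

    Note: groupby drops rows whose host_name is null by default, so hosts
    with a missing name would not appear in the result.
    """
    return (
        frame.groupby(["host_id", "host_name"])
        .size()
        .reset_index(name="listings")
        .sort_values("listings", ascending=False)
        .head(n)
        .reset_index(drop=True)
    )


top2 = top_hosts(listings, n=2)
print(top2)
```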
sns.set(rc={'figure.figsize':(10,8)})
viz_bar = top_host_id.plot(kind='bar')
viz_bar.set_title('Hosts with the most listings')
viz_bar.set_xlabel('Host IDs')
viz_bar.set_ylabel('Count of listings')
viz_bar.set_xticklabels(viz_bar.get_xticklabels(), rotation=45)
[Text(0, 0, '219517861'),
Text(0, 0, '107434423'),
Text(0, 0, '30283594'),
Text(0, 0, '137358866'),
Text(0, 0, '12243051'),
Text(0, 0, '16098958'),
Text(0, 0, '61391963'),
Text(0, 0, '22541573'),
Text(0, 0, '200380610'),
Text(0, 0, '7503643')]
sns.set_style('dark')
sns.boxplot(x='host_id',y='price',data=df,whis=0.5)
<matplotlib.axes._subplots.AxesSubplot at 0x7fdc04915690>
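The box plot above draws one box per distinct `host_id`, which for a table of 48895 listings can mean a very large number of boxes on one axis. A possible alternative, sketched here under the assumption that only the busiest hosts are of interest, is to restrict the data to the hosts with the most listings before plotting. The `listings` frame and the helper `restrict_to_top_hosts` are hypothetical.

```python
import pandas as pd

# Hypothetical sample rows (invented values).
listings = pd.DataFrame({
    "host_id": [10, 10, 10, 20, 20, 30, 40],
    "price": [100, 150, 200, 80, 120, 500, 60],
})


def restrict_to_top_hosts(frame: pd.DataFrame, n: int) -> pd.DataFrame:
    """Keep only rows belonging to the n hosts with the most listings."""
    top_ids = frame["host_id"].value_counts().head(n).index
    return frame[frame["host_id"].isin(top_ids)]


subset = restrict_to_top_hosts(listings, n=2)

# Per-host quartiles: the numbers a box plot draws as the box and median line.
quartiles = (
    subset.groupby("host_id")["price"].quantile([0.25, 0.5, 0.75]).unstack()
)
print(quartiles)

# The subset could then be plotted as in the notebook, for example:
# sns.boxplot(x='host_id', y='price', data=subset)
```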
df1 = df.dropna()  # discard every row that contains a null in any column
df1
       id        name                                               host_id    host_name      neighbourhood_group  neighbourhood
0      2539      Clean & quiet apt home by the park                 2787       John           Brooklyn             Kens…
1      2595      Skylit Midtown Castle                              2845       Jennifer       Manhattan            Mi…
3      3831      Cozy Entire Floor of Brownstone                    4869       LisaRoxanne    Brooklyn             Clint…
4      5022      Entire Apt: Spacious Studio/Loft by central park   7192       Laura          Manhattan            East H…
5      5099      Large Cozy 1 BR Apartment In Midtown East          7322       Chris          Manhattan            Murr…
...    ...       ...                                                ...        ...            ...                  ...
48790  36427429  No.2 with queen size bed                           257683179  H Ai           Queens               Flu…
48799  36438336  Seas The Moment                                    211644523  Ben            Staten Island        Grea…
48805  36442252  1B-1B apartment near by Metro                      273841667  Blaine         Bronx                Mott H…
…      …         Cozy Private … Bushwick, Brooklyn                  …          …              …                    …

38821 rows × 16 columns
(df.loc[df['host_id'] == 7989])
    id    name                             host_id  host_name  neighbourhood_group  neighbourhood   l…
11  5441  Central Manhattan/near Broadway  7989     Kate       Manhattan            Hell's Kitchen  4…
df['room_type'].value_counts()
Entire home/apt 25409
…
Shared room 1160
Name: room_type, dtype: int64
maxi = df.groupby('room_type').price.max()
maxi
room_type
<matplotlib.axes._subplots.AxesSubplot at 0x7fe400e2d6d0>
sns.lineplot(x='neighbourhood',y='availability_365',data=df)
<matplotlib.axes._subplots.AxesSubplot at 0x7fe400aa2690>
maxa = df.groupby('neighbourhood').price.max()
maxa
neighbourhood
Allerton 450
Arden Heights 83
Arrochar 625
Arverne 1500
Astoria 10000
...
Woodhaven 250
Woodlawn 85
Woodrow 700
Woodside 500
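A per-neighbourhood maximum is driven entirely by the single most expensive listing, as the 10000 for Astoria in the output above suggests. As a hedged sketch, the maximum could be reported next to the median and the listing count so that outliers are easier to spot. The `listings` frame below is invented for illustration.

```python
import pandas as pd

# Hypothetical sample: neighbourhood "A" contains one extreme price.
listings = pd.DataFrame({
    "neighbourhood": ["A", "A", "A", "B", "B", "B"],
    "price": [100, 120, 10000, 90, 110, 130],
})

# Count, median and maximum price per neighbourhood, ordered by median.
summary = (
    listings.groupby("neighbourhood")["price"]
    .agg(["count", "median", "max"])
    .sort_values("median", ascending=False)
)
print(summary)
```

In this toy data the two medians are close (120 and 110) while the maxima differ by two orders of magnitude, which is the pattern the median is meant to expose.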
sns.lineplot(x='neighbourhood',y='price',data=df)
<matplotlib.axes._subplots.AxesSubplot at 0x7fe3feb4ead0>
df.groupby('neighbourhood').availability_365.sum()
neighbourhood
Allerton 6874
Arrochar 5372
Arverne 14509
Astoria 98272
...
Woodhaven 17681
Woodlawn 1081
Woodrow 0
Woodside 30601
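Summing `availability_365` favours neighbourhoods that simply have more listings, so a large total does not by itself mean that individual listings are available more often. A minimal sketch, assuming the per-listing average is the quantity of interest, reports the total, the listing count and the mean side by side. The `listings` frame is hypothetical.

```python
import pandas as pd

# Hypothetical sample: "A" has many listings, "B" has one highly available listing.
listings = pd.DataFrame({
    "neighbourhood": ["A", "A", "A", "A", "B"],
    "availability_365": [100, 100, 100, 100, 300],
})

# Named aggregation: one output column per statistic.
availability = listings.groupby("neighbourhood")["availability_365"].agg(
    total="sum",
    listings="count",
    mean_per_listing="mean",
)
print(availability)
```

Here "A" has the larger total (400 against 300) but the smaller mean per listing (100 against 300).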