
7/30/22, 10:51 PM Assignment 1_NYC 311 service request

In [1]:

import numpy as np

import pandas as pd

import matplotlib.pyplot as plt

import seaborn as sns

Matplotlib is building the font cache; this may take a moment.

1: Importing the dataset (NYC 311 service requests)

In [2]:

dataset=pd.read_csv("311_Service_Requests_from_2010_to_Present.csv")

pd.options.display.max_columns = None

dataset.head()

Out[2]:

   Unique Key  Created Date            Closed Date    Agency  Agency Name                      Complaint Type           Descriptor                    Location Type    ...
0  32310363    12/31/2015 11:59:45 PM  01-01-16 0:55  NYPD    New York City Police Department  Noise - Street/Sidewalk  Loud Music/Party              Street/Sidewalk  ...
1  32309934    12/31/2015 11:59:44 PM  01-01-16 1:26  NYPD    New York City Police Department  Blocked Driveway         No Access                     Street/Sidewalk  ...
2  32309159    12/31/2015 11:59:29 PM  01-01-16 4:51  NYPD    New York City Police Department  Blocked Driveway         No Access                     Street/Sidewalk  ...
3  32305098    12/31/2015 11:57:46 PM  01-01-16 7:43  NYPD    New York City Police Department  Illegal Parking          Commercial Overnight Parking  Street/Sidewalk  ...
4  32306529    12/31/2015 11:56:58 PM  01-01-16 3:24  NYPD    New York City Police Department  Illegal Parking          Blocked Sidewalk              Street/Sidewalk  ...

https://lms.simplilearn.com/courses/2772/Data-Science-with-Python/practice-labs 1/39

In [3]:

dataset.info()


<class 'pandas.core.frame.DataFrame'>

RangeIndex: 4975 entries, 0 to 4974

Data columns (total 53 columns):

# Column Non-Null Count Dtype

--- ------ -------------- -----

0 Unique Key 4975 non-null int64

1 Created Date 4975 non-null object

2 Closed Date 4953 non-null object

3 Agency 4974 non-null object

4 Agency Name 4974 non-null object

5 Complaint Type 4974 non-null object

6 Descriptor 4887 non-null object

7 Location Type 4974 non-null object

8 Incident Zip 4945 non-null float64

9 Incident Address 4440 non-null object

10 Street Name 4440 non-null object

11 Cross Street 1 4383 non-null object

12 Cross Street 2 4374 non-null object

13 Intersection Street 1 532 non-null object

14 Intersection Street 2 526 non-null object

15 Address Type 4944 non-null object

16 City 4945 non-null object

17 Landmark 3 non-null object

18 Facility Type 4954 non-null object

19 Status 4974 non-null object

20 Due Date 4974 non-null object

21 Resolution Description 4974 non-null object

22 Resolution Action Updated Date 4954 non-null object

23 Community Board 4974 non-null object

24 Borough 4974 non-null object

25 X Coordinate (State Plane) 4935 non-null float64

26 Y Coordinate (State Plane) 4935 non-null float64

27 Park Facility Name 4974 non-null object

28 Park Borough 4974 non-null object

29 School Name 4974 non-null object

30 School Number 4974 non-null object

31 School Region 4974 non-null object

32 School Code 4974 non-null object

33 School Phone Number 4974 non-null object

34 School Address 4974 non-null object

35 School City 4974 non-null object

36 School State 4974 non-null object

37 School Zip 4974 non-null object

38 School Not Found 4974 non-null object

39 School or Citywide Complaint 0 non-null float64

40 Vehicle Type 0 non-null float64

41 Taxi Company Borough 0 non-null float64

42 Taxi Pick Up Location 0 non-null float64

43 Bridge Highway Name 1 non-null object

44 Bridge Highway Direction 1 non-null object

45 Road Ramp 1 non-null object

46 Bridge Highway Segment 1 non-null object

47 Garage Lot Name 0 non-null float64

48 Ferry Direction 0 non-null float64

49 Ferry Terminal Name 0 non-null float64

50 Latitude 4935 non-null float64

51 Longitude 4935 non-null float64

52 Location 4935 non-null object

dtypes: float64(12), int64(1), object(40)

memory usage: 2.0+ MB


In [4]:

dataset.dtypes

Out[4]:

Unique Key int64

Created Date object

Closed Date object

Agency object

Agency Name object

Complaint Type object

Descriptor object

Location Type object

Incident Zip float64

Incident Address object

Street Name object

Cross Street 1 object

Cross Street 2 object

Intersection Street 1 object

Intersection Street 2 object

Address Type object

City object

Landmark object

Facility Type object

Status object

Due Date object

Resolution Description object

Resolution Action Updated Date object

Community Board object

Borough object

X Coordinate (State Plane) float64

Y Coordinate (State Plane) float64

Park Facility Name object

Park Borough object

School Name object

School Number object

School Region object

School Code object

School Phone Number object

School Address object

School City object

School State object

School Zip object

School Not Found object

School or Citywide Complaint float64

Vehicle Type float64

Taxi Company Borough float64

Taxi Pick Up Location float64

Bridge Highway Name object

Bridge Highway Direction object

Road Ramp object

Bridge Highway Segment object

Garage Lot Name float64

Ferry Direction float64

Ferry Terminal Name float64

Latitude float64

Longitude float64

Location object

dtype: object


2A: Read or convert the columns ‘Created Date’ and ‘Closed Date’ to datetime datatype

In [5]:

dataset['Created Date']=pd.to_datetime(dataset['Created Date'])

dataset['Closed Date']=pd.to_datetime(dataset['Closed Date'])
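The conversion above lets pandas infer the date layout. As a minimal sketch, assuming the two layouts visible in the first `head()` output (`12/31/2015 11:59:45 PM` for 'Created Date' and `01-01-16 0:55` for 'Closed Date') hold for every row, the formats could be passed explicitly so that month/day order is never guessed. The `sample` frame and both format strings are assumptions made for illustration; they are not part of the original notebook.

```python
import pandas as pd

# Two sample rows copied from the first head() output (illustrative only).
sample = pd.DataFrame({
    "Created Date": ["12/31/2015 11:59:45 PM", "12/31/2015 11:59:44 PM"],
    "Closed Date": ["01-01-16 0:55", "01-01-16 1:26"],
})

# Assumed formats, inferred from the displayed values only.
created = pd.to_datetime(sample["Created Date"], format="%m/%d/%Y %I:%M:%S %p")
# errors="coerce" turns any value that does not match into NaT instead of raising.
closed = pd.to_datetime(sample["Closed Date"], format="%m-%d-%y %H:%M", errors="coerce")

print(created.iloc[0])
print(closed.iloc[0])
```

If the real file mixes several layouts, the explicit format would produce `NaT` for the non-matching rows, which makes the problem visible rather than silently mis-parsed.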

In [6]:

dataset.head()

Out[6]:

   Unique Key  Created Date         Closed Date          Agency  Agency Name                      Complaint Type           Descriptor                    Location Type    ...
0  32310363    2015-12-31 23:59:45  2016-01-01 00:55:00  NYPD    New York City Police Department  Noise - Street/Sidewalk  Loud Music/Party              Street/Sidewalk  ...
1  32309934    2015-12-31 23:59:44  2016-01-01 01:26:00  NYPD    New York City Police Department  Blocked Driveway         No Access                     Street/Sidewalk  ...
2  32309159    2015-12-31 23:59:29  2016-01-01 04:51:00  NYPD    New York City Police Department  Blocked Driveway         No Access                     Street/Sidewalk  ...
3  32305098    2015-12-31 23:57:46  2016-01-01 07:43:00  NYPD    New York City Police Department  Illegal Parking          Commercial Overnight Parking  Street/Sidewalk  ...
4  32306529    2015-12-31 23:56:58  2016-01-01 03:24:00  NYPD    New York City Police Department  Illegal Parking          Blocked Sidewalk              Street/Sidewalk  ...

2B: Create a new column ‘Request_Closing_Time’ as the time elapsed between request creation and request closing

In [7]:

dataset['Request_Closing_Time'] = dataset['Closed Date']-dataset['Created Date']

dataset.head()

Out[7]:

   Unique Key  Created Date         Closed Date          Agency  Agency Name                      Complaint Type           Descriptor                    Location Type    ...
0  32310363    2015-12-31 23:59:45  2016-01-01 00:55:00  NYPD    New York City Police Department  Noise - Street/Sidewalk  Loud Music/Party              Street/Sidewalk  ...
1  32309934    2015-12-31 23:59:44  2016-01-01 01:26:00  NYPD    New York City Police Department  Blocked Driveway         No Access                     Street/Sidewalk  ...
2  32309159    2015-12-31 23:59:29  2016-01-01 04:51:00  NYPD    New York City Police Department  Blocked Driveway         No Access                     Street/Sidewalk  ...
3  32305098    2015-12-31 23:57:46  2016-01-01 07:43:00  NYPD    New York City Police Department  Illegal Parking          Commercial Overnight Parking  Street/Sidewalk  ...
4  32306529    2015-12-31 23:56:58  2016-01-01 03:24:00  NYPD    New York City Police Department  Illegal Parking          Blocked Sidewalk              Street/Sidewalk  ...
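Subtracting two datetime columns yields a `timedelta64[ns]` column. For averaging or plotting it is often handier to have a plain number, so here is a small sketch that converts the elapsed time to hours. The `created`, `closed`, `elapsed` and `hours` names are illustrative, and the two rows are copied from the output above.

```python
import pandas as pd

# Two rows copied from the head() output above (illustrative only).
created = pd.to_datetime(pd.Series(["2015-12-31 23:59:45", "2015-12-31 23:59:44"]))
closed = pd.to_datetime(pd.Series(["2016-01-01 00:55:00", "2016-01-01 01:26:00"]))

# Same subtraction as in the notebook: the result is a timedelta Series.
elapsed = closed - created

# Convert the timedeltas to fractional hours for numeric work.
hours = elapsed.dt.total_seconds() / 3600

print(elapsed)
print(hours.round(2))
```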

3: Provide major insights/patterns

In [8]:

dataset.columns

Out[8]:

Index(['Unique Key', 'Created Date', 'Closed Date', 'Agency', 'Agency Name',
       'Complaint Type', 'Descriptor', 'Location Type', 'Incident Zip',
       'Incident Address', 'Street Name', 'Cross Street 1', 'Cross Street 2',
       'Intersection Street 1', 'Intersection Street 2', 'Address Type',
       'City', 'Landmark', 'Facility Type', 'Status', 'Due Date',
       'Resolution Description', 'Resolution Action Updated Date',
       'Community Board', 'Borough', 'X Coordinate (State Plane)',
       'Y Coordinate (State Plane)', 'Park Facility Name', 'Park Borough',
       'School Name', 'School Number', 'School Region', 'School Code',
       'School Phone Number', 'School Address', 'School City', 'School State',
       'School Zip', 'School Not Found', 'School or Citywide Complaint',
       'Vehicle Type', 'Taxi Company Borough', 'Taxi Pick Up Location',
       'Bridge Highway Name', 'Bridge Highway Direction', 'Road Ramp',
       'Bridge Highway Segment', 'Garage Lot Name', 'Ferry Direction',
       'Ferry Terminal Name', 'Latitude', 'Longitude', 'Location',
       'Request_Closing_Time'],
      dtype='object')

In [9]:

# Drop columns that are not relevant to the analysis

dataset.drop(['Incident Address', 'Street Name', 'Cross Street 1', 'Cross Street 2',
              'Intersection Street 1', 'Intersection Street 2',
              'Resolution Description', 'Resolution Action Updated Date', 'Community Board',
              'X Coordinate (State Plane)', 'School or Citywide Complaint',
              'Vehicle Type', 'Taxi Company Borough', 'Taxi Pick Up Location', 'Garage Lot Name',
              'School Name', 'School Number', 'School Region', 'School Code',
              'School Phone Number', 'School Address', 'School City', 'School State',
              'School Zip', 'School Not Found', 'Ferry Direction', 'Ferry Terminal Name',
              'Unique Key', 'Bridge Highway Name', 'Bridge Highway Direction', 'Road Ramp',
              'Bridge Highway Segment'], axis=1, inplace=True)

In [10]:

dataset.head()

Out[10]:

   Created Date         Closed Date          Agency  Agency Name                      Complaint Type           Descriptor                    Location Type    Incident Zip  ...
0  2015-12-31 23:59:45  2016-01-01 00:55:00  NYPD    New York City Police Department  Noise - Street/Sidewalk  Loud Music/Party              Street/Sidewalk  10034.0       ...
1  2015-12-31 23:59:44  2016-01-01 01:26:00  NYPD    New York City Police Department  Blocked Driveway         No Access                     Street/Sidewalk  11105.0       ...
2  2015-12-31 23:59:29  2016-01-01 04:51:00  NYPD    New York City Police Department  Blocked Driveway         No Access                     Street/Sidewalk  10458.0       ...
3  2015-12-31 23:57:46  2016-01-01 07:43:00  NYPD    New York City Police Department  Illegal Parking          Commercial Overnight Parking  Street/Sidewalk  10461.0       ...
4  2015-12-31 23:56:58  2016-01-01 03:24:00  NYPD    New York City Police Department  Illegal Parking          Blocked Sidewalk              Street/Sidewalk  11373.0       ...

In [11]:

dataset.info()

<class 'pandas.core.frame.DataFrame'>

RangeIndex: 4975 entries, 0 to 4974

Data columns (total 22 columns):

# Column Non-Null Count Dtype

--- ------ -------------- -----

0 Created Date 4975 non-null datetime64[ns]

1 Closed Date 4953 non-null datetime64[ns]

2 Agency 4974 non-null object

3 Agency Name 4974 non-null object

4 Complaint Type 4974 non-null object

5 Descriptor 4887 non-null object

6 Location Type 4974 non-null object

7 Incident Zip 4945 non-null float64

8 Address Type 4944 non-null object

9 City 4945 non-null object

10 Landmark 3 non-null object

11 Facility Type 4954 non-null object

12 Status 4974 non-null object

13 Due Date 4974 non-null object

14 Borough 4974 non-null object

15 Y Coordinate (State Plane) 4935 non-null float64

16 Park Facility Name 4974 non-null object

17 Park Borough 4974 non-null object

18 Latitude 4935 non-null float64

19 Longitude 4935 non-null float64

20 Location 4935 non-null object

21 Request_Closing_Time 4953 non-null timedelta64[ns]

dtypes: datetime64[ns](2), float64(4), object(15), timedelta64[ns](1)

memory usage: 855.2+ KB


In [12]:

dataset.isna().sum()

Out[12]:

Created Date 0

Closed Date 22

Agency 1

Agency Name 1

Complaint Type 1

Descriptor 88

Location Type 1

Incident Zip 30

Address Type 31

City 30

Landmark 4972

Facility Type 21

Status 1

Due Date 1

Borough 1

Y Coordinate (State Plane) 40

Park Facility Name 1

Park Borough 1

Latitude 40

Longitude 40

Location 40

Request_Closing_Time 22

dtype: int64
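Raw null counts are easier to act on when expressed as a share of rows. The sketch below, on a small made-up frame (`demo`), shows one way to rank columns by missing share and drop those above a cut-off. The 0.5 threshold and the sample values are assumptions chosen for illustration, not values used in the notebook.

```python
import numpy as np
import pandas as pd

# Tiny illustrative frame; values are invented, column names mirror the dataset.
demo = pd.DataFrame({
    "Landmark": [np.nan, np.nan, np.nan, "X"],
    "City": ["BRONX", None, "ASTORIA", "BRONX"],
    "Agency": ["NYPD"] * 4,
})

# isna().mean() gives the fraction of missing values per column.
missing_share = demo.isna().mean().sort_values(ascending=False)

# Assumed cut-off: drop columns that are more than half empty.
threshold = 0.5
mostly_empty = missing_share[missing_share > threshold].index.tolist()
trimmed = demo.drop(columns=mostly_empty)

print(missing_share)
print(trimmed.columns.tolist())
```

In the output above, 'Landmark' has 4972 missing values out of 4975 rows, so a rule of this kind would flag it.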

In [13]:

#dropping created and closed date

dataset.drop(['Closed Date','Created Date'],axis=1,inplace=True)


In [14]:

dataset.head()

Out[14]:

   Agency  Agency Name                      Complaint Type           Descriptor                    Location Type    Incident Zip  Address Type  City      ...
0  NYPD    New York City Police Department  Noise - Street/Sidewalk  Loud Music/Party              Street/Sidewalk  10034.0       ADDRESS       NEW YORK  ...
1  NYPD    New York City Police Department  Blocked Driveway         No Access                     Street/Sidewalk  11105.0       ADDRESS       ASTORIA   ...
2  NYPD    New York City Police Department  Blocked Driveway         No Access                     Street/Sidewalk  10458.0       ADDRESS       BRONX     ...
3  NYPD    New York City Police Department  Illegal Parking          Commercial Overnight Parking  Street/Sidewalk  10461.0       ADDRESS       BRONX     ...
4  NYPD    New York City Police Department  Illegal Parking          Blocked Sidewalk              Street/Sidewalk  11373.0       ADDRESS       ELMHURST  ...

In [15]:

#dealing with missing values

dataset.isna().sum()

Out[15]:

Agency 1

Agency Name 1

Complaint Type 1

Descriptor 88

Location Type 1

Incident Zip 30

Address Type 31

City 30

Landmark 4972

Facility Type 21

Status 1

Due Date 1

Borough 1

Y Coordinate (State Plane) 40

Park Facility Name 1

Park Borough 1

Latitude 40

Longitude 40

Location 40

Request_Closing_Time 22

dtype: int64

In [16]:

dataset['Agency'].value_counts()

dataset['Agency Name'].value_counts()

sns.countplot(x=dataset['Agency Name'])

Out[16]:

<AxesSubplot:xlabel='Agency Name', ylabel='count'>


In [17]:

dataset['Complaint Type'].value_counts().head()

plot=sns.countplot(x=dataset['Complaint Type'])

plot.set_xticklabels(plot.get_xticklabels(),rotation=90)

Out[17]:

[Text(0, 0, 'Noise - Street/Sidewalk'),

Text(1, 0, 'Blocked Driveway'),

Text(2, 0, 'Illegal Parking'),

Text(3, 0, 'Derelict Vehicle'),

Text(4, 0, 'Noise - Commercial'),

Text(5, 0, 'Noise - House of Worship'),

Text(6, 0, 'Posting Advertisement'),

Text(7, 0, 'Noise - Vehicle'),

Text(8, 0, 'Animal Abuse'),

Text(9, 0, 'Vending'),

Text(10, 0, 'Traffic'),

Text(11, 0, 'Drinking'),

Text(12, 0, 'Bike/Roller/Skate Chronic'),

Text(13, 0, 'Panhandling'),

Text(14, 0, 'Noise - Park'),

Text(15, 0, 'Homeless Encampment'),

Text(16, 0, 'Urinating in Public'),

Text(17, 0, 'Graffiti'),

Text(18, 0, 'Disorderly Youth')]
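With this many categories, rotated tick labels can be hard to read. One alternative, sketched here on a small invented frame (`demo`), is to put the categories on the y-axis and sort the bars by frequency via the `order` argument of `sns.countplot`. The sample rows and the `Agg` backend line (only there so the snippet runs without a display) are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd
import seaborn as sns

# Invented sample; in the notebook this would be the full dataset.
demo = pd.DataFrame({
    "Complaint Type": ["Blocked Driveway"] * 3 + ["Illegal Parking"] * 2 + ["Vending"],
})

# value_counts() sorts descending, so its index is a frequency-ordered category list.
order = demo["Complaint Type"].value_counts().index.tolist()

# Horizontal bars: categories on the y-axis, keyword arguments throughout.
ax = sns.countplot(data=demo, y="Complaint Type", order=order)
ax.set_xlabel("count")
```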


In [18]:

dataset['Descriptor'].isna().sum()

Out[18]:

88

In [19]:

dataset['Descriptor'].describe()

Out[19]:

count 4887

unique 38

top No Access

freq 1268

Name: Descriptor, dtype: object

In [20]:

dataset['Descriptor'].value_counts().head(5)

Out[20]:

No Access 1268

Loud Music/Party 661

Posted Parking Sign Violation 413

Partial Access 412

Blocked Hydrant 345

Name: Descriptor, dtype: int64


In [21]:

plot2=sns.countplot(x=dataset['Descriptor'])

plot2.set_xticklabels(plot2.get_xticklabels(),rotation=90)

Out[21]:

[Text(0, 0, 'Loud Music/Party'),

Text(1, 0, 'No Access'),

Text(2, 0, 'Commercial Overnight Parking'),

Text(3, 0, 'Blocked Sidewalk'),

Text(4, 0, 'Posted Parking Sign Violation'),

Text(5, 0, 'Blocked Hydrant'),

Text(6, 0, 'With License Plate'),

Text(7, 0, 'Partial Access'),

Text(8, 0, 'Unauthorized Bus Layover'),

Text(9, 0, 'Double Parked Blocking Vehicle'),

Text(10, 0, 'Double Parked Blocking Traffic'),

Text(11, 0, 'Vehicle'),

Text(12, 0, 'Loud Talking'),

Text(13, 0, 'Banging/Pounding'),

Text(14, 0, 'Car/Truck Music'),

Text(15, 0, 'Tortured'),

Text(16, 0, 'In Prohibited Area'),

Text(17, 0, 'Congestion/Gridlock'),

Text(18, 0, 'Neglected'),

Text(19, 0, 'Car/Truck Horn'),

Text(20, 0, 'In Public'),

Text(21, 0, 'Other (complaint details)'),

Text(22, 0, 'No Shelter'),

Text(23, 0, 'Truck Route Violation'),

Text(24, 0, 'Unlicensed'),

Text(25, 0, 'Overnight Commercial Storage'),

Text(26, 0, 'Engine Idling'),

Text(27, 0, 'After Hours - Licensed Est'),

Text(28, 0, 'Detached Trailer'),

Text(29, 0, 'Underage - Licensed Est'),

Text(30, 0, 'Chronic Stoplight Violation'),

Text(31, 0, 'Loud Television'),

Text(32, 0, 'Chained'),

Text(33, 0, 'Building'),

Text(34, 0, 'In Car'),

Text(35, 0, 'Police Report Requested'),

Text(36, 0, 'Chronic Speeding'),

Text(37, 0, 'Playing in Unsuitable Place')]


In [22]:

dataset['Location Type'].isna().sum()

Out[22]:

1

In [23]:

dataset['Location Type'].value_counts().head()

Out[23]:

Street/Sidewalk 4173

Store/Commercial 380

Club/Bar/Restaurant 270

Residential Building/House 115

Park/Playground 14

Name: Location Type, dtype: int64


In [24]:

# Fill the single missing value with the most frequent location type
dataset['Location Type']=dataset['Location Type'].fillna('Street/Sidewalk')

plot3=sns.countplot(x=dataset['Location Type'])

plot3.set_xticklabels(plot3.get_xticklabels(),rotation=90)

Out[24]:

[Text(0, 0, 'Street/Sidewalk'),

Text(1, 0, 'Club/Bar/Restaurant'),

Text(2, 0, 'Store/Commercial'),

Text(3, 0, 'House of Worship'),

Text(4, 0, 'Residential Building/House'),

Text(5, 0, 'Residential Building'),

Text(6, 0, 'Park/Playground'),

Text(7, 0, 'Vacant Lot'),

Text(8, 0, 'House and Store'),

Text(9, 0, 'Highway'),

Text(10, 0, 'Commercial')]


In [25]:

dataset['Incident Zip'].value_counts().head()

dataset['Incident Zip'].isna().sum()

dataset['Incident Zip']=dataset['Incident Zip'].fillna(11385)
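The fill values in the cells above are typed in by hand. As a hedged sketch, assuming the intent is to impute each column with its most frequent value, the mode can be computed instead so the code stays correct if the data changes. The `demo` frame and its values are invented for illustration.

```python
import numpy as np
import pandas as pd

# Invented sample with one missing value per column.
demo = pd.DataFrame({
    "Location Type": ["Street/Sidewalk", "Street/Sidewalk", "Store/Commercial", np.nan],
    "Incident Zip": [11385.0, 11385.0, 10034.0, np.nan],
})

for column in ["Location Type", "Incident Zip"]:
    # mode() returns a Series (ties are possible); take the first entry.
    most_common = demo[column].mode().iloc[0]
    # Assign the result back instead of using inplace=True on a column selection.
    demo[column] = demo[column].fillna(most_common)

print(demo)
```

Assigning the filled column back, rather than calling `fillna(..., inplace=True)` on `dataset[column]`, avoids relying on the selection being a view of the original frame.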


In [26]:

dataset['Address Type'].value_counts()

# Existing values are upper-case ('ADDRESS'), so fill with the same spelling
dataset['Address Type']=dataset['Address Type'].fillna('ADDRESS')

sns.countplot(x=dataset['Address Type'])

Out[26]:

<AxesSubplot:xlabel='Address Type', ylabel='count'>

In [27]:

dataset.drop(['Latitude', 'Longitude','Location','Y Coordinate (State Plane)','Landmark'],axis=1,inplace=True)

In [37]:

dataset.isna().sum()

Out[37]:

Agency 1

Agency Name 1

Complaint Type 1

Descriptor 88

Location Type 0

Incident Zip 0

Address Type 0

City 0

Facility Type 21

Status 1

Due Date 1

Borough 1

Park Facility Name 1

Park Borough 1

Request_Closing_Time 0

dtype: int64


In [39]:

dataset['City'].value_counts().head()

dataset['Facility Type'].value_counts().head()

Out[39]:

Precinct 4954

Name: Facility Type, dtype: int64


In [40]:

dataset['City']=dataset['City'].fillna('BROOKLYN')

dataset['City'].value_counts().head()

plot4=sns.countplot(x=dataset['City'])

plot4.set_xticklabels(plot4.get_xticklabels(),rotation=90)


Out[40]:

[Text(0, 0, 'NEW YORK'),

Text(1, 0, 'ASTORIA'),

Text(2, 0, 'BRONX'),

Text(3, 0, 'ELMHURST'),

Text(4, 0, 'BROOKLYN'),

Text(5, 0, 'KEW GARDENS'),

Text(6, 0, 'JACKSON HEIGHTS'),

Text(7, 0, 'MIDDLE VILLAGE'),

Text(8, 0, 'REGO PARK'),

Text(9, 0, 'SAINT ALBANS'),

Text(10, 0, 'JAMAICA'),

Text(11, 0, 'SOUTH RICHMOND HILL'),

Text(12, 0, 'RIDGEWOOD'),

Text(13, 0, 'HOWARD BEACH'),

Text(14, 0, 'FOREST HILLS'),

Text(15, 0, 'STATEN ISLAND'),

Text(16, 0, 'OZONE PARK'),

Text(17, 0, 'RICHMOND HILL'),

Text(18, 0, 'WOODHAVEN'),

Text(19, 0, 'FLUSHING'),

Text(20, 0, 'CORONA'),

Text(21, 0, 'QUEENS VILLAGE'),

Text(22, 0, 'OAKLAND GARDENS'),

Text(23, 0, 'HOLLIS'),

Text(24, 0, 'MASPETH'),

Text(25, 0, 'EAST ELMHURST'),

Text(26, 0, 'SOUTH OZONE PARK'),

Text(27, 0, 'WOODSIDE'),

Text(28, 0, 'FRESH MEADOWS'),

Text(29, 0, 'LONG ISLAND CITY'),

Text(30, 0, 'ROCKAWAY PARK'),

Text(31, 0, 'SPRINGFIELD GARDENS'),

Text(32, 0, 'COLLEGE POINT'),

Text(33, 0, 'BAYSIDE'),

Text(34, 0, 'GLEN OAKS'),

Text(35, 0, 'FAR ROCKAWAY'),

Text(36, 0, 'BELLEROSE'),

Text(37, 0, 'LITTLE NECK'),

Text(38, 0, 'CAMBRIA HEIGHTS'),

Text(39, 0, 'ROSEDALE'),

Text(40, 0, 'SUNNYSIDE'),

Text(41, 0, 'WHITESTONE'),

Text(42, 0, 'ARVERNE')]


In [41]:

dataset.isna().sum()

Out[41]:

Agency 1

Agency Name 1

Complaint Type 1

Descriptor 88

Location Type 0

Incident Zip 0

Address Type 0

City 0

Facility Type 21

Status 1

Due Date 1

Borough 1

Park Facility Name 1

Park Borough 1

Request_Closing_Time 0

dtype: int64

4: Order the complaint types based on the average ‘Request_Closing_Time’, grouping them for different locations.

In [42]:

dataset['Request_Closing_Time'].head()

Out[42]:

0 0 days 00:55:15

1 0 days 01:26:16

2 0 days 04:51:31

3 0 days 07:45:14

4 0 days 03:27:02

Name: Request_Closing_Time, dtype: timedelta64[ns]

In [43]:

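# Fill missing closing times (requests that are not closed yet) with the mean closing time
dataset['Request_Closing_Time']=dataset['Request_Closing_Time'].fillna(dataset['Request_Closing_Time'].mean())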

In [44]:

dataset['Request_Closing_Time'].isna().sum()

Out[44]:

0

In [48]:

dataset['Request_Closing_Time'].dtypes

Out[48]:

dtype('<m8[ns]')
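The task stated for this section (ordering complaint types by average closing time within each location) can be sketched with a grouped mean followed by a sort. This is a minimal illustration on an invented frame (`demo`), not the notebook's own solution; it assumes 'City' is the location column and converts the timedeltas to hours first so the aggregation runs on plain floats.

```python
import pandas as pd

# Invented sample; column names mirror the dataset.
demo = pd.DataFrame({
    "City": ["BRONX", "BRONX", "BRONX", "ASTORIA", "ASTORIA"],
    "Complaint Type": ["Blocked Driveway", "Blocked Driveway", "Illegal Parking",
                       "Illegal Parking", "Blocked Driveway"],
    "Request_Closing_Time": pd.to_timedelta(["4h", "2h", "1h", "5h", "30min"]),
})

# Work in hours so the aggregation runs on plain floats.
demo["closing_hours"] = demo["Request_Closing_Time"].dt.total_seconds() / 3600

# Average per (location, complaint type), then order within each location.
avg_closing = (
    demo.groupby(["City", "Complaint Type"])["closing_hours"]
        .mean()
        .reset_index()
        .sort_values(["City", "closing_hours"])
        .reset_index(drop=True)
)

print(avg_closing)
```

Sorting on `["City", "closing_hours"]` keeps each location's rows together while ranking complaint types from fastest to slowest inside it.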


In [49]:

dataset.head(10)

Out[49]:

   Agency  Agency Name                      Complaint Type           Descriptor                     Location Type    Incident Zip  Address Type  City        ...
0  NYPD    New York City Police Department  Noise - Street/Sidewalk  Loud Music/Party               Street/Sidewalk  10034.0       ADDRESS       NEW YORK    ...
1  NYPD    New York City Police Department  Blocked Driveway         No Access                      Street/Sidewalk  11105.0       ADDRESS       ASTORIA     ...
2  NYPD    New York City Police Department  Blocked Driveway         No Access                      Street/Sidewalk  10458.0       ADDRESS       BRONX       ...
3  NYPD    New York City Police Department  Illegal Parking          Commercial Overnight Parking   Street/Sidewalk  10461.0       ADDRESS       BRONX       ...
4  NYPD    New York City Police Department  Illegal Parking          Blocked Sidewalk               Street/Sidewalk  11373.0       ADDRESS       ELMHURST    ...
5  NYPD    New York City Police Department  Illegal Parking          Posted Parking Sign Violation  Street/Sidewalk  11215.0       ADDRESS       BRO...      ...
6  NYPD    New York City Police Department  Illegal Parking          Blocked Hydrant                Street/Sidewalk  10032.0       ADDRESS       NEW...      ...
7  NYPD    New York City Police Department  Blocked Driveway         No Access                      Street/Sidewalk  10457.0       ADDRESS       B...        ...
8  NYPD    New York City Police Department  Illegal Parking          Posted Parking Sign Violation  Street/Sidewalk  11415.0       ADDRESS       ... GAR...  ...
9  NYPD    New York City Police Department  Blocked Driveway         No Access                      Street/Sidewalk  11219.0       ADDRESS       BRO...      ...

In [50]:

dataset['Status'].value_counts()

Out[50]:

Closed 4953

Open 16

Assigned 5

Name: Status, dtype: int64

In [51]:

sns.countplot(x=dataset['Status'])

Out[51]:

<AxesSubplot:xlabel='Status', ylabel='count'>

In [52]:

#bivariate analysis

#the most common complaint

dataset['Complaint Type'].value_counts().head(6)

Out[52]:

Blocked Driveway 1680

Illegal Parking 1415

Noise - Commercial 611

Noise - Street/Sidewalk 346

Derelict Vehicle 340

Noise - Vehicle 158

Name: Complaint Type, dtype: int64


In [53]:

dataset.head(3)

Out[53]:

   Agency  Agency Name                      Complaint Type           Descriptor        Location Type    Incident Zip  Address Type  City      ...
0  NYPD    New York City Police Department  Noise - Street/Sidewalk  Loud Music/Party  Street/Sidewalk  10034.0       ADDRESS       NEW YORK  ...
1  NYPD    New York City Police Department  Blocked Driveway         No Access         Street/Sidewalk  11105.0       ADDRESS       ASTORIA   ...
2  NYPD    New York City Police Department  Blocked Driveway         No Access         Street/Sidewalk  10458.0       ADDRESS       BRONX     ...

In [54]:

desc=dataset.groupby(by='Complaint Type')['Descriptor'].agg('count')

desc

Out[54]:

Complaint Type

Animal Abuse 132

Bike/Roller/Skate Chronic 0

Blocked Driveway 1680

Derelict Vehicle 340

Disorderly Youth 1

Drinking 18

Graffiti 1

Homeless Encampment 0

Illegal Parking 1415

Noise - Commercial 611

Noise - House of Worship 7

Noise - Park 9

Noise - Street/Sidewalk 346

Noise - Vehicle 158

Panhandling 0

Posting Advertisement 31

Traffic 42

Urinating in Public 0

Vending 96

Name: Descriptor, dtype: int64
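The zeros in this output do not mean those complaint types have no rows: `count` only tallies non-null 'Descriptor' values, so a complaint type whose descriptors are all missing shows 0. A short sketch on an invented frame (`demo`) contrasts `count()` with `size()`, which counts rows regardless of nulls.

```python
import numpy as np
import pandas as pd

# Invented sample: both 'Panhandling' rows lack a descriptor.
demo = pd.DataFrame({
    "Complaint Type": ["Panhandling", "Panhandling", "Vending"],
    "Descriptor": [np.nan, np.nan, "Unlicensed"],
})

# count() skips NaN values in the selected column.
non_null_descriptors = demo.groupby("Complaint Type")["Descriptor"].count()

# size() counts every row in each group.
rows_per_type = demo.groupby("Complaint Type").size()

print(non_null_descriptors)
print(rows_per_type)
```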


In [55]:

# Boroughs recorded for requests whose City is 'NEW YORK'

dataset.loc[dataset['City']=='NEW YORK','Borough'].value_counts()

Out[55]:

MANHATTAN 885

Name: Borough, dtype: int64

In [56]:

# Requests with City 'NEW YORK': count per borough, split by status

new_york=dataset[dataset['City']=='NEW YORK']

sns.countplot(x='Borough',hue='Status',data=new_york)

Out[56]:

<AxesSubplot:xlabel='Borough', ylabel='count'>

In [57]:

# Which complaint types are most common in New York City?

dataset.loc[dataset['City']=='NEW YORK','Complaint Type'].value_counts()

Out[57]:

Noise - Commercial 254

Illegal Parking 208

Noise - Street/Sidewalk 147

Vending 74

Noise - Vehicle 58

Homeless Encampment 55

Blocked Driveway 28

Animal Abuse 25

Derelict Vehicle 11

Traffic 11

Noise - Park 4

Panhandling 3

Drinking 2

Urinating in Public 2

Noise - House of Worship 2

Bike/Roller/Skate Chronic 1

Name: Complaint Type, dtype: int64


In [58]:

# Countplot of complaint types in New York City

plot=sns.countplot(x=dataset.loc[dataset['City']=='NEW YORK','Complaint Type'])

plot.set_xticklabels(plot.get_xticklabels(),rotation = 90)

Out[58]:

[Text(0, 0, 'Noise - Street/Sidewalk'),

Text(1, 0, 'Illegal Parking'),

Text(2, 0, 'Noise - House of Worship'),

Text(3, 0, 'Noise - Commercial'),

Text(4, 0, 'Blocked Driveway'),

Text(5, 0, 'Vending'),

Text(6, 0, 'Noise - Vehicle'),

Text(7, 0, 'Panhandling'),

Text(8, 0, 'Animal Abuse'),

Text(9, 0, 'Noise - Park'),

Text(10, 0, 'Homeless Encampment'),

Text(11, 0, 'Traffic'),

Text(12, 0, 'Derelict Vehicle'),

Text(13, 0, 'Drinking'),

Text(14, 0, 'Urinating in Public'),

Text(15, 0, 'Bike/Roller/Skate Chronic')]

In [59]:

# Average time taken to close a request in New York City

dataset.loc[(dataset['City']=='NEW YORK')&(dataset['Status']=='Closed'),'Request_Closing_Time'].mean()

Out[59]:

Timedelta('0 days 02:54:44.818799546')


In [60]:

# Standard deviation of the closing time for closed requests in New York City

dataset.loc[(dataset['City']=='NEW YORK')&(dataset['Status']=='Closed'),'Request_Closing_Time'].std()

Out[60]:

Timedelta('0 days 03:43:53.801186422')
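The mean and standard deviation above are computed for one city in two separate cells. As a sketch under assumptions, the same statistics can be obtained for every city at once with `groupby(...).agg(...)`. The `demo` frame, its values, and the `hours` column (closing time already converted to hours) are invented for illustration.

```python
import numpy as np
import pandas as pd

# Invented sample; 'hours' stands in for Request_Closing_Time expressed in hours.
demo = pd.DataFrame({
    "City": ["NEW YORK", "NEW YORK", "BRONX", "NEW YORK"],
    "Status": ["Closed", "Closed", "Closed", "Open"],
    "hours": [1.0, 3.0, 2.0, np.nan],
})

# Keep closed requests only, as in the cells above.
closed_requests = demo[demo["Status"] == "Closed"]

# One row per city with mean, sample standard deviation and number of requests.
summary = closed_requests.groupby("City")["hours"].agg(["mean", "std", "count"])

print(summary)
```

A city with a single closed request gets `NaN` for `std`, since the sample standard deviation needs at least two values.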

In [61]:

dataset['Borough'].value_counts()

Out[61]:

BROOKLYN 1641

QUEENS 1495

MANHATTAN 885

BRONX 684

STATEN ISLAND 240

Unspecified 29

Name: Borough, dtype: int64

In [62]:

dataset['Location Type'].value_counts()

Out[62]:

Street/Sidewalk 4174

Store/Commercial 380

Club/Bar/Restaurant 270

Residential Building/House 115

Park/Playground 14

Residential Building 8

House of Worship 7

House and Store 3

Vacant Lot 2

Highway 1

Commercial 1

Name: Location Type, dtype: int64


In [63]:

# Countplot of the three most common location types, split by borough

top_locations=['Street/Sidewalk','Store/Commercial','Club/Bar/Restaurant']

sns.countplot(x='Location Type',hue='Borough',data=dataset[dataset['Location Type'].isin(top_locations)])

Out[63]:

<AxesSubplot:xlabel='Location Type', ylabel='count'>


In [64]:

# Extract the year from 'Due Date' (pandas handles the parsing; no extra import is needed)

dataset['year'] = pd.DatetimeIndex(dataset['Due Date']).year

dataset.head()

Out[64]:

   Agency  Agency Name                      Complaint Type           Descriptor                    Location Type    Incident Zip  Address Type

0  NYPD    New York City Police Department  Noise - Street/Sidewalk  Loud Music/Party              Street/Sidewalk  10034.0       ADDRESS       NEW

1  NYPD    New York City Police Department  Blocked Driveway         No Access                     Street/Sidewalk  11105.0       ADDRESS       AS

2  NYPD    New York City Police Department  Blocked Driveway         No Access                     Street/Sidewalk  10458.0       ADDRESS       B

3  NYPD    New York City Police Department  Illegal Parking          Commercial Overnight Parking  Street/Sidewalk  10461.0       ADDRESS       B

4  NYPD    New York City Police Department  Illegal Parking          Blocked Sidewalk              Street/Sidewalk  11373.0       ADDRESS       ELM
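Year extraction depends on the date strings parsing cleanly. A hedged sketch of a more defensive variant using `pd.to_datetime` with `errors='coerce'`. The date format string and the sample values are assumptions for illustration, not taken from the `Due Date` column:

```python
import pandas as pd

# Hypothetical sample values; the real 'Due Date' format may differ.
due = pd.Series(['01/01/2016 07:59:45 AM', 'not a date', None])

# errors='coerce' turns unparseable or missing entries into NaT
# instead of raising an exception.
parsed = pd.to_datetime(due, format='%m/%d/%Y %I:%M:%S %p', errors='coerce')

# .dt.year yields floats when NaT is present; the nullable 'Int64' dtype
# keeps the years as integers and the gaps as <NA>.
years = parsed.dt.year.astype('Int64')
print(years)
```

Counting `years.isna()` afterwards gives a quick check of how many rows failed to parse.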


In [65]:

#Requests per year, split by borough (x passed by keyword)

sns.countplot(x='year',hue='Borough',data=dataset)

Out[65]:

<AxesSubplot:xlabel='year', ylabel='count'>

In [66]:

dataset['Location Type'].value_counts()

Out[66]:

Street/Sidewalk 4174

Store/Commercial 380

Club/Bar/Restaurant 270

Residential Building/House 115

Park/Playground 14

Residential Building 8

House of Worship 7

House and Store 3

Vacant Lot 2

Highway 1

Commercial 1

Name: Location Type, dtype: int64


In [67]:

#Display the complaint type and city together

dataset[['Complaint Type','City']].head()

Out[67]:

Complaint Type City

0 Noise - Street/Sidewalk NEW YORK

1 Blocked Driveway ASTORIA

2 Blocked Driveway BRONX

3 Illegal Parking BRONX

4 Illegal Parking ELMHURST

In [68]:

#Find the top 10 complaint types

dataset['Complaint Type'].value_counts().head(10)

Out[68]:

Blocked Driveway 1680

Illegal Parking 1415

Noise - Commercial 611

Noise - Street/Sidewalk 346

Derelict Vehicle 340

Noise - Vehicle 158

Animal Abuse 132

Vending 96

Homeless Encampment 75

Traffic 42

Name: Complaint Type, dtype: int64


In [69]:

#Plot a bar graph of count vs. complaint types

plot3=sns.countplot(x='Complaint Type',data=dataset)

#Rotate the tick labels so the long complaint names stay readable

plot3.set_xticklabels(plot3.get_xticklabels(),rotation =90)

Out[69]:

[Text(0, 0, 'Noise - Street/Sidewalk'),

Text(1, 0, 'Blocked Driveway'),

Text(2, 0, 'Illegal Parking'),

Text(3, 0, 'Derelict Vehicle'),

Text(4, 0, 'Noise - Commercial'),

Text(5, 0, 'Noise - House of Worship'),

Text(6, 0, 'Posting Advertisement'),

Text(7, 0, 'Noise - Vehicle'),

Text(8, 0, 'Animal Abuse'),

Text(9, 0, 'Vending'),

Text(10, 0, 'Traffic'),

Text(11, 0, 'Drinking'),

Text(12, 0, 'Bike/Roller/Skate Chronic'),

Text(13, 0, 'Panhandling'),

Text(14, 0, 'Noise - Park'),

Text(15, 0, 'Homeless Encampment'),

Text(16, 0, 'Urinating in Public'),

Text(17, 0, 'Graffiti'),

Text(18, 0, 'Disorderly Youth')]
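The tick labels above follow the order in which each complaint type first appears in the data, not its frequency. A minimal sketch, on hypothetical toy data, of building a frequency-sorted category list with pandas:

```python
import pandas as pd

# Toy stand-in for dataset['Complaint Type'] (hypothetical values).
complaints = pd.Series(['Blocked Driveway', 'Illegal Parking', 'Blocked Driveway',
                        'Vending', 'Illegal Parking', 'Blocked Driveway'])

# value_counts() sorts descending by count, so its index is already
# the categories from most to least frequent.
order = complaints.value_counts().index.tolist()
print(order)
```

A list built this way from `dataset['Complaint Type']` could be passed to `sns.countplot` through its `order` parameter so the bars run from most to least frequent.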


In [70]:

#Display the major complaint types and their count

#top 5 complaint types

series=dataset['Complaint Type'].value_counts().head(5)

series.nlargest().index

Out[70]:

Index(['Blocked Driveway', 'Illegal Parking', 'Noise - Commercial',

'Noise - Street/Sidewalk', 'Derelict Vehicle'],

dtype='object')

In [71]:

#graph

plot4=sns.barplot(x=series.nlargest().index,y=series.nlargest().values)

plot4.set_xticklabels(plot4.get_xticklabels(),rotation =90)

Out[71]:

[Text(0, 0, 'Blocked Driveway'),

Text(1, 0, 'Illegal Parking'),

Text(2, 0, 'Noise - Commercial'),

Text(3, 0, 'Noise - Street/Sidewalk'),

Text(4, 0, 'Derelict Vehicle')]
