
Name: John Alexis A. Ferrer
Course and Section: CPE019-CPE32S3
Date of Submission: April 06, 2021
Instructor: Engr. Jonathan V. Taylar

In [27]: import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import folium
import warnings
warnings.filterwarnings('ignore')
%matplotlib inline

In [26]: !pip install folium

Requirement already satisfied: folium in c:\programdata\anaconda3\lib\site-packages (0.12.1)


Requirement already satisfied: jinja2>=2.9 in c:\programdata\anaconda3\lib\site-packages (from folium) (2.11.2)
Requirement already satisfied: branca>=0.3.0 in c:\programdata\anaconda3\lib\site-packages (from folium) (0.4.2)
Requirement already satisfied: numpy in c:\programdata\anaconda3\lib\site-packages (from folium) (1.19.2)
Requirement already satisfied: requests in c:\programdata\anaconda3\lib\site-packages (from folium) (2.24.0)
Requirement already satisfied: MarkupSafe>=0.23 in c:\programdata\anaconda3\lib\site-packages (from jinja2>=2.9->folium) (1.1.1)
Requirement already satisfied: idna<3,>=2.5 in c:\programdata\anaconda3\lib\site-packages (from requests->folium) (2.10)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in c:\programdata\anaconda3\lib\site-packages (from requests->folium) (1.25.11)
Requirement already satisfied: certifi>=2017.4.17 in c:\programdata\anaconda3\lib\site-packages (from requests->folium) (2020.6.20)
Requirement already satisfied: chardet<4,>=3.0.2 in c:\programdata\anaconda3\lib\site-packages (from requests->folium) (3.0.4)

In [10]: import pandas as pd


pse = pd.read_csv('Police_Department_Incidents_-_Previous_Year__2016_.csv')

In [16]: pse.head()

Out[16]: IncidntNum Category Descript DayOfWeek Date Time PdDistrict Resolution Address X Y Location PdId

0 120058272 WEAPON LAWS POSS OF PROHIBITED WEAPON Friday 01/29/2016 12:00:00 AM 11:00 SOUTHERN ARREST, BOOKED 800 Block of BRYANT ST -122.403405 37.775421 (37.775420706711, -122.403404791479) 12005827212120

1 120058272 WEAPON LAWS FIREARM, LOADED, IN VEHICLE, POSSESSION OR USE Friday 01/29/2016 12:00:00 AM 11:00 SOUTHERN ARREST, BOOKED 800 Block of BRYANT ST -122.403405 37.775421 (37.775420706711, -122.403404791479) 12005827212168

2 141059263 WARRANTS WARRANT ARREST Monday 04/25/2016 12:00:00 AM 14:59 BAYVIEW ARREST, BOOKED KEITH ST / SHAFTER AV -122.388856 37.729981 (37.7299809672996, -122.388856204292) 14105926363010

3 160013662 NON-CRIMINAL LOST PROPERTY Tuesday 01/05/2016 12:00:00 AM 23:50 TENDERLOIN NONE JONES ST / OFARRELL ST -122.412971 37.785788 (37.7857883766888, -122.412970537591) 16001366271000

4 160002740 NON-CRIMINAL LOST PROPERTY Friday 01/01/2016 12:00:00 AM 00:30 MISSION NONE 16TH ST / MISSION ST -122.419672 37.765050 (37.7650501214668, -122.419671780296) 16000274071000

In [14]: pd.set_option('display.max_rows',10)
pse

Out[14]: IncidntNum Category Descript DayOfWeek Date Time PdDistrict Resolution Address X Y Location PdId

0 120058272 WEAPON LAWS POSS OF PROHIBITED WEAPON Friday 01/29/2016 12:00:00 AM 11:00 SOUTHERN ARREST, BOOKED 800 Block of BRYANT ST -122.403405 37.775421 (37.775420706711, -122.403404791479) 12005827212120

1 120058272 WEAPON LAWS FIREARM, LOADED, IN VEHICLE, POSSESSION OR USE Friday 01/29/2016 12:00:00 AM 11:00 SOUTHERN ARREST, BOOKED 800 Block of BRYANT ST -122.403405 37.775421 (37.775420706711, -122.403404791479) 12005827212168

2 141059263 WARRANTS WARRANT ARREST Monday 04/25/2016 12:00:00 AM 14:59 BAYVIEW ARREST, BOOKED KEITH ST / SHAFTER AV -122.388856 37.729981 (37.7299809672996, -122.388856204292) 14105926363010

3 160013662 NON-CRIMINAL LOST PROPERTY Tuesday 01/05/2016 12:00:00 AM 23:50 TENDERLOIN NONE JONES ST / OFARRELL ST -122.412971 37.785788 (37.7857883766888, -122.412970537591) 16001366271000

4 160002740 NON-CRIMINAL LOST PROPERTY Friday 01/01/2016 12:00:00 AM 00:30 MISSION NONE 16TH ST / MISSION ST -122.419672 37.765050 (37.7650501214668, -122.419671780296) 16000274071000

... ... ... ... ... ... ... ... ... ... ... ... ... ...

150495 161061000 ASSAULT BATTERY Friday 12/30/2016 12:00:00 AM 21:01 PARK NONE OAK ST / STANYAN ST -122.453982 37.771428 (37.7714278595913, -122.453981622365) 16106100004134

150496 176000742 NON-CRIMINAL LOST PROPERTY Friday 12/30/2016 12:00:00 AM 08:00 CENTRAL NONE JACKSON ST / SANSOME ST -122.401857 37.796626 (37.7966261239618, -122.401857374739) 17600074271000

150497 176000758 LARCENY/THEFT PETTY THEFT OF PROPERTY Thursday 12/29/2016 12:00:00 AM 20:00 CENTRAL NONE PINE ST / TAYLOR ST -122.412269 37.790673 (37.7906727649886, -122.41226909106) 17600075806372

150498 176000764 LARCENY/THEFT GRAND THEFT OF PROPERTY Friday 12/30/2016 12:00:00 AM 10:00 CENTRAL NONE 200 Block of STOCKTON ST -122.406659 37.788275 (37.7882745285785, -122.406658711008) 17600076406374

150499 179002868 OTHER OFFENSES FRAUDULENT GAME OR TRICK, OBTAINING MONEY OR P... Friday 12/02/2016 12:00:00 AM 14:00 SOUTHERN NONE 800 Block of BRYANT ST -122.403405 37.775421 (37.775420706711, -122.403404791479) 17900286809024

150500 rows × 13 columns

In [19]: pse.columns

Out[19]: Index(['IncidntNum', 'Category', 'Descript', 'DayOfWeek', 'Date', 'Time',
       'PdDistrict', 'Resolution', 'Address', 'X', 'Y', 'Location', 'PdId'],
      dtype='object')

In [20]: len(pse)

Out[20]: 150500

How many variables are contained in the data frame?

In [22]: print('Number of variables in the data frame: {}'
      .format(len(pse.columns)))

Number of variables in the data frame: 13

In [23]: pse['Month'] = pse['Date'].apply(lambda row: int(row[0:2]))


pse['Day'] = pse['Date'].apply(lambda row: int(row[3:5]))

In [24]: print(pse['Month'][0:2])
print(pse['Day'][0:2])

0 1
1 1
Name: Month, dtype: int64
0 29
1 29
Name: Day, dtype: int64

In [28]: print(type(pse['Month'][0]))

<class 'numpy.int64'>
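The string slicing above depends on the month and day always sitting at fixed character positions. As an alternative sketch, the column could be parsed once with `pd.to_datetime` and the parts read through the `.dt` accessor. The two-row frame `df` and the format string are assumptions based on the `Date` values shown in `pse.head()`, not part of the original notebook.

```python
import pandas as pd

# Toy frame mimicking the 'Date' column format of the SF data set (illustrative rows only).
df = pd.DataFrame({"Date": ["01/29/2016 12:00:00 AM", "07/04/2016 12:00:00 AM"]})

# Parse the strings once with an explicit format, then extract the parts via .dt.
parsed = pd.to_datetime(df["Date"], format="%m/%d/%Y %I:%M:%S %p")
df["Month"] = parsed.dt.month
df["Day"] = parsed.dt.day

print(df[["Month", "Day"]])
```

Parsing with an explicit format raises an error on malformed rows instead of silently producing a wrong integer, which may be preferable for a data set of this size.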

In [30]: del pse['IncidntNum']

In [31]: pse.drop('Location', axis=1, inplace=True )

In [32]: pse.columns

Out[32]: Index(['Category', 'Descript', 'DayOfWeek', 'Date', 'Time', 'PdDistrict',
       'Resolution', 'Address', 'X', 'Y', 'PdId', 'Month', 'Day'],
      dtype='object')

In [35]: CountCategory = pse['Category'].value_counts()


print(CountCategory)

LARCENY/THEFT 40409
OTHER OFFENSES 19599
NON-CRIMINAL 17866
ASSAULT 13577
VANDALISM 8589
...
SEX OFFENSES, NON FORCIBLE 40
BAD CHECKS 34
GAMBLING 20
PORNOGRAPHY/OBSCENE MAT 4
TREA 3
Name: Category, Length: 39, dtype: int64

In [36]: pse['Category'].value_counts(ascending=True)

Out[36]: TREA 3
PORNOGRAPHY/OBSCENE MAT 4
GAMBLING 20
BAD CHECKS 34
SEX OFFENSES, NON FORCIBLE 40
...
VANDALISM 8589
ASSAULT 13577
NON-CRIMINAL 17866
OTHER OFFENSES 19599
LARCENY/THEFT 40409
Name: Category, Length: 39, dtype: int64

In [37]: print(pse['Category'].value_counts(ascending=True))

TREA 3
PORNOGRAPHY/OBSCENE MAT 4
GAMBLING 20
BAD CHECKS 34
SEX OFFENSES, NON FORCIBLE 40
...
VANDALISM 8589
ASSAULT 13577
NON-CRIMINAL 17866
OTHER OFFENSES 19599
LARCENY/THEFT 40409
Name: Category, Length: 39, dtype: int64

In [47]: most_case = pse['PdDistrict'].value_counts(ascending=True)[-1:]


print(most_case)

SOUTHERN 28445
Name: PdDistrict, dtype: int64
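Sorting ascending and slicing the last element works, but the district with the most incidents can also be read directly from the counts. The sketch below uses a small made-up Series (`districts`) rather than the real data, so the count it prints is illustrative only.

```python
import pandas as pd

# Made-up district labels standing in for pse['PdDistrict'].
districts = pd.Series(
    ["SOUTHERN", "MISSION", "SOUTHERN", "PARK", "MISSION", "SOUTHERN"],
    name="PdDistrict",
)

counts = districts.value_counts()

# idxmax() returns the label with the highest count; max() returns the count itself.
top_district = counts.idxmax()
top_count = counts.max()

print(top_district, top_count)
```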

In [45]: AugustCrimes = pse[pse['Month'] == 8]
print('Incidents in the month of August: {}'.format(len(AugustCrimes)))

Incidents in the month of August: 12428

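In [48]: AugustCrimes = pse[pse['Month'] == 8]
# Note: the filter below is applied to the full data frame `pse`, so the count
# covers every month of 2016. To restrict it to August, filter `AugustCrimes` instead.
BurglaryCrimes = pse[pse['Category'] == 'BURGLARY']
print('Burglary incidents | all months: {}'.format(len(BurglaryCrimes)))

Burglary incidents | all months: 5802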
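A count restricted to both a month and a category needs both conditions applied to the same rows. A minimal sketch, using a four-row toy frame (`df`) instead of the real data, combines two boolean masks with `&`:

```python
import pandas as pd

# Toy data: three August rows, of which two are burglaries.
df = pd.DataFrame({
    "Category": ["BURGLARY", "BURGLARY", "ASSAULT", "BURGLARY"],
    "Month": [8, 3, 8, 8],
})

is_august = df["Month"] == 8
is_burglary = df["Category"] == "BURGLARY"

# Both masks must be True for a row to be kept.
august_burglaries = df[is_august & is_burglary]

print(len(august_burglaries))
```

The same pattern applied to `pse` would give the August burglary count. That number is not shown here because the notebook was not re-run.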

In [51]: Crime0704 = pse.query('Month == 7 and Day == 4')


Crime0704

Out[51]: Category Descript DayOfWeek Date Time PdDistrict Resolution Address X Y PdId Month Day

2868 OTHER OFFENSES TRAFFIC VIOLATION Monday 07/04/2016 12:00:00 AM 12:54 PARK ARREST, BOOKED PARNASSUS AV / 4TH AV -122.460843 37.762628 16054065965015 7 4

2869 DRUG/NARCOTIC POSSESSION OF MARIJUANA Monday 07/04/2016 12:00:00 AM 12:30 TENDERLOIN ARREST, BOOKED HYDE ST / GOLDEN GATE AV -122.415508 37.781654 16054066516010 7 4

2870 DRUG/NARCOTIC POSSESSION OF NARCOTICS PARAPHERNALIA Monday 07/04/2016 12:00:00 AM 12:30 TENDERLOIN ARREST, BOOKED HYDE ST / GOLDEN GATE AV -122.415508 37.781654 16054066516710 7 4

2871 OTHER OFFENSES RESISTING ARREST Monday 07/04/2016 12:00:00 AM 12:30 TENDERLOIN ARREST, BOOKED HYDE ST / GOLDEN GATE AV -122.415508 37.781654 16054066527170 7 4

2872 WARRANTS ENROUTE TO PAROLE OFFICER Monday 07/04/2016 12:00:00 AM 13:06 BAYVIEW ARREST, BOOKED 0 Block of DAKOTA ST -122.395513 37.753618 16054067162030 7 4

... ... ... ... ... ... ... ... ... ... ... ... ... ...

150314 OTHER OFFENSES VIOLATION OF MUNICIPAL CODE Monday 07/04/2016 12:00:00 AM 17:07 CENTRAL ARREST, BOOKED 800 Block of THE EMBARCADERONORTH ST -122.400387 37.802574 16054121530200 7 4

150315 NON-CRIMINAL DEATH REPORT, CAUSE UNKNOWN Monday 07/04/2016 12:00:00 AM 16:08 TENDERLOIN NONE 300 Block of ELLIS ST -122.411988 37.785023 16054122161030 7 4

150316 DRUNKENNESS UNDER INFLUENCE OF ALCOHOL IN A PUBLIC PLACE Monday 07/04/2016 12:00:00 AM 20:18 SOUTHERN NONE 800 Block of BRYANT ST -122.403405 37.775421 16054172619090 7 4

150317 MISSING PERSON MISSING ADULT Monday 07/04/2016 12:00:00 AM 20:00 MISSION NONE 1100 Block of GUERRERO ST -122.422802 37.752376 16054173274000 7 4

150475 NON-CRIMINAL LOST PROPERTY Monday 07/04/2016 12:00:00 AM 22:20 CENTRAL NONE BAY ST / VANNESS AV -122.425111 37.804146 16614800171000 7 4

352 rows × 13 columns

In [52]: pse.columns

Out[52]: Index(['Category', 'Descript', 'DayOfWeek', 'Date', 'Time', 'PdDistrict',
       'Resolution', 'Address', 'X', 'Y', 'PdId', 'Month', 'Day'],
      dtype='object')

In [54]: plt.plot(pse['X'],pse['Y'], 'ro')


plt.show()

In [69]: pd_districts = pd.unique(pse['PdDistrict'])


pd_districts_levels = dict(zip(pd_districts, range(len(pd_districts))))
print(pd_districts_levels)

{'SOUTHERN': 0, 'BAYVIEW': 1, 'TENDERLOIN': 2, 'MISSION': 3, 'NORTHERN': 4, 'TARAVAL': 5, 'INGLESIDE': 6, 'CENTRAL': 7, 'RICHMOND': 8, 'PARK': 9, nan: 10}

In [70]: pse['PdDistrictCode'] = pse['PdDistrict'].apply(lambda row: pd_districts_levels[row])
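The dictionary lookup above relies on `nan` being usable as a dictionary key, which is fragile because `nan` does not compare equal to itself. As a hedged alternative, `pd.factorize` assigns integer codes in order of first appearance and marks missing values with -1. Note that this differs from the mapping above, where `nan` received its own code. The Series below is a made-up stand-in for `pse['PdDistrict']`.

```python
import numpy as np
import pandas as pd

# Made-up district column containing one missing value.
districts = pd.Series(["SOUTHERN", "BAYVIEW", np.nan, "SOUTHERN", "MISSION"])

# codes: one integer per row; uniques: the labels in order of first appearance.
codes, uniques = pd.factorize(districts)

print(codes.tolist())
print(list(uniques))
```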

In [72]: plt.scatter(pse['X'], pse['Y'], c=pse['PdDistrictCode'])


plt.show()

In [74]: from matplotlib import colors


districts = pd.unique(pse['PdDistrict'])
print(list(colors.cnames.values())[0:len(districts)])

['#F0F8FF', '#FAEBD7', '#00FFFF', '#7FFFD4', '#F0FFFF', '#F5F5DC', '#FFE4C4', '#000000', '#FFEBCD', '#0000FF', '#8A2BE2']

In [81]: color_dict = dict(zip(districts, list(colors.cnames.values())[0:-1:len(districts)]))
print(color_dict)

{'SOUTHERN': '#F0F8FF', 'BAYVIEW': '#A52A2A', 'TENDERLOIN': '#008B8B', 'MISSION': '#E9967A', 'NORTHERN': '#1E90FF', 'TARAVAL': '#ADFF2F', 'INGLESIDE': '#FFFACD', 'CENTRAL': '#87CEFA', 'RICHMOND': '#0000CD', 'PARK': '#FFE4B5', nan: '#AFEEEE'}
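Because the color dictionary ends up with a `nan` key, later lookups depend on how missing values happen to be represented. One possible way around this, sketched below, is to replace missing districts with a placeholder label before building the mapping. The label `"UNKNOWN"`, the three-color `palette`, and the toy Series are all assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Made-up district column; the missing value is replaced by a placeholder label.
districts = pd.Series(["SOUTHERN", "BAYVIEW", np.nan, "SOUTHERN"]).fillna("UNKNOWN")

# Hypothetical palette: one hex color per distinct district label.
palette = ["#1f77b4", "#ff7f0e", "#2ca02c"]
color_dict = dict(zip(pd.unique(districts), palette))

# Every row now maps to a color, including the formerly missing one.
marker_colors = districts.map(color_dict)

print(color_dict)
```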

In [79]: map_osm = folium.Map(location=[pse['Y'].mean(), pse['X'].mean()],
                     zoom_start=12)
plotEvery = 50  # plot only every 50th incident
obs = list(zip(pse['Y'], pse['X'], pse['PdDistrict']))
for el in obs[0:-1:plotEvery]:
    # el = (latitude, longitude, district); color each marker by its district
    folium.CircleMarker(el[0:2], color=color_dict[el[2]],
                        fill_color=color_dict[el[2]], radius=10).add_to(map_osm)

In [80]: map_osm

Out[80]: [Interactive folium (Leaflet) map; map data © OpenStreetMap, under ODbL.]

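Part 2: Load the Data
How many variables are contained in the SF data frame (ignore the Index)? 13

Part 4: Analyze the Data
What type of crime was committed the most? Larceny/Theft

Which PdDistrict had the most incidents of reported crime? Provide the Python command(s) used to support your answer. Southern, with 28445 incidents, from pse['PdDistrict'].value_counts(ascending=True)[-1:]

How many crime incidents were there for the month of August? 12428

How many burglaries were reported in the month of August? The notebook reports 5802, but that filter was applied to the full data frame, so the figure counts burglaries across all of 2016 rather than August alone.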

Conclusion: This skills test let me apply my data processing skills using Python in a Jupyter notebook. Using the provided PDF and CSV files, I was able to complete the task. I began by importing the Python packages and loading the San Francisco data. I then examined the data set of San Francisco crime incidents and answered the questions in the PDF file. Finally, I plotted the incidents by their coordinates and used the folium package to draw them on a map of San Francisco.
