GETTING DATA
AGENDA
I. GETTING DATA
II. REGEX / REQUESTS
III. API / WRAPPERS
INTRO TO DATA SCIENCE
I. GETTING DATA
II. REGEX / REQUESTS
REGular EXpressions
are how we capture patterns in text
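As a minimal sketch of capturing a pattern in text, the example below pulls link targets out of a small HTML string with Python's built-in `re` module. The HTML string and the `example.com` URLs are made up for illustration; in practice the text would usually come from a page fetched with something like `requests.get(url).text`.

```python
import re

# Made-up HTML standing in for a downloaded page
html = '<a href="http://example.com/a">A</a> <a href="http://example.com/b">B</a>'

# Capture whatever sits between the quotes of each href attribute.
# This simple pattern assumes double-quoted attributes; real pages can be messier.
links = re.findall(r'href="([^"]+)"', html)

print(links)
```

The parentheses form a capture group, so `re.findall` returns only the URL part of each match rather than the whole `href="..."` text.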
BEAUTIFULSOUP
is a Python-based HTML parser.
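A short sketch of what parsing looks like, assuming the `bs4` package (BeautifulSoup 4) is installed. The HTML snippet is invented for illustration, and `html.parser` is Python's built-in parser backend, chosen here only so nothing else needs installing.

```python
from bs4 import BeautifulSoup

# Invented HTML standing in for a downloaded page
html = """
<html><body>
  <h1>Getting Data</h1>
  <a href="http://example.com/a">First</a>
  <a href="http://example.com/b">Second</a>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# Navigate by tag name instead of writing a pattern by hand
title = soup.h1.get_text()
links = [a["href"] for a in soup.find_all("a")]

print(title)
print(links)
```

Compared with a regular expression, the parser understands the document structure, so it tends to cope better with attributes in a different order or extra whitespace.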
WEB CRAWLERS
We just built one!
But be careful…
Hacking OKCupid: http://www.wired.com/2014/01/how-to-hack-okcupid/all/
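One common way to be careful is to check a site's `robots.txt` before crawling it. The sketch below uses the standard library's `urllib.robotparser` on a made-up rules file; the crawler name `my-crawler` and the `example.com` URLs are assumptions for illustration, and a real crawler would load the rules from the site itself.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt content; a real crawler would fetch this from the site
robots_txt = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# Ask whether our (hypothetical) crawler may fetch each URL
allowed = rp.can_fetch("my-crawler", "http://example.com/public/page.html")
blocked = rp.can_fetch("my-crawler", "http://example.com/private/page.html")

# Seconds the site asks crawlers to wait between requests, if stated
delay = rp.crawl_delay("my-crawler")

print(allowed, blocked, delay)
```

Respecting the stated delay (for example with `time.sleep(delay)` between requests) is a simple way to avoid overloading a site.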
III. API / WRAPPERS
API (n):
Application Programming Interface
Examples of APIs:
http://www.pythonforbeginners.com/api/list-of-python-apis
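To make the idea of a wrapper concrete, here is a minimal sketch of a Python class that hides URL building and JSON parsing behind a method call. Everything named here is hypothetical: `ExampleClient`, the `https://api.example.com` base URL, the `search` endpoint, and its `q` and `limit` parameters do not belong to any real service. The `fetch` argument is injectable so the sketch can be tried without a network connection.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


class ExampleClient:
    """Hypothetical wrapper around an imaginary JSON API (not a real service)."""

    def __init__(self, base_url="https://api.example.com", fetch=None):
        self.base_url = base_url
        # Use the supplied fetch function if given, else a real HTTP GET
        self.fetch = fetch or self._http_get

    @staticmethod
    def _http_get(url):
        # Plain standard-library HTTP GET returning the response body as text
        with urlopen(url) as resp:
            return resp.read().decode("utf-8")

    def build_url(self, endpoint, **params):
        # Sort parameters so the generated URL is predictable
        query = urlencode(sorted(params.items()))
        return f"{self.base_url}/{endpoint}?{query}"

    def search(self, term, limit=10):
        # Callers pass Python arguments; the wrapper handles URL and JSON details
        url = self.build_url("search", q=term, limit=limit)
        return json.loads(self.fetch(url))


# Offline usage: a canned response stands in for the real HTTP call
def fake_fetch(url):
    return json.dumps({"url": url, "results": ["a", "b"]})


client = ExampleClient(fetch=fake_fetch)
data = client.search("data science", limit=2)
print(data["url"])
print(data["results"])
```

The point of the wrapper is that calling code never touches query strings or raw JSON; it calls `client.search(...)` and gets back ordinary Python data.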
Conclusion
Data is all over the web, but we must be
polite and conscious of what data is
available to us.