Basic Web Scraping Without Programming
Mark Walker of The New York Times
What Is Web Scraping?
(You’ve been doing it all along)
Web scraping, also known as web data extraction, is the process of retrieving, or “scraping,” data from a website.
The Formula
Breaking Down the Formula
URL
URL - The URL of the page to examine, including the protocol (e.g. http://).
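For readers who want to see the protocol requirement outside a spreadsheet, here is a minimal Python sketch. The helper name has_protocol and the example.com addresses are made up for illustration; the check only looks for an http or https prefix and says nothing about how the spreadsheet itself validates a URL.

```python
from urllib.parse import urlparse


def has_protocol(url):
    """Return True if the URL starts with an http:// or https:// protocol.

    Hypothetical helper for illustration only.
    """
    parts = urlparse(url)
    # A usable page URL needs both a scheme (the protocol) and a host.
    return parts.scheme in ("http", "https") and bool(parts.netloc)


with_protocol = has_protocol("http://example.com/page")
without_protocol = has_protocol("example.com/page")
print(with_protocol, without_protocol)
```

A URL pasted without its protocol fails this check, which is the kind of input the slide warns about.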
The indices for lists and tables are maintained separately, so there may be both a list and a table with index 0 if both types of elements exist on the HTML page.
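The idea of separate numbering can be sketched in Python with the standard library's HTML parser. This is only an illustration under stated assumptions: the sample page, the id values, and the ElementIndexer class are invented, and counting starts at 0 purely to mirror the wording above. The spreadsheet's own numbering should be checked against its documentation.

```python
from html.parser import HTMLParser

# Made-up page with two lists and two tables, interleaved.
SAMPLE_HTML = """
<ul id="menu"><li>Home</li></ul>
<table id="scores"><tr><td>1</td></tr></table>
<ol id="steps"><li>Open</li></ol>
<table id="totals"><tr><td>2</td></tr></table>
"""


class ElementIndexer(HTMLParser):
    """Collect tables and lists in document order, each in its own sequence."""

    def __init__(self):
        super().__init__()
        self.tables = []  # ids of <table> elements
        self.lists = []   # ids of <ul> and <ol> elements

    def handle_starttag(self, tag, attrs):
        element_id = dict(attrs).get("id", "?")
        if tag == "table":
            self.tables.append(element_id)
        elif tag in ("ul", "ol"):
            self.lists.append(element_id)


indexer = ElementIndexer()
indexer.feed(SAMPLE_HTML)

# Each kind of element gets its own counter, so "index 0" exists twice.
for i, name in enumerate(indexer.tables):
    print(f"table {i}: {name}")
for i, name in enumerate(indexer.lists):
    print(f"list {i}: {name}")
```

In this sample, the table with index 0 is "scores" and the list with index 0 is "menu", even though the list appears first on the page.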
Other Formulas
IMPORTXML: Imports data from any of various structured data types, including XML, HTML, CSV, TSV, and RSS and Atom XML feeds.
IMPORTDATA: Imports data at a given URL in .csv (comma-separated value) or .tsv (tab-separated value) format.
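The difference between the two file formats is only the character that separates the values. A minimal Python sketch, using made-up sample text instead of a real URL and a hypothetical helper named parse_delimited, shows that the same rows come out either way:

```python
import csv
import io


def parse_delimited(text, delimiter=","):
    """Return the rows of CSV or TSV text as lists of strings.

    Hypothetical helper for illustration only.
    """
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


# The same two rows, once comma-separated and once tab-separated.
CSV_TEXT = "name,city\nAda,London\n"
TSV_TEXT = "name\tcity\nAda\tLondon\n"

csv_rows = parse_delimited(CSV_TEXT)
tsv_rows = parse_delimited(TSV_TEXT, delimiter="\t")
print(csv_rows)
print(tsv_rows)
```

This does not reproduce how IMPORTDATA fetches or detects a file; it only illustrates what "comma-separated" and "tab-separated" mean for the rows that end up in the sheet.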