12/28/2016

Scrapy | A Fast and Powerful Scraping and Web Crawling Framework
Install the latest version of Scrapy

Scrapy 1.2
pip install scrapy

An open source and collaborative framework for extracting the data you need from websites.
In a fast, simple, yet extensible way.

pypi v1.3.0

wheel yes

PyPI

Conda

Source

coverage 83%

Build and run your web spiders

Terminal
pip install scrapy
cat > myspider.py <<EOF
import scrapy

class BlogSpider(scrapy.Spider):
    name = 'blogspider'
    start_urls = ['https://blog.scrapinghub.com']

    def parse(self, response):
        for title in response.css('h2.entrytitle'):
            yield {'title': title.css('a::text').extract_first()}

        next_page = response.css('div.prevpost > a::attr(href)').extract_first()
        if next_page:
            yield scrapy.Request(response.urljoin(next_page), callback=self.parse)
EOF
scrapy runspider myspider.py
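
The spider follows pagination by passing the extracted `href` through `response.urljoin` before requesting it. As a rough illustration of that resolution step, the sketch below uses the standard library's `urllib.parse.urljoin`; for a page without a `<base>` tag the result should match what the spider requests. The helper name `resolve_next_page`, the page URL, and the `href` values are assumptions made up for this example, not taken from the real blog.

```python
from urllib.parse import urljoin


def resolve_next_page(page_url, href):
    """Return the absolute URL for href found on page_url, or None if href is missing."""
    if not href:
        # Mirrors the spider's `if next_page:` guard: no link, nothing to follow.
        return None
    return urljoin(page_url, href)


# Hypothetical values for illustration only.
page_url = 'https://blog.scrapinghub.com/page/2/'
print(resolve_next_page(page_url, '/page/3/'))  # root-relative link
print(resolve_next_page(page_url, '../3/'))     # path-relative link
print(resolve_next_page(page_url, None))        # last page: no "previous posts" link
```

Both relative forms resolve to the same absolute URL, which is why the spider can yield a request without caring how the site writes its links.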

Deploy them to Scrapy Cloud
or use Scrapyd to host the spiders on your own server

Terminal
shub login
Insert your Scrapinghub API Key: <API_KEY>
# Deploy the spider to Scrapy Cloud
shub deploy
# Schedule the spider for execution
shub schedule blogspider
Spider blogspider scheduled, watch it running here:
https://app.scrapinghub.com/p/26731/job/1/8
# Retrieve the scraped data
shub items 26731/1/8
{"title": "Improved Frontera: Web Crawling at Scale with Python 3 Support"}
{"title": "How to Crawl the Web Politely with Scrapy"}
...
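
The retrieved items above are printed one JSON object per line. Assuming that output has been captured as text (for example, redirected to a file), a minimal sketch for loading it back into Python dictionaries could look like this. The helper name `parse_items` is hypothetical, and the sample string simply reuses the two titles shown above.

```python
import json


def parse_items(output):
    """Parse one JSON object per non-empty line into a list of dicts."""
    items = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        items.append(json.loads(line))
    return items


# Sample text copied from the terminal output above.
sample = '''{"title": "Improved Frontera: Web Crawling at Scale with Python 3 Support"}
{"title": "How to Crawl the Web Politely with Scrapy"}
'''

titles = [item['title'] for item in parse_items(sample)]
print(titles)
```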

Fast and powerful
write the rules to extract the data and let Scrapy do the rest

Easily extensible
extensible by design, plug new functionality easily without having to touch the core

Portable, Python
written in Python and runs on Linux, Windows, Mac and BSD
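
One way to picture "plug new functionality without touching the core" is an item pipeline: a plain Python class with a `process_item(self, item, spider)` method that Scrapy calls for each scraped item. The sketch below is only an illustration. The class name `TitleCleanupPipeline` and its whitespace cleanup are assumptions chosen for this example, not part of Scrapy itself.

```python
class TitleCleanupPipeline:
    """Hypothetical item pipeline: normalizes whitespace in the 'title' field."""

    def process_item(self, item, spider):
        title = item.get('title')
        if title is not None:
            # Collapse runs of whitespace (including newlines) into single spaces.
            item['title'] = ' '.join(title.split())
        return item


# Direct call for illustration; in a real project Scrapy invokes process_item itself.
pipeline = TitleCleanupPipeline()
cleaned = pipeline.process_item(
    {'title': '  How to Crawl the Web\n Politely with Scrapy '},
    spider=None,
)
print(cleaned)
```

In a Scrapy project, a pipeline like this would be enabled through the `ITEM_PIPELINES` setting, which maps the class path to an order number, for instance `{'myproject.pipelines.TitleCleanupPipeline': 300}`, where `myproject` is a placeholder project name.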

https://scrapy.org/

Healthy community

- 17k stars, 4.7k forks and 1.3k watchers on GitHub
- 3.3k followers on Twitter
- 6.4k questions on StackOverflow
- 3k members on mailing list

Want to know more?

- Discover Scrapy at a glance
- Meet the companies using Scrapy

Star 17,564
Fork 4,836

Maintained by Scrapinghub and many other contributors
