
Some information about Scribd documents:

There are two types of documents on Scribd:

- Documents made up of a collection of images, and
- Actual documents where the text can be selected, copied, etc.

This script takes a different approach to each of them:

- Documents consisting of a collection of images are straightforward: this script
  simply downloads the individual images, which can be combined into a .pdf by
  passing the --pdf option to the tool. Simple.
- Actual documents where the text can be selected are harder to tackle. If we feed
  such a document to this tool, only the text present in the document will be
  downloaded. Scribd seems to use JavaScript to somehow combine text and images. So
  far, I haven't been able to combine them with Python in a way that looks like the
  original.
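
For the image-based case described above, the download step could look roughly like the sketch below. This is only an illustration, not the script's actual code: the names `fetch_bytes` and `download_images`, the `.jpg` extension, and the zero-padded file naming are all assumptions, and how the page image URLs are discovered is not covered here.

```python
import os
import urllib.request


def fetch_bytes(url):
    """Return the raw bytes at `url` (plain HTTP GET, no retries)."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def download_images(image_urls, out_dir, fetch=fetch_bytes):
    """Save one image per page into `out_dir` and return the saved paths.

    Hypothetical helper: assumes `image_urls` is already in page order
    and that every page image is a JPEG.
    """
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for page, url in enumerate(image_urls, start=1):
        # Zero-padded names keep the pages in order when sorted by name.
        path = os.path.join(out_dir, f"{page:04d}.jpg")
        with open(path, "wb") as fh:
            fh.write(fetch(url))
        saved.append(path)
    return saved
```

Because the files are named by page number, a later step that combines them into a .pdf can simply process them in sorted order. The `fetch` parameter is there only so the download logic can be exercised without network access.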
