Documents made up of a collection of images, and
Actual documents where the text can be selected, copied, etc.
This script takes a different approach to each of them:
Documents consisting of a collection of images are straightforward: this script simply downloads the individual images, which can be combined into a .pdf by passing the --pdf option to the tool.

Actual documents where the text can be selected are harder to tackle. If we feed such a document to this tool, only the text present in the document will be downloaded. Scribd seems to use JavaScript to combine text and images. So far, I haven't been able to combine them with Python in a way that makes them look like the ori