
Updated: 4/15/2022

ADOBE PDF EXTRACT API TUTORIAL


Requirements:
Install Python (Windows Store Version is easier…)
Install MS Visual Studio Code
Get credentials for the Adobe PDF Extract API from HERE

Pre-Setup
1. MAKING SURE PYTHON IS IN A PROPER PATH
Type "run" into the search box on the Windows taskbar and open the Run app.
Copy and paste sysdm.cpl into the Run box and press Enter.
Go to Environment Variables and make sure the path to your Python Scripts folder is listed.

Go to the Advanced tab.


Open environment variables.

1. Look for Python in the Path variable under User variables


2. Delete it if it is in System variables
Enter something like this in the Path variable window:
C:\Users\UAL-Laptop\AppData\Local\Programs\Python\Python310\Scripts
To get there, you may have to type %AppData% in the search bar, go up one level, then open
Local (not Roaming) -> Programs -> Python -> Python310 -> Scripts
A command-line way to do this (if you are more familiar with it), substituting your own Scripts folder:
setx PATH "%PATH%;C:\Python310\Scripts"
Then restart cmd, run echo %PATH%, and look for Python in the output.
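
If reading the echo %PATH% output by eye gets tedious, a few lines of Python can do the looking for you. This is only a minimal sketch: the helper names path_entries and python_dirs_on_path are made up for illustration and are not part of Joe's script or the Adobe SDK.

```python
import os
import shutil
import sys


def path_entries(path_value, sep=os.pathsep):
    """Split a PATH-style string into cleaned, non-empty entries."""
    return [p.strip().strip('"') for p in path_value.split(sep) if p.strip()]


def python_dirs_on_path(path_value, sep=os.pathsep):
    """Return the PATH entries that mention Python (case-insensitive)."""
    return [p for p in path_entries(path_value, sep) if "python" in p.lower()]


if __name__ == "__main__":
    # Which interpreter is running, and which pip (if any) the shell would find.
    print("Interpreter in use:", sys.executable)
    print("pip found at:", shutil.which("pip"))
    for entry in python_dirs_on_path(os.environ.get("PATH", "")):
        print("PATH entry:", entry)
```

If no PATH entry containing "Python" is printed, or pip is reported as None, go back over the environment variable steps before continuing.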

2. Getting your Adobe Credential Folder

After clicking the link at the beginning and logging into your Adobe account, you should be presented
with the following (above).
- Input a name
- Click Python extract
- Agree to terms

Here’s where I have placed my credentials folder path-wise.


3. Downloading Joe’s (QuantAQ) modified code!
Here is a copy of my python test folder

Download the python code to your folder!

Notice how I have placed the code OUTSIDE of the adobe folder, for easy access.

You will also need to download the requirements.txt file, either from my folder or from GitHub.
4. Opening the script in Visual Studio Code
In MS Visual Studio Code (once installed), open the pdf_extract.py file.

Click Open File and find pdf_extract.py.

5. Understanding the critical parts of the script

This is where you will put your own path information, i.e. where you are keeping the
Adobe folder.

In the Adobe folder, place the PDFs you want to work on in the resources folder. Your extracted
data will appear in the output folder.
In listfiles, list the names of the files you want to use, in quotes and separated by commas
(name them something easy). The list output name is the name of the folder that will hold your extracted data.
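
To make the layout concrete, here is a rough sketch of what this part of a script could look like. Only listfiles and the resources/output folder names come from the tutorial; adobe_folder, build_jobs, the example path, the example file names, and the idea of one output subfolder per PDF are all assumptions for illustration, so the real pdf_extract.py may be organized differently.

```python
from pathlib import Path

# Hypothetical values: replace with your own folder and file names.
adobe_folder = Path(r"C:\Users\you\adobe-extract")  # assumed location
listfiles = ["report1.pdf", "report2.pdf"]  # assumed to include the .pdf extension


def build_jobs(base, names):
    """Pair each PDF in resources/ with an output folder named after it."""
    jobs = []
    for name in names:
        source = base / "resources" / name       # where the input PDF lives
        target = base / "output" / Path(name).stem  # assumed per-file output folder
        jobs.append((source, target))
    return jobs


if __name__ == "__main__":
    for source, target in build_jobs(adobe_folder, listfiles):
        print(source, "->", target)
```

Building every path from one base folder means you only have to edit a single line when you move the Adobe folder.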

Preparing Python and Script

In Visual Studio Code, change directories to the folder containing the script.
Use the terminal and type:
cd .. to move up one folder
and cd Path.name.here to move into a folder

Once it shows something like…

Run the following command:


python -m pip install pdfservices-sdk
(wait for it to finish)

Run the following command:


python -m pip install -r requirements.txt
(again wait for it)
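
If you want to double-check that the install actually landed in the Python you are using, something like the following should work. It is a small sketch using the standard library's importlib.metadata (Python 3.8+); installed_version is a made-up helper name, and pdfservices-sdk is the package name from the pip command above.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("pdfservices-sdk")
    print("pdfservices-sdk:", version or "NOT INSTALLED")
```

If it prints NOT INSTALLED even though pip reported success, the pip you ran probably belongs to a different Python than the one Visual Studio Code is using.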

You should be ready to run the code now!!


CHECK THAT:
The names of your PDFs are spelled correctly in the code
All paths in the code match your directories
The Python path is good (no failures from the above commands)
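
The first two checks can also be done by a short script instead of by eye. This is a sketch under assumptions: missing_pdfs is a hypothetical helper, and the folder path and file names in the usage part are placeholders for your own.

```python
from pathlib import Path


def missing_pdfs(resources_dir, names):
    """Return the names from `names` that are not files in resources_dir."""
    folder = Path(resources_dir)
    return [n for n in names if not (folder / n).is_file()]


if __name__ == "__main__":
    # Placeholder values: use your own resources folder and file list.
    resources = Path(r"C:\Users\you\adobe-extract\resources")
    wanted = ["report1.pdf", "report2.pdf"]
    problems = missing_pdfs(resources, wanted)
    if problems:
        print("Not found (check spelling and paths):", problems)
    else:
        print("All PDFs found.")
```

A misspelled file name and a wrong folder path both show up the same way here, as a name in the "Not found" list.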

Use the Run menu at the top of Visual Studio Code and choose Run Without Debugging.

(A successful run will show you this!)

Go to the output folder in your Adobe folder and get your data. Sigh in relief.
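
Once the data is there, you may want to peek at it from Python. The sketch below assumes the result is either a .zip containing structuredData.json or that JSON file already unzipped, and that the JSON has an "elements" list whose entries may carry a "Text" field. Joe's modified script may lay its output out differently, so treat the file names and both helper names here as assumptions.

```python
import json
import zipfile
from pathlib import Path


def load_structured_data(path):
    """Load structuredData.json from a result .zip or from the .json itself."""
    path = Path(path)
    if path.suffix.lower() == ".zip":
        with zipfile.ZipFile(path) as archive:
            with archive.open("structuredData.json") as handle:
                return json.load(handle)
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def text_elements(structured):
    """Collect the non-empty "Text" values from the "elements" list."""
    return [
        el["Text"].strip()
        for el in structured.get("elements", [])
        if el.get("Text", "").strip()
    ]


if __name__ == "__main__":
    # Placeholder path: point this at a file in your own output folder.
    data = load_structured_data(r"C:\Users\you\adobe-extract\output\report1.zip")
    for line in text_elements(data)[:10]:
        print(line)
```

Elements without a "Text" field (figures, for example) are simply skipped, so the result is a plain list of text snippets in document order.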
