
AiMunshi

The Document Extractor Tool
Date: 29/10/2019


• Introduction & Features
• Solution Approach
• Technical Flow
• Technology Stack
• Reporting Analytics
• Case Study
Introduction & Features

• A digital reader for key financial documents such as invoices, credit notes, and statements.
• The tool processes the documents in PDF form, then extracts and stores the required information in the desired format.
• The whole process is automated using a Robotic Process Automation (RPA) tool.
The Tool
• Increases the efficiency of invoice processing and reduces the errors of manual processing
• Extracts required fields and updates related systems
• Creates logical confidence levels to audit processed invoices
• Provides detailed analytics and reports

Automatic Info Processing from Email
Financial documents (invoices and credit notes) that are sent via email are accessed and stored in the desired location in the file system using a Robotic Process Automation tool (UiPath).

Required Info Extraction
The tool processes the invoice/credit note PDFs and extracts the required fields and line-item details as identified by the accounts/finance department.

Database Storage
The tool stores the extracted information in the database of choice. The major fields are stored as individual columns, whilst the line items are stored in JSON fields.

Output Format of Choice
The user can generate the extracted information in any of the desired formats (JSON, CSV, PDF, etc.) and store it in the output file system at the desired path.

Establishing Processing Confidence Levels
Based on the results of the document extraction, the processed files are categorized into three confidence levels: Low, Medium, and High. The details are given in the appendix.

End-to-End Automation
The entire process, from processing email attachments through file storage and extraction to storing the extracted information, is automated and orchestrated by the Robotic Process Automation tool UiPath.
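The database storage layout described above (major fields as individual columns, line items in a JSON field) could look roughly like the following. This is a minimal sketch only: SQLite, the `invoices` table, its column names, and the helper functions are assumptions made for illustration, since the slides say only that the database is "of choice".

```python
import json
import sqlite3

def store_invoice(conn, record):
    """Store major fields as columns and line items as a JSON string.

    Table and column names are hypothetical, not taken from the tool.
    """
    conn.execute(
        """CREATE TABLE IF NOT EXISTS invoices (
               invoice_number TEXT PRIMARY KEY,
               invoice_date   TEXT,
               vendor_code    TEXT,
               total          REAL,
               line_items     TEXT  -- JSON-encoded list of line items
           )"""
    )
    conn.execute(
        "INSERT INTO invoices VALUES (?, ?, ?, ?, ?)",
        (
            record["invoice_number"],
            record["invoice_date"],
            record["vendor_code"],
            record["total"],
            json.dumps(record["line_items"]),
        ),
    )
    conn.commit()

def load_line_items(conn, invoice_number):
    """Read the JSON field back into a Python list (empty if not found)."""
    row = conn.execute(
        "SELECT line_items FROM invoices WHERE invoice_number = ?",
        (invoice_number,),
    ).fetchone()
    return json.loads(row[0]) if row else []
```

Keeping line items in one JSON field avoids a second table when the number and shape of line items varies by vendor, at the cost of less convenient querying on individual items.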
Solution Approach

Diagram labels: Financial PDFs, Deep Learning, JSON, Required Values, Database

1. Reading the PDF invoice
   I. We receive the PDFs from either API calls or email.
   II. We perform data preprocessing on these PDFs and prepare them for ingestion by our deep learning model.

2. Predict & create JSON file
   I. Using a deep-learning-based computer vision model, identify and learn the invoice pattern.
   II. Extract the pattern content and store it in JSON format.

3. Post required field values
   I. We also provide invoice data with confidence level and error log in the database.
Technical Flow

Diagram labels:
• Input
• Model: extracts required fields and line items
• Output: folder structure, JSON, database
• Processed files
• Valid data
• Invalid data: when any fields and line items are missing
• Email notification to admin
• Invoice data
• Error log
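One possible reading of the flow above is that records with missing fields or line items are treated as invalid, written to the error log, and reported to the admin, while valid records go to the processed output. The sketch below illustrates that reading only. The field names, the `RoutingResult` container, and the `route` function are hypothetical, and the notification is represented as a plain message string rather than an actual email.

```python
from dataclasses import dataclass, field

# Assumed required field names, chosen for illustration.
REQUIRED_FIELDS = ("invoice_number", "invoice_date", "total")

@dataclass
class RoutingResult:
    processed: list = field(default_factory=list)      # valid records
    error_log: list = field(default_factory=list)      # one entry per invalid record
    notifications: list = field(default_factory=list)  # messages for the admin

def route(records, required=REQUIRED_FIELDS):
    """Split extracted records into valid and invalid sets."""
    result = RoutingResult()
    for rec in records:
        # A field counts as missing if it is absent, None, or an empty string.
        missing = [name for name in required if rec.get(name) in (None, "")]
        if not rec.get("line_items"):
            missing.append("line_items")
        if missing:
            result.error_log.append({"file": rec.get("file"), "missing": missing})
            result.notifications.append(
                f"Extraction incomplete for {rec.get('file')}: "
                f"missing {', '.join(missing)}"
            )
        else:
            result.processed.append(rec)
    return result
```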
Technology Stack

• Input
• Deep Learning Environment
• Data Storage
• User Consumption
Reporting Analytics

• Processed Doc Analytics
• Confidence Level Matrix
• Tabular Reports
• User Admin Console
• Error Log Analytics
• Customizable Filters


Case Study

Business Problem:
• The largest pharma retail company in Australia, with more than 800 stores
• More than 160,000 invoices and credit notes per month, in different formats from different vendors
• Users read the invoices/credit notes manually and entered the data into ERP systems
• Human errors, typos, etc. crept in while the data was being entered into the ERP
• The process was tedious and time-consuming

Our Solution:
• Automated the processing of invoices/credit notes from emails and file systems
• Built a workflow to automate file reading, extraction, and data validation
• Output is provided in XLSX/JSON format
• Data is automatically inserted into the ERP for consumption by downstream systems
• Improved ROI by ~64%
• Data accuracy went up to ~99%
• Added value by predicting missing data using data imputation techniques
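The slides do not say which imputation techniques were used. As a minimal illustration of the idea, the sketch below handles only the simplest deterministic case: it assumes that Total = Sub Total + GST and fills in one missing amount from the other two. The field names and the `impute_amounts` function are hypothetical.

```python
def impute_amounts(fields):
    """Fill in one missing amount, assuming total = sub_total + gst.

    Returns a copy of the input. If zero amounts, or more than one,
    are missing, the record is returned unchanged.
    """
    out = dict(fields)
    keys = ("sub_total", "gst", "total")
    missing = [k for k in keys if out.get(k) is None]
    if len(missing) != 1:
        return out
    if missing[0] == "total":
        out["total"] = round(out["sub_total"] + out["gst"], 2)
    elif missing[0] == "gst":
        out["gst"] = round(out["total"] - out["sub_total"], 2)
    else:
        out["sub_total"] = round(out["total"] - out["gst"], 2)
    return out
```

A production system would more likely combine such arithmetic checks with statistical or model-based estimates, and would flag imputed values so they can be audited.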
CONTACT : aiml-sales@aibridgeml.com

FB : https://www.facebook.com/AIBridgeML

TWITTER : https://twitter.com/AIBridgeML

WEBSITE: http://aibridgeml.com

PHONE NUMBER : +91 40 4640 0400

THANK YOU
APPENDIX

LOW CONFIDENCE
If any of the following fields cannot be extracted:
• Branch ID
• Vendor Code
OR
If any two or more of the following fields cannot be extracted:
• Invoice Number
• Invoice Date
• Total
• Sub Total
• GST

MEDIUM CONFIDENCE
If only one of the following fields cannot be extracted:
• Invoice Number
• Invoice Date
• Total
• Sub Total
• GST
OR
If all required fields are extracted but line items cannot be extracted.

HIGH CONFIDENCE
If all required fields and all line items are extracted.
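The confidence rules in this appendix can be expressed as a small function. This is a sketch under stated assumptions, not the tool's actual implementation: the snake_case field names and the `confidence_level` function are hypothetical, a field counts as "not extracted" when it is absent, `None`, or blank, and an empty line-item list is taken to mean the line items could not be extracted.

```python
# Fields whose absence alone forces LOW confidence.
CRITICAL_FIELDS = ("branch_id", "vendor_code")
# Fields where one miss gives MEDIUM and two or more give LOW.
CORE_FIELDS = ("invoice_number", "invoice_date", "total", "sub_total", "gst")

def is_missing(value):
    """Treat None and blank strings as 'could not be extracted'."""
    return value is None or (isinstance(value, str) and not value.strip())

def confidence_level(fields, line_items):
    """Classify an extraction result as LOW, MEDIUM, or HIGH."""
    missing_critical = [f for f in CRITICAL_FIELDS if is_missing(fields.get(f))]
    missing_core = [f for f in CORE_FIELDS if is_missing(fields.get(f))]

    if missing_critical or len(missing_core) >= 2:
        return "LOW"
    if len(missing_core) == 1 or not line_items:
        return "MEDIUM"
    return "HIGH"
```

For example, a record with every field present but no line items would be classified as MEDIUM, while a record missing only the vendor code would be LOW regardless of the other fields.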
