The Document Separation tab is used to manage how multi-page images are separated into single
documents, or how loose pages are grouped into multi-page documents. When this feature is
enabled, Kofax Transformation Modules - Server performs document separation before extraction.
The following options define how the server handles unclassified pages.
No document separation. Select this option to disable document separation for this
project. This option is selected by default.
Standard Document Separation. Select this option to use the class properties and
project settings to determine how documents are separated. This option is cleared by
default.
Duplex scan mode (front and back side are never split). This option is disabled unless
Standard Document Separation is selected.
Select this option if you have two-sided pages. The back side is ignored. This option is
cleared by default.
Unclassified pages should be handled as. This option is disabled unless Standard
Document Separation is selected.
This option defines how a page that could not be classified is handled during document
separation, using the following options:
Trainable Document Separation (TDS). Select this option to activate trainable document
separation (TDS) for the project. This option is cleared by default.
o Higher success rate because of the global scope. For example, a page may look like it
could be either
o option 1: the 1st page of a 3-page document, or
o option 2: the 1st page of a 2-page document.
If option 1 has the higher probability, simple local separation classifies the page as the
start of a 3-page document. However, if the 3rd page has a high probability of being the
1st page of another document, option 2 should have been chosen. A separation with global
scope considers the following pages as well and can make that choice.
o Even if the first page has a relatively low recognition probability, it can still be
correctly classified and separated with Sophisticated Schema Algorithm Classification.
o Pages can be reordered if they were mixed up during document preparation.
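The difference between local and global scope in the example above can be sketched in a few lines of Python. This is only an illustration of the idea, not the algorithm the product uses. The probability table, the fallback score for loose pages, and all function names are hypothetical.

```python
from functools import lru_cache

# Hypothetical classifier output: CANDIDATES[i][n] is the estimated probability
# that an n-page document starts at page i (0-based). The values are made up.
CANDIDATES = {
    0: {3: 0.55, 2: 0.45},
    2: {3: 0.90},
    3: {2: 0.30},
}
TOTAL_PAGES = 5
LOOSE_PAGE_PROB = 0.01  # assumed score for a page with no usable hypothesis


def options_at(candidates, i, total_pages):
    """Return the length hypotheses at page i that fit into the remaining pages."""
    fitting = {n: p for n, p in candidates.get(i, {}).items() if i + n <= total_pages}
    return fitting or {1: LOOSE_PAGE_PROB}


def local_separation(candidates, total_pages):
    """Greedy: at each document start, take the most probable length and move on."""
    score, lengths, i = 1.0, [], 0
    while i < total_pages:
        n, p = max(options_at(candidates, i, total_pages).items(), key=lambda kv: kv[1])
        score *= p
        lengths.append(n)
        i += n
    return score, tuple(lengths)


def global_separation(candidates, total_pages):
    """Global: choose the segmentation of the whole batch with the highest product."""
    @lru_cache(maxsize=None)
    def best(i):
        if i == total_pages:
            return 1.0, ()
        results = []
        for n, p in options_at(candidates, i, total_pages).items():
            tail_score, tail = best(i + n)
            results.append((p * tail_score, (n,) + tail))
        return max(results)
    return best(0)


print(local_separation(CANDIDATES, TOTAL_PAGES))
print(global_separation(CANDIDATES, TOTAL_PAGES))
```

With these made-up scores, the local approach splits the five pages into a 3-page and a 2-page document, while the global approach prefers a 2-page document followed by a 3-page document, because the 3rd page strongly resembles a first page.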
A barcode that is located within the document can also be used to separate documents.
Using a barcode would ensure that separation results are 100% accurate.
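Barcode-based separation can be pictured as follows. This is a minimal sketch that assumes the barcode appears on the first page of each document and that barcode detection has already been done elsewhere. The page dictionaries and the `barcode` key are hypothetical and are not part of any product API.

```python
def split_on_barcode(pages):
    """Start a new document at every page that carries a barcode value."""
    documents = []
    for page in pages:
        # The first page always opens a document, even without a barcode.
        if page.get("barcode") or not documents:
            documents.append([])
        documents[-1].append(page["id"])
    return documents


# Hypothetical batch: pages 1 and 4 carry a separator barcode.
batch = [
    {"id": 1, "barcode": "DOC"},
    {"id": 2, "barcode": None},
    {"id": 3, "barcode": None},
    {"id": 4, "barcode": "DOC"},
    {"id": 5, "barcode": None},
]
print(split_on_barcode(batch))
```

Because the split depends only on whether a barcode is present, the result does not rely on classification probabilities.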
For unstructured documents, Kofax provides an efficient learn-by-example approach to understand,
classify, and cluster large numbers of documents. The user only needs to provide sufficient samples
to start with. The solution can then automatically learn the layout features and content patterns.
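The learn-by-example idea can be illustrated with a very small content classifier. This sketch is not the method the product uses. It only shows how a few labeled samples can be turned into a model that classifies new text. The sample texts, class names, and function names are hypothetical.

```python
from collections import Counter


def train(samples):
    """Build per-class word counts from (class_name, text) samples."""
    model = {}
    for label, text in samples:
        model.setdefault(label, Counter()).update(text.lower().split())
    return model


def classify(model, text):
    """Return the class whose training words best match the words in the text."""
    words = text.lower().split()

    def score(label):
        counts = model[label]
        total = sum(counts.values())
        # Sum of relative word frequencies; unseen words contribute 0.
        return sum(counts[w] / total for w in words)

    return max(model, key=score)


# Hypothetical training samples supplied by the user.
samples = [
    ("invoice", "invoice number total amount due"),
    ("invoice", "invoice date total vat"),
    ("letter", "dear sir yours sincerely"),
    ("letter", "dear madam kind regards"),
]
model = train(samples)
print(classify(model, "total amount due on invoice"))
print(classify(model, "dear customer regards"))
```

A real system would also use layout features, as the text notes. This sketch looks at word content only.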
Content or layout?
OCR on demand