BatchPurifier 7.6
User Guide
Table of Contents
Introduction
Using BatchPurifier
JPEG 2000 Image    .jp2                      Metadata (including native metadata, XMP, and other hidden data)
MP3 Audio          .mp3                      ID3v1 tag and Lyrics3v2; ID3v2 tag (including XMP); APE tag
MP4 File           .mp4; .m4a; .m4v; .m4b    Metadata (including native metadata, XMP, and other hidden data)
* Metadata can be removed only from some of the PDF versions, which are the most commonly used. See Appendix A.
Hidden Data Types for more details.
For a detailed explanation of the hidden data and metadata types that the Digital Confidence
DataDistiller™ Engine, which powers BatchPurifier™, is able to remove, see Appendix A. Hidden
Data Types.
Using BatchPurifier
To remove hidden data from files using BatchPurifier™, follow four simple steps:
1. Select the files to be purified
2. Select the hidden data types to be removed
3. Select metadata to be preserved
4. Specify output options for the purified files
BatchPurifier™ will then inspect the files for the hidden data types that the user chose to remove,
remove them while keeping the rest of the data intact, and save the purified files according to the
output options.
If the user chose to save the purified files in an output folder instead of overwriting the
scanned files, and a file with the same name as a scanned file already exists in that folder, a
number is appended to the purified file's name.
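The renaming behavior for name collisions can be illustrated with a minimal Python sketch. The exact
naming scheme BatchPurifier™ uses is not stated in this guide; the "name (N).ext" pattern and the
helper name unique_output_path are assumptions made only for illustration.

```python
from pathlib import Path


def unique_output_path(folder: Path, file_name: str) -> Path:
    """Return a path inside `folder` that does not collide with an existing file.

    Hypothetical helper: the "name (N).ext" pattern is an assumption,
    not BatchPurifier's documented naming scheme.
    """
    candidate = folder / file_name
    stem, suffix = candidate.stem, candidate.suffix
    counter = 1
    # Keep trying higher numbers until an unused name is found.
    while candidate.exists():
        candidate = folder / f"{stem} ({counter}){suffix}"
        counter += 1
    return candidate
```

The sketch only picks a free name; it does not create the file, so a caller would write the purified
output to the returned path right away.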
In the Hidden Data Filters Selection screen, you can select the hidden data types to be removed
from the chosen files. You can save a selection as a Preset, for easy future re-selection.
Purification Report
When file purification is finished, a report is presented to the user with a list of Files
Successfully Purified in one tab. If some files couldn't be purified, they appear in the Files
Couldn't be Purified list in a second tab, along with the reason. Possible reasons are:
• Unexpected format – e.g. the file is corrupted, or the file extension doesn't match its true
type
• Unauthorized access – e.g. the file cannot be written to a particular folder due to lack of
security privileges
• Inaccessible file – e.g. the file is open by another application
• Read-only file – the file is marked as Read-only and you chose to overwrite the input files
with the output files
• Hidden file – the file is marked as Hidden and you chose to overwrite the input files with
the output files
• Encrypted file – the file is encrypted
• Unsupported PDF version – there are several versions of PDF files. BatchPurifier™ can
remove metadata only from some of the PDF versions, which are the most commonly used.
These include PDF files generated by Microsoft Office 2007-2019 and OpenOffice. PDF
files generated by some PDF writing software in the market cannot be cleaned with
BatchPurifier™. In particular, PDF files generated by the latest Adobe software, such as
Adobe Acrobat Pro, cannot be cleaned with BatchPurifier™.
• Unsupported AVI version – BatchPurifier™ does not support AVI files that use a feature
called Multipart OpenDML AVI, which is necessary for AVI files larger than 2 GB but may also
be used for smaller files
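The "Unexpected format" case, where a file extension doesn't match the file's true type, can be
illustrated with a small Python sketch that compares an extension against well-known leading byte
signatures. This is not how BatchPurifier™ detects file types, which this guide does not describe;
the table and the function name extension_matches_content are assumptions for illustration, and
only three formats are covered.

```python
from typing import Optional

# Well-known leading byte signatures for a few formats (illustrative subset).
SIGNATURES = {
    ".jpg": b"\xff\xd8\xff",
    ".png": b"\x89PNG\r\n\x1a\n",
    ".pdf": b"%PDF-",
}


def extension_matches_content(file_name: str, head: bytes) -> Optional[bool]:
    """Compare a file's extension with the first bytes of its content.

    Returns True or False for extensions in SIGNATURES,
    and None when the extension is not in the table.
    """
    lowered = file_name.lower()
    for extension, magic in SIGNATURES.items():
        if lowered.endswith(extension):
            return head.startswith(magic)
    return None
```

For example, a PDF document renamed to report.png would yield False, because its content starts
with "%PDF-" rather than the PNG signature.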
Configuring BatchPurifier
You can configure BatchPurifier by clicking the Options button in the lower-left corner. There are
two Options tabs: General and Advanced.
The General tab lets you configure BatchPurifier to use certain purification options by default.
The Advanced tab lets you include three hidden data filters for JPEG files. These hidden data
types generally do not include private information, and removing them may affect the appearance
of the image.
Digital Confidence DataDistiller™ Engine is able to remove metadata & hidden data of the
following types:
Document Properties
Applicable to Microsoft® Word, Microsoft Excel®, Microsoft PowerPoint®, OpenDocument Text, OpenDocument
Spreadsheet, OpenDocument Presentation, and OpenDocument Graphics.
Metadata that includes details such as author name, title, subject, keywords, category, status,
comments, revision number, and total editing time. Document properties may also include
user-defined custom properties, which are non-standard metadata that can be added to a document.
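To see what such properties look like in practice, the following Python sketch reads the core
properties of an Office Open XML file (.docx, .xlsx, .pptx). These files are ZIP packages, and
Microsoft Office conventionally stores the core properties in a part named docProps/core.xml. The
sketch assumes that conventional location, does not cover legacy binary formats or OpenDocument
files, and is unrelated to how DataDistiller™ Engine works internally. The function name
read_core_properties is hypothetical.

```python
import io
import xml.etree.ElementTree as ET
import zipfile


def read_core_properties(package_bytes: bytes) -> dict:
    """Return {property name: text} from docProps/core.xml, or {} if absent.

    Assumes the conventional part name used by Microsoft Office.
    """
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        if "docProps/core.xml" not in package.namelist():
            return {}
        root = ET.fromstring(package.read("docProps/core.xml"))
    # ElementTree prefixes each tag with "{namespace}"; keep only the local name.
    return {child.tag.split("}")[-1]: (child.text or "") for child in root}
```

Typical keys in the result include creator, lastModifiedBy, and revision, which correspond to the
author name and revision number mentioned above.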
Comments
Applicable to Microsoft® Word, Microsoft Excel®, Microsoft PowerPoint®, OpenDocument Text, and OpenDocument
Spreadsheet.
Comments that were added to the document. With each comment, the name of the user who added it
and the date and time in which it was added are also saved.
Tracked Changes
Applicable to Microsoft® Word, Microsoft Excel®, OpenDocument Text, and OpenDocument Spreadsheet.
Tracked changes are changes made to the document while the Track Changes option was enabled.
This includes inserted, deleted, modified, and moved text. Every change is saved with the name of
the user who made the change, as well as the date and time in which the change occurred. If the
tracked changes are not removed from the document, previous versions of the document can still be
viewed.
Hidden Text
Applicable to Microsoft® Word.
Text can be formatted as hidden so it won't be printed. Hidden text also does not appear on the
screen unless the application is specifically set to show it.
Slide Notes
Applicable to Microsoft PowerPoint® and OpenDocument Presentation.
Slide notes are notes that were added to the slides for oral presentation and are not visible in the
slides themselves.
Hidden Slides
Applicable to Microsoft PowerPoint®.
Hidden slides are slides that were marked as hidden and are not presented in the slide show.
Off-Slide Content
Applicable to Microsoft PowerPoint®.
Off-slide content is content that has been placed outside the slide area and is not presented in the
slide show.
Hidden Worksheets
Applicable to Microsoft Excel®.
Hidden worksheets are worksheets that were marked as hidden. Hidden worksheets will not appear
on screen.
Printer Settings
Applicable to Microsoft® Word and Microsoft Excel®.
Contains information about a printer or a display device, including its name.
Versions
Applicable to OpenDocument Text, OpenDocument Spreadsheet, OpenDocument Presentation, and OpenDocument
Graphics.
Several versions of the document can be saved by the user in a single file.
PDF Metadata
Applicable to PDF documents.
PDF documents typically contain document information and XMP metadata. In addition, due to
performance considerations, deleted objects are sometimes left in the file and only marked as
deleted. Although this makes them invisible when viewed in a standard PDF reader, it is still
possible to retrieve them from the file.
There are several versions of PDF files. Currently, DataDistiller™ Engine can remove hidden data
only from some of the PDF versions, which are the most commonly used. These include PDF files
generated by Microsoft Office, OpenOffice, and PDFCreator. PDF files generated by a minority of
the PDF writing software on the market today cannot be cleaned with the current version of
DataDistiller™ Engine. In particular, PDF files generated by the latest Adobe software, such as
Adobe Acrobat Pro, cannot be cleaned with the current version of DataDistiller™ Engine.
JPEG Metadata
Applicable to JPEG images.
JPEG images may contain the following types of hidden data: EXIF (Exchangeable image file
format), IPTC Information Interchange Model, XMP (Extensible Metadata Platform), comments,
and ICC Profile. JPEG may contain additional non-standard proprietary hidden data.
JPEG metadata is added automatically by digital cameras, scanners, and image processing
software. This metadata often contains information such as the exact date and time the photograph
was taken, the digital camera manufacturer, model, and unique serial number, the camera settings,
and the location (if a GPS-enabled camera was used). Furthermore, a thumbnail of the image often
exists in the JPEG file, and many image manipulation programs fail to update this thumbnail when
the original image is modified. So even if the image was cropped, or otherwise modified to hide
certain parts of it, the removed parts may still be visible in the thumbnail.
DataDistiller™ Engine can remove metadata from JPEG files without degrading the image quality.
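The hidden data types listed above live in separate marker segments that precede the compressed
image data, which is why they can be dropped without re-encoding the picture. The following Python
sketch lists the metadata segments found in a JPEG byte stream. It is an illustration of the file
layout only, not the DataDistiller™ Engine's method; the function name is hypothetical, and the
sketch assumes every segment before the scan carries a length field, which holds for typical files.

```python
import struct


def list_jpeg_metadata_segments(data: bytes) -> list:
    """Return labels of metadata-bearing segments found before the image scan."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream (missing SOI marker)")
    found, pos = [], 2
    while pos + 4 <= len(data) and data[pos] == 0xFF:
        marker = data[pos + 1]
        if marker == 0xDA:  # SOS: compressed image data follows
            break
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            found.append("EXIF")
        elif marker == 0xE1 and payload.startswith(b"http://ns.adobe.com/xap/1.0/\x00"):
            found.append("XMP")
        elif marker == 0xED and payload.startswith(b"Photoshop 3.0\x00"):
            found.append("IPTC")  # IPTC data is carried inside the APP13 block
        elif marker == 0xE2 and payload.startswith(b"ICC_PROFILE\x00"):
            found.append("ICC Profile")
        elif marker == 0xFE:
            found.append("Comment")
        pos += 2 + length
    return found
```

The embedded thumbnail mentioned above is normally stored inside the EXIF segment, so a file that
reports "EXIF" here may also carry an outdated preview of the image.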
PNG Metadata
Applicable to PNG images.
PNG metadata can contain various details about the image, such as the author, the editing software,
and the time and date in which it was created. The metadata can also be structured within XMP
(Extensible Metadata Platform).
DataDistiller™ Engine can remove metadata from PNG files without degrading the image quality.
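A PNG file is a fixed 8-byte signature followed by a sequence of chunks, each made of a 4-byte
length, a 4-byte type, the data, and a 4-byte CRC. Textual metadata is carried in the tEXt, zTXt,
and iTXt chunks (XMP is normally placed in an iTXt chunk), and the last modification time in tIME.
The Python sketch below lists such chunks. It illustrates the format only and is not the
DataDistiller™ Engine's implementation; the function name and the chosen set of chunk types are
assumptions.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# Ancillary chunk types that commonly carry descriptive metadata.
TEXT_CHUNKS = {b"tEXt", b"zTXt", b"iTXt"}
OTHER_METADATA_CHUNKS = {b"tIME", b"eXIf"}


def list_png_metadata_chunks(data: bytes) -> list:
    """Return (chunk type, keyword) pairs for metadata chunks in a PNG stream.

    The keyword is the text before the first NUL byte in textual chunks,
    and an empty string for other chunk types.
    """
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG stream (bad signature)")
    found, pos = [], len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if chunk_type in TEXT_CHUNKS:
            keyword = body.split(b"\x00", 1)[0].decode("latin-1")
            found.append((chunk_type.decode("ascii"), keyword))
        elif chunk_type in OTHER_METADATA_CHUNKS:
            found.append((chunk_type.decode("ascii"), ""))
        pos += 12 + length  # length field + type + data + CRC
    return found
```

Because these chunks are separate from the IDAT chunks that hold the pixels, removing them leaves
the image data untouched.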
SVG Metadata
Applicable to SVG images.
SVG metadata can contain various details about the image, such as the author, the editing software,
and the time and date in which it was created.
DataDistiller™ Engine can remove metadata from SVG files without degrading the image quality.
AVI Metadata
Applicable to AVI video files.
AVI metadata can contain various details about the video, such as the author, the camera and
software used, and the time and date in which it was created.
DataDistiller™ Engine can remove metadata from AVI files without degrading the video quality.
ID3v1 Tag
Applicable to MP3 and WavPack.
ID3v1 tag typically contains information such as title, artist, album, genre, and track number.
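An ID3v1 tag is a fixed 128-byte block at the very end of the file that begins with the characters
"TAG". The Python sketch below reads its fields and shows that removing the tag amounts to dropping
those final bytes. It is a minimal illustration of the tag layout, not the DataDistiller™ Engine's
code; the function names are hypothetical.

```python
def parse_id3v1(data: bytes):
    """Return the ID3v1 fields from the last 128 bytes, or None if no tag is present."""
    tag = data[-128:]
    if len(tag) < 128 or not tag.startswith(b"TAG"):
        return None

    def text(raw: bytes) -> str:
        # Fields are fixed-width and padded with NUL bytes or spaces.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    fields = {
        "title": text(tag[3:33]),
        "artist": text(tag[33:63]),
        "album": text(tag[63:93]),
        "year": text(tag[93:97]),
        "genre_id": tag[127],
    }
    # ID3v1.1: a zero byte at offset 125 means offset 126 holds the track number.
    if tag[125] == 0 and tag[126] != 0:
        fields["track"] = tag[126]
    return fields


def strip_id3v1(data: bytes) -> bytes:
    """Return the data without a trailing ID3v1 tag; unchanged if none is present."""
    return data[:-128] if parse_id3v1(data) is not None else data
```

Since the tag sits after the audio frames, stripping it does not touch the audio itself.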
ID3v2 Tag
Applicable to MP3, WAVE, and AIFF.
ID3v2 tag typically contains information such as title, artist, album, genre, and track number.
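Unlike ID3v1, an ID3v2 tag is normally placed at the start of an MP3 file and has a variable size.
Its 10-byte header consists of the characters "ID3", two version bytes, one flags byte, and a
4-byte "synchsafe" size in which only the low 7 bits of each byte are used. The Python sketch below
computes how many leading bytes the tag occupies. It assumes the tag is at the start of the data
(in WAVE and AIFF files the tag is embedded in a chunk instead), and the function name is
hypothetical.

```python
def id3v2_tag_length(data: bytes) -> int:
    """Return the total byte length of a leading ID3v2 tag, or 0 if none is found."""
    if len(data) < 10 or data[:3] != b"ID3":
        return 0
    flags = data[5]
    size_bytes = data[6:10]
    # In a synchsafe integer the top bit of every byte must be clear.
    if any(byte & 0x80 for byte in size_bytes):
        return 0
    size = 0
    for byte in size_bytes:
        size = (size << 7) | byte
    # ID3v2.4 may append a 10-byte footer, signalled by flag bit 0x10.
    footer = 10 if flags & 0x10 else 0
    return 10 + size + footer
```

Skipping that many bytes from the start of an MP3 file leaves the audio frames that follow the tag.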
AIFF Metadata
Applicable to AIFF audio files.
AIFF metadata can contain various details about the audio file, such as the author, the software
used, and the time and date in which it was created. The metadata can also be structured in ID3
format.
DataDistiller™ Engine can remove metadata from AIFF files without degrading the sound quality.
MP4 Metadata
Applicable to MP4 files.
MP4 metadata can contain various details about the file, such as the author, the software used in
its creation, and the time and date in which it was created. The metadata can also be structured
in XMP format.
Removing metadata from MP4 files will not degrade the video or audio quality.
F4V Metadata
Applicable to F4V files.
F4V metadata can contain various details about the file, such as the author, the software used in
its creation, and the time and date in which it was created. The metadata can also be structured
in XMP format.
Removing metadata from F4V files will not degrade the media quality.
APE Tag
Applicable to Monkey's Audio, MP3, Musepack, OptimFROG, WavPack, and Tom's Audio Kompressor Audio.
APE tag typically contains information such as title, artist, album, genre, and track number.
XML/XSD/XSL Comments
Applicable to XML/XSD/XSL files.
Comments that were added to the file.
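XML comments are delimited by "<!--" and "-->". The short Python sketch below removes them with a
regular expression. It is an illustration only, not the DataDistiller™ Engine's method: a plain
pattern match would also remove comment-like text inside a CDATA section, so it assumes the input
has no such sections. The function name is hypothetical.

```python
import re

# Non-greedy match so that each comment is removed separately;
# DOTALL lets a comment span several lines.
_XML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def strip_xml_comments(xml_text: str) -> str:
    """Return the XML text with all <!-- ... --> comments removed."""
    return _XML_COMMENT.sub("", xml_text)
```

For example, strip_xml_comments("<a><!-- draft note --><b/></a>") returns "<a><b/></a>".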
Appendix B. Frequently Asked Questions