
Acquiring and Processing Born-Digital Materials from Public Officials


Published:
Last Revision:

About this Guide


This guide is written as an internal document to help archivists at the UALR Center for
Arkansas History and Culture acquire and process born-digital materials.

How do I prepare for collecting born-digital materials?


If you are acquiring a collection with born-digital materials and are unsure about what
to do, talk to the Director of Technology for help on proper acquisition. The Digital
Services Lab will handle the forensic imaging and processing of born-digital materials.

What the Digital Services Lab needs to know


You may find born-digital materials on a variety of media. When you first encounter a
legislator's collection, you will probably see a computer, CD-ROMs, flash drives, floppy
disks, and external hard drives. All of these sources can contain documents, images,
audio clips, and videos. To process born-digital materials accurately, the Digital
Services Lab needs to know the number of storage media, the email and social media
accounts, the types of files, account usernames and passwords, and any
organizational schemes.

Questions for donors


1. What type of documents do you create on your computer?
2. Are you solely responsible for the creation of documents and messages?
3. Do you save images and video to your computer?
4. Do you store files on your computer? Do you have a flash drive or external hard drive?
5. Do you use more than one computer?
6. Do you have more than one email account?
7. Do you have an organizational scheme for your saved email?
8. Do you have a contacts list?
9. Do you use calendar software on your computer?
10. Do you have a website/blog?
11. Do you have any social network accounts?
12. Do you update the accounts yourself, or does someone else?
13. Do you have any old computers?
14. Do you use any document sharing sites, such as Google Drive or Dropbox?
15. Do you have any unusual/outdated media storage, such as floppy disks or punched
cards?
16. Do you know of any files that contain sensitive content, such as tax or medical
records?
17. Are there any files you want destroyed?
18. Do any digital files or accounts require passwords?
It is difficult to acquire all of the information pertaining to the content and context of
the donor's born-digital files. For that reason, a short interview with the donor
will help us gather context about what is stored. Born-digital objects tend not to be
organized in the same way as physical objects, so asking the right questions first
will help you understand more about the content and context of the files. Often
the donor does not know what types of digital files are on their computer. It is
also beneficial to gather all of the external media in one place. Because born-digital
objects are easily scattered, keeping the physical media together helps ensure that
all of the digital objects are acquired.
The Digital Services Lab has a mobile forensic station that can capture a forensic image
of the hard drive. This can be done either on-site or at the Center.

Questions to ask yourself about file types


1. Is the file type proprietary to a specific software program? If so, will the Center
need to purchase this piece of software?
2. Is the metadata correct? In the case of photographs, not all users set the correct
time on their cameras, so the metadata of images may not be reliable.
3. Will migration of file formats be required for sustainability?
4. Are the file types susceptible to degradation? In the case of photographs, JPG
files lose quality each time they are edited and re-saved.

Software programs and file types to look for:

.psd: image files saved in Adobe Photoshop
.doc/.docx: document files saved in Microsoft Word. Most word processors can read
.doc/.docx files; however, the formatting will most likely change if they are not
opened in Microsoft Word
.pdf: document file that is not intended for editing
.indd: files created in Adobe InDesign
.jpg: lossy image file (susceptible to degradation)
.tiff: lossless image file
.avi: video container file; the contents may be uncompressed, lossless, or lossy
depending on the codec
.mp4: video file, usually lossy (susceptible to degradation)
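
As a rough illustration of how the file-type questions above might be answered for a large set of files, the following Python sketch counts file extensions under a folder and flags the ones named in the list. It is not part of the Lab's documented workflow. The function names and the two groupings are assumptions made for this example, and it should only be run against a working copy or mounted image, never the original media.

```python
from collections import Counter
from pathlib import Path

# Hypothetical groupings drawn from the list above; adjust to local practice.
SOFTWARE_SPECIFIC = {".psd", ".indd", ".doc", ".docx"}
LOSSY = {".jpg", ".jpeg", ".mp4"}


def inventory_extensions(root):
    """Count the files under root by lowercase file extension."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            counts[path.suffix.lower() or "(none)"] += 1
    return counts


def flag_extensions(counts):
    """Group the extensions found into those that may need attention."""
    return {
        "software_specific": sorted(e for e in counts if e in SOFTWARE_SPECIFIC),
        "lossy": sorted(e for e in counts if e in LOSSY),
    }
```

Calling `flag_extensions(inventory_extensions("working_copy"))`, where `working_copy` is a placeholder folder name, returns the extensions that may require purchased software or a migration plan.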

What the donors need to know


The donors will probably have very little knowledge of how to deal with born-digital
materials. The idea of organizing them may seem very daunting. It's important to let
them know that they do not have to organize their files; it is our job to
capture the image and make sense of what files are on the drive.
They also need to understand that because we are capturing digital images of files,
nothing can be deleted. The image will be a digital representation of the digital work
environment.
Last, they need to understand that we will not have copies of the actual files
themselves, only the digital images captured with the digital forensic software.
Because of this distinction, it will be important to tell them about expectations of
retaining historical context in the archive.
If they change content and share files with others, this will alter the historical accuracy
of the content that is retained at the Center. We can't control what they do with their
own files, but it is important to let them know the implications of editing and sharing
files that have been digitally captured by us.

Preserving email
When preserving email, the Digital Services Lab does not want to acquire only the raw
text of the messages; we want to preserve each message's data (such as sender,
recipients, and dates) along with the text. Preserving this data enables us to
categorize and organize the contents of the email digitally. If only the text is
preserved, we cannot organize email messages effectively.
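
To illustrate the difference between message text and message data, the following Python sketch parses a single message with the standard library's email package and pulls out header fields that would be lost if only the body text were kept. The sample message, the addresses, and the describe_message function are invented for this example and do not describe the Lab's actual procedure.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

# Invented sample message, for illustration only.
RAW_MESSAGE = """\
From: Jane Doe <jane@example.org>
To: John Roe <john@example.org>
Subject: Committee meeting notes
Date: Tue, 04 Mar 2014 09:15:00 -0600
Message-ID: <abc123@example.org>

Notes from this morning's meeting are attached.
"""


def describe_message(raw):
    """Return the header data and body text of a simple, single-part message."""
    msg = message_from_string(raw)
    return {
        "from": msg["From"],
        "to": msg["To"],
        "subject": msg["Subject"],
        "sent": parsedate_to_datetime(msg["Date"]),  # timezone-aware datetime
        "message_id": msg["Message-ID"],
        "body": msg.get_payload().strip(),
    }
```

With the header fields available, messages can be sorted by date, grouped by correspondent, or matched by Message-ID, none of which is possible from the body text alone.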
The two protocols commonly used to retrieve email messages from a mail server are
Internet Message Access Protocol (IMAP) and Post Office Protocol (POP). We want to use
these protocols to download email data. Both are supported by widely used web-based
email services such as Yahoo and Gmail.
The State of Arkansas, by contrast, uses Microsoft Outlook for its official email
accounts, which is a software-based system rather than an internet-based one. Outlook
allows users to export .PST data files. After the .PST files are exported, the
Digital Services Lab can process them with the digital forensic software in the same
manner as other digital files.
Speak with the Director of Technology and Digital Initiatives to understand the official
procedure to acquire email messages.


Glossary
BitCurator
An open-source digital forensic platform designed for proper acquisition of digital files.
Digital images
Images are virtual snapshots of digital files. An image contains the content and metadata of
the file. Images are important because they do not run the risk of the actual digital file being
overwritten, deleted, or edited.
Guymager
A tool designed to create a digital image of a file or drive.
Checksum
A value calculated from a file's contents, used to check whether the file has been altered.
Bulk extractor
A tool designed to extract and view specific information from a digital image. You can scan
for information including phone numbers, file types, and email addresses.
Forensic bridge
A USB device (also called a write blocker) designed to prevent content from being written
onto a drive.
Fiwalk
A program that processes a digital image file and outputs the results in an XML format.
OAIS
Open Archival Information System
Framework for data archiving and management
Structured around three information packages:
o Submission information packages (SIPs)
o Archival information packages (AIPs)
o Dissemination information packages (DIPs)
This structure creates the Preservation Description Information (PDI)

OAIS Requirements
o Reference
o Provenance
o Context
o Fixity

VM VirtualBox
An application designed to run a virtual operating system such as BitCurator.

Root
The top-level directory in the Linux file system. Work only in directories under the
home directory.
Base 16 and Base 64
Encoding schemes that represent binary data as ASCII text.
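
The Checksum and Base 16 and Base 64 entries above can be tied together with a short Python sketch: it computes a checksum for a file and then writes the same binary value in both encodings. The choice of SHA-256 and the function names are assumptions made for illustration; the guide does not say which algorithm the forensic software uses.

```python
import base64
import hashlib


def sha256_digest(path, chunk_size=1024 * 1024):
    """Return the raw SHA-256 checksum of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.digest()


def encode_digest(digest):
    """Express the same binary checksum as Base 16 and Base 64 ASCII text."""
    return {
        "base16": base64.b16encode(digest).decode("ascii").lower(),
        "base64": base64.b64encode(digest).decode("ascii"),
    }
```

If the checksum of a file is recorded at acquisition and recomputed later, any difference between the two values indicates that the file has been altered.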

Contact Information
If you have any questions, contact Chad Garrett at cxgarrett@ualr.edu
