
Implementing FAST Facets in ContentDM

Jody Perkins, Metadata Librarian, Miami University Libraries
John Millard, Head of Digital Initiatives, Miami University Libraries

Overview
This project explores the use of FAST facets for a
ContentDM collection. Miami Stories, a collection of
oral histories, will serve as a test case. A tag set based
on FAST facets will be used to mark up transcribed text.
Tagged data will be parsed into ContentDM fields that
mirror FAST facets. Terms will be edited and checked
against FAST authority files.

Miami Stories, a continuing oral history project focused on Miami University
history, was established as a permanent program of the Miami University Libraries
and operates out of the University Archives through the Department of Special
Collections and Archives.

The purpose of this program is threefold: 1) to initiate professionally conducted
oral history projects of relevance to the documentation of University history,
2) to store, preserve and provide access to oral history recordings and ensure that
the content is maintained in an accessible format, and 3) to coordinate and support
independent projects on campus that capture aspects of the University’s oral history.

Faceted Application of Subject Terminology (FAST)
http://www.oclc.org/research/projects/fast/

From the FAST Project page at OCLC:

“The Library of Congress Subject Headings schema (LCSH) is by far the most commonly
used and widely accepted subject vocabulary for general application.

The purpose of adapting the LCSH with a simplified syntax to create FAST is to retain
the very rich vocabulary of LCSH while making the schema easier to understand, control,
apply, and use.”

This is a complete list of FAST facets as given on the FAST project website. Not all facets
will be relevant for all collections. FAST is still in beta, and authority files for every facet
are not yet complete. Application guidelines are also still in development. At present, our use
of FAST facets is an attempt to standardize subject-level access points across collections in a
form that can easily be mapped to Dublin Core. In the future we hope to develop an automated
process for checking the collected terms against FAST.

Project Workflow

The digitizing and metadata workflow created for the project facilitates coordination
of activities, as shown above, in two departments at opposite ends of the Miami
University Oxford Campus.

After interviews are videotaped, project staff enter basic metadata into an online
object tracking and metadata system. The system generates a printed transmission
receipt which is then attached to the original recording and sent to Digital
Initiatives for processing and digital conversion.

After digital processing, project staff create transcripts of the recordings which
are then added to the tracking system record created earlier.

Facet Identification and ContentDM Tagging Interface

As staff enter or type the transcript, they are asked to identify words, phrases or
concepts that are mentioned in the recording. The transcript-tagging interface allows
for simple identification of words or phrases that are likely candidates for FAST
facets. The transcriber simply highlights the word or phrase and presses the
corresponding FAST facet button. A pair of simple XML-like tags are inserted around
the selected text.

The editing interface is built using the open source JavaScript Quicktags library
developed by Alex King (http://alexking.org/projects/js-quicktags) with the HTML
tagset replaced with custom tags representing the FAST facets.

Once a text is transcribed, the tagged facets can be extracted and become a source of
semantic data for the metadata cataloger to use in the construction of valid FAST
headings.

Extracted facets and possible expanded terms based on context
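
The extraction step described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the tag names in FACET_TAGS, the function names, and the sample sentence are all invented for the example, since the poster does not spell out the tag set.

```python
import re
from collections import defaultdict

# Hypothetical tag names standing in for the project's FAST facet buttons.
FACET_TAGS = ("topical", "geographic", "personal", "corporate", "period", "form")

# Matches <facet>term</facet> where the opening and closing tag names agree.
TAG_PATTERN = re.compile(
    r"<(?P<facet>" + "|".join(FACET_TAGS) + r")>(?P<term>.*?)</(?P=facet)>",
    re.DOTALL,
)


def extract_facets(transcript):
    """Return {facet: [terms]} in order of first appearance, without duplicates."""
    facets = defaultdict(list)
    for match in TAG_PATTERN.finditer(transcript):
        term = " ".join(match.group("term").split())  # collapse line breaks and runs of spaces
        if term and term not in facets[match.group("facet")]:
            facets[match.group("facet")].append(term)
    return dict(facets)


def strip_tags(transcript):
    """Return the transcript text with the facet tags removed."""
    return TAG_PATTERN.sub(lambda m: m.group("term"), transcript)


# Invented sample sentence, tagged the way a transcriber might tag it.
sample = (
    "I came to <geographic>Oxford, Ohio</geographic> to study "
    "<topical>chemistry</topical> at <corporate>Miami University</corporate>. "
    "<geographic>Oxford, Ohio</geographic> was a small town then."
)

print(extract_facets(sample))
print(strip_tags(sample))
```

The extracted terms are raw candidates only; the cataloger would still edit them into valid FAST headings.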

FAST Facets in a ContentDM Schema

Metadata will be gathered from a variety of sources: the digital services submission
form, a marked-up transcript, automated processes, batch processes, or direct keying
by a metadata editor. Terms in the FAST-designated fields will come directly from a
tagged transcript. As a final step, the parsed terms will be checked against the
appropriate FAST authority file, ideally as part of an automated process. Terms not
found in FAST will be parsed into separate keyword fields, retaining the same facets,
and may be used later in developing local controlled vocabularies.
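
One way the authority check and the split into FAST and keyword fields might look is sketched below. It assumes the authority file for each facet has already been loaded into a set of headings, and it matches terms only after case and whitespace normalization; both are simplifying assumptions, and the field-name prefixes and toy data are invented for illustration.

```python
def normalize(term):
    """Case-fold and collapse whitespace so trivial variants compare equal."""
    return " ".join(term.casefold().split())


def split_by_authority(extracted, authority):
    """Sort extracted terms into FAST fields and keyword fields, facet by facet.

    extracted: {facet: [terms]} taken from a tagged transcript
    authority: {facet: set of established headings} (assumed to be preloaded)
    """
    record = {}
    for facet, terms in extracted.items():
        known = {normalize(h): h for h in authority.get(facet, ())}
        matched, unmatched = [], []
        for term in terms:
            heading = known.get(normalize(term))
            if heading is not None:
                matched.append(heading)   # store the authorized form of the heading
            else:
                unmatched.append(term)    # keep the term as a facet-specific keyword
        record[f"fast_{facet}"] = matched
        record[f"keyword_{facet}"] = unmatched
    return record


# Toy data for illustration only; not taken from the real FAST authority files.
authority = {"geographic": {"Ohio--Oxford"}, "topical": {"Chemistry"}}
extracted = {"geographic": ["ohio--oxford", "Uptown"], "topical": ["chemistry"]}

record = split_by_authority(extracted, authority)
print(record)
```

Keeping unmatched terms under the same facet, as the text describes, means they remain available later as raw material for a local controlled vocabulary.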

Facets are implemented in ContentDM as individual fields in the collection schema, as
illustrated below. The example record to the right illustrates potential sources of
field values.
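
As a rough model of such a schema, the sketch below declares one FAST field and one keyword field per facet and rolls their values up into Dublin Core elements. The field names and the facet-to-Dublin-Core mapping are illustrative assumptions, not the project's actual ContentDM configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    name: str          # field label as it might appear in the collection schema
    facet: str         # FAST facet the field mirrors
    dc_element: str    # Dublin Core element the field maps to
    controlled: bool   # True if values are checked against FAST


# Assumed mapping for illustration; a real collection might choose differently.
DC_FOR_FACET = {
    "topical": "subject",
    "personal": "subject",
    "corporate": "subject",
    "geographic": "coverage",
    "period": "coverage",
    "form": "type",
}


def build_schema(facets):
    """Create a FAST field and a keyword field for each facet."""
    fields = []
    for facet in facets:
        dc = DC_FOR_FACET[facet]
        fields.append(Field(f"FAST {facet.title()}", facet, dc, True))
        fields.append(Field(f"Keyword {facet.title()}", facet, dc, False))
    return fields


def to_dublin_core(record, schema):
    """Collapse faceted field values into Dublin Core elements."""
    dc = {}
    for field in schema:
        for value in record.get(field.name, []):
            dc.setdefault(field.dc_element, []).append(value)
    return dc


schema = build_schema(["topical", "geographic"])
example = {"FAST Topical": ["Chemistry"], "Keyword Geographic": ["Uptown"]}
print(to_dublin_core(example, schema))
```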