Glossary

abbreviation
A shortened form of a name or term (e.g., Mr. for Mister). See also acronym and initialism.

access point
An entry point to a systematic arrangement of information, specifically an indexed field or heading in a work record, a vocabulary record, or another content object that is formatted and indexed in order to provide access to the information in the record.

acronym
An abbreviation or word formed from the initial letters of a compound term or phrase (e.g., MoMA, for Museum of Modern Art). See also abbreviation and initialism.

ad hoc query
Also called a direct query. A query or report that is constructed when required and that directly accesses data files and fields that are selected only when the query is created. It differs from a predefined report or querying a database through a user interface.

administrative data
In the context of cataloging art, information having to do with the administrative history and care of the work and the history of the catalog record (e.g., insurance value, conservation history, and revision history of the catalog record). See also descriptive data.

administrative entity
In the context of a geographic vocabulary, a political or other administrative body defined by administrative boundaries and conditions, including inhabited places, empires, nations, states, districts, and townships. See also physical feature.

algorithm
In the context of this book, an algorithm is a procedure, a formula, or the rules in a computer program or set of programs, often expressed in algebraic notation, that follow a logical, unambiguous step-by-step process to retrieve a set of results, solve a problem, make a decision, manipulate or alter data, or achieve some other result or state. Although a computer program may be considered one large algorithm, in common usage in computer science, the term typically refers to a small procedure applied recurrently. See also computer program.

alphanumeric classification scheme
A set of controlled codes (letters or numbers or both) that represent concepts or headings and generally have an implied taxonomy that can be surmised from the codes (e.g., the Dewey Decimal Classification system number 735.942). See also chain indexing.

alternate descriptor (AD)
A variant form of a descriptor available for use; usually a singular form or a different part of speech than the descriptor (e.g., lithograph is an alternate descriptor for the plural descriptor lithographs). In thesauri, the relationship indicator for this type of term is AD.

ancestor
In a hierarchy, any record that is a broader context for the record at hand, including parents, grandparents, and all other broader contexts at higher levels; any node in the succession of parent nodes on a path all the way up to the root. See also descendant.
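
As a minimal sketch of the ancestor entry, the following Python walks a child-to-parent map upward until it reaches the root. The `PARENT` dictionary and the `ancestors` helper are hypothetical names introduced only for illustration; the place names are borrowed from the ascending order example elsewhere in this glossary.

```python
# Hypothetical child -> parent map; the root ("United States") has no parent entry.
PARENT = {
    "Columbus": "Bartholomew county",
    "Bartholomew county": "Indiana",
    "Indiana": "United States",
}

def ancestors(node, parent=PARENT):
    """Return every broader context of node, nearest parent first, ending at the root."""
    path = []
    seen = {node}
    while node in parent:
        node = parent[node]
        if node in seen:  # guard against a cyclic (malformed) hierarchy
            raise ValueError(f"cycle detected at {node!r}")
        seen.add(node)
        path.append(node)
    return path

print(ancestors("Columbus"))
```

The returned list runs from narrowest to broadest; reversing it gives the broadest-to-narrowest sequence.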

antonym
A term that is the opposite in meaning of another term (e.g., roughness is an antonym for smoothness).

application
Also called an application program. A software program designed to accomplish a task for an end user (e.g., word processing or project management), as distinguished from the operating system program that runs the computer itself.

application programming interface (API)
In the context of this book, an online system, source code, and interface that a data provider (e.g., a vocabulary provider or library) employs to allow users to have access to the data. It may be language dependent (designed for a specific programming language) or language independent (works with multiple programming languages).

architect
A person or firm involved in the design or creation of structures or parts of structures that are the result of conscious construction, are of practical use, are relatively stable and permanent, and are of a size and scale appropriate for (but not limited to) habitable buildings.

architectural work
See built work.

architecture
Refers to the built environment that is typically classified as fine art, meaning it is generally considered to have aesthetic value, was designed by an architect, and was constructed with skilled labor. See also built work.

archival group
See group.

art
In the context of this book, refers to the visual arts such as painting, sculpture, drawing, printmaking, photography, ceramics, textiles, and decorative arts of the type and caliber generally collected by museums. Performance art is also included, but the performing arts are not. Note that these are works of visual art of the type collected by art museums. The objects themselves may actually be held by an ethnographic, anthropological, or other museum, or owned by a private collector.

artist
Any person or group of people involved in the design or production of visual arts that are of the type collected by art museums.

ascending order
In the context of a string of hierarchical parents, refers to the display of parents from narrowest to broadest (e.g., Columbus (Bartholomew county, Indiana, United States)). See also descending order.

ASCII
Acronym for the American Standard Code for Information Interchange, a 7-bit character code defining 128 characters used for information interchange, data processing, and communications systems.

associative relationship
In a thesaurus, the relationship between concepts that are closely related conceptually, but the relationship is not hierarchical because it is not whole/part or genus/species. The relationship indicator for this relationship is RT (for related term). See also equivalence relationship and hierarchical relationship.

asymmetric relationship
In the context of a thesaurus, refers to a reciprocal relationship that is different in one direction than it is in the reverse direction (for example, BT/NT, for broader term/narrower term). See also symmetric relationship.

authoritative source
A published source that is based on reliable documentary evidence that is accepted as true by most experts and used as a standard source in a given discipline.

authority file
Also called simply an authority. A file, typically electronic, that serves as a source
212 Introduction to Controlled Vocabularies

of standardized forms of names, terms, titles, etc. Authority files should include references or links from variant forms to preferred forms. The main purpose of an authority is to enforce usage, often requiring users to use only the preferred term for a given concept. Any type of vocabulary can be used as an authority. See also controlled vocabulary and local authority.

authority heading
A preferred, authorized heading used in a vocabulary, particularly in a bibliographic authority file that typically includes a string of names or terms, with additional information as necessary to allow disambiguation between identical headings (e.g., United States--History--Civil War, 1861–1865--Battlefields and United States--History--Civil War, 1861–1865--Campaigns). The types of authority headings used by the Library of Congress are the following: subject, name, title, name/title, and keyword authority headings. See also heading.

authorization
In the context of vocabularies, the process by which the creators of a vocabulary or an oversight group regulate the selection of terms and establishment of relationships in a controlled vocabulary. See also warrant.

automatic indexing
In the context of online retrieval, indexing by the analysis of text or other content using computer algorithms. The focus is on automatic methods used behind the scenes with little or no input from individual searchers, with the exception of relevance feedback. The results tend to be broad and imprecise, as contrasted to human indexing. See also co-occurrence mapping.

autoposting
See up-posting.

batch load
In the context of populating or contributing to vocabulary systems or other databases, refers to moving or manipulating a group of records as a single unit for the purpose of data processing, typically accomplished by the computer without user interaction, as contrasted to entering records manually, one at a time. See also load and processing.

batch processing
See processing.

best match
Also called a weighted term ranking. Refers to a variety of electronic term-matching and ranking methods that attempt to predict the potential relevance of query results by assigning relevance scores and ranking based on comparing search terms to the indexing terms of the target database. See also exact match.

blind reference
In the context of a vocabulary that is being used for indexing or retrieval on a defined data set, refers to a term in the vocabulary that is not linked to any content in the data set. End users should typically not receive blind references in a retrieval situation because they result in a failed search; however, these terms should be retained in structured vocabularies that are used for indexing because they may be needed in the future or in another context.

Boolean operators
Logical operators used as modifiers to refine the relationship between terms in a search. The four most commonly used Boolean operators are AND, OR, NOT, and ADJ (adjacent). They may be used with parentheses and other punctuation to form logical groupings of criteria in queries (e.g., (Castillo OR Rancho) AND Diego).

bound term
A compound term representing a single concept, characterized by the fact that the words almost always occur together and the meaning is lost or altered if the term is split into its component words. See also compound term and lexical unit.

brand name
A trade or proprietary name for a thing or process (e.g., Super Glue).
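
The query in the Boolean operators entry, (Castillo OR Rancho) AND Diego, can be sketched with Python set operations. This is only an illustration under stated assumptions: the `INDEX` mapping, the record IDs, and the `postings` helper are hypothetical, and ADJ is left out because it needs word positions rather than plain sets.

```python
# Hypothetical inverted index: term -> set of record IDs that contain the term.
INDEX = {
    "Castillo": {1, 2},
    "Rancho": {3, 4},
    "Diego": {2, 3, 5},
}
ALL_IDS = {1, 2, 3, 4, 5}

def postings(term):
    """Return the record IDs indexed under term (empty set if the term is unknown)."""
    return INDEX.get(term, set())

# OR is set union, AND is set intersection: (Castillo OR Rancho) AND Diego
hits = (postings("Castillo") | postings("Rancho")) & postings("Diego")

# NOT is set difference against the full collection: NOT Diego
not_diego = ALL_IDS - postings("Diego")

print(sorted(hits), sorted(not_diego))
```

The parentheses matter: without them, AND would normally bind more tightly than OR and the query would retrieve a different set.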

broadcast searching
See federated searching.

broaden results
To adjust criteria in a search in order to retrieve a larger number of results, typically because the searcher did not find what he or she wanted in an initial narrower search. See also narrow results.

broader term (BT)
Also called a broader context. A vocabulary record to which another record or multiple records are subordinate in a hierarchy. In thesauri, the relationship indicator for this type of term is BT. Variations on the notation include BTG (broader term generic), BTP (broader term partitive), BTI (broader term instance), BT1 (broader term level 1), BT2 (broader term level 2), etc.

browsing
The process whereby a user of a system or Web site visually scans and maneuvers through navigation lists, results lists, hierarchical displays, or other content in order to make a selection, as contrasted to the user entering a search term in a search box. See also searching.

built work
An instance of architecture, which includes structures or parts of structures that are the result of conscious construction, are of practical use, are relatively stable and permanent, and are of a size and scale appropriate for (but not limited to) habitable buildings. Built works in the context of art information are manifestations of the built environment typically classified as fine art, meaning it is generally considered to have aesthetic value, was designed by an architect (whether or not his or her name is known), and constructed with skilled labor. See also architecture and movable work.

candidate term
Also known as a provisional term. A term under consideration for admission into a controlled vocabulary because of its potential usefulness. See also contribution.

cataloger
In the context of this book, the person who records information in records for works. See also end user and indexer.

cataloging
In the context of this book, the process of describing and indexing a work or image, particularly in a collections management system or other automated system. Cataloging involves the use of prescribed fields of information and rules (e.g., the rules described in CCO and CDWA).

cataloging rules
See editorial rules.

cataloging tool
A system that focuses on content description and labeling output (e.g., wall labels or slide labels), often part of a more complex collection management system.

chain indexing
Also called chain procedure. A technique for indexing that uses a numeric or alphanumeric classification scheme (for example, the Dewey Decimal Classification system) where the entries have meaning beyond simple numeric sequencing (e.g., in Dewey number 735.942, 735 means sculpture after the year 1400 CE, 9 means geographic area, 4 means Europe, and 2 means England).

child
See narrower term.

classification
In the context of this book, the process of arranging works or other content objects systematically in groups or categories of shared similarity according to established criteria and using terms to identify the classes.

classification notation
In a vocabulary, a numeric, alphabetic, or alphanumeric code in a system of codes used to classify or categorize entries; may be used in a hierarchical arrangement to impose a display or sorting order on the lines or levels in the hierarchy (e.g., V, V.PC, V.PE). See also notation.
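
A minimal sketch of how a classification notation might impose a display order, using the codes from the entry (V, V.PC, V.PE). It assumes that a period separates hierarchy levels, which the entry does not state; `notation_key` and `display_lines` are hypothetical helper names.

```python
def notation_key(notation):
    """Split a dotted notation into segments so a parent sorts before its children."""
    return tuple(notation.split("."))

def display_lines(notations):
    """Return the notations in hierarchical order, indented two spaces per level."""
    lines = []
    for notation in sorted(notations, key=notation_key):
        depth = len(notation_key(notation)) - 1  # "V" -> 0, "V.PC" -> 1
        lines.append("  " * depth + notation)
    return lines

print("\n".join(display_lines(["V.PE", "V", "V.PC"])))
```

Sorting on the tuple of segments rather than on the raw string keeps each parent directly above its own children even when sibling codes differ in length.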

classified display
See hierarchical display.

clustering
In the context of automated data, usually refers to the process of grouping or classifying items or data through automatic or algorithmic means rather than incorporating human judgment.

code
See computer code.

collection
In the context of cataloging art, refers to multiple works that are physically or conceptually arranged together, including the entire set of objects curated by a given museum or other repository.

collection management system (CMS)
A type of database system that allows an institution to control various aspects of its collections, including description (artist, title, measurements, media, style, subject, etc.) as well as administrative information regarding acquisitions, loans, and conservation information.

complex term
A single phrase denoting more than two distinct concepts, which could be broken out and used independently, as defined by the Library of Congress. See also bound term, compound term, and heading.

component
In the context of cataloging art and architecture, a part of a larger item. A component differs from an item in that an item can stand alone as an independent work, but a component typically cannot or does not stand alone (e.g., a panel of a polyptych or a façade of a basilica). See also group and item.

compound term
A term consisting of two or more words. In the context of this book, mention of compound terms generally refers to bound terms, which are compound terms that represent a single concept (e.g., flying buttresses). See also bound term, complex term, and lexical unit.

computer code
Also called code. The machine-readable form, arrangement of data, and instructions of a computer program that are created when a computer program, which was written by a human programmer, is converted into binary code that can be read by a computer.

computer program
Also called a program. A specific set of instructions for ordered operations that result in the completion of a task by the computer; a computer program consists of computer code. While the program is technically a type of data, computer programs are generally considered as separate from the data to which the programs refer (e.g., data would be the terms, scope notes, etc., in a vocabulary record). A program is interactive if it acts when prompted by an action or information supplied by a user, or batch if it automatically runs at a certain time or under certain conditions and then stops after the task is completed. A program is written in a programming language. See also processing.

computer system
See system.

concept
In the context of the AAT and other thesauri comprising generic terms, the subject of the vocabulary record (i.e., the concept to which the terms refer), including abstract concepts; physical attributes such as shape, pattern, and color; style or period; activities; terms for performers of activities; materials; objects; and visual and verbal communication forms. See also discrete concept.

concept record
See record.

conceptual data model
An abstract model or representation of data for a particular domain, business enterprise, field of study, etc., independent of
any specific software or information system; usually expressed in terms of entities and relationships. See also logical data model.

content object
In the context of a database, any entity that contains data. A content object can itself be made up of content objects. For example, a journal is a content object made up of individual journal articles, which are themselves content objects. See also information object.

contribution
In the context of controlled vocabularies, a term or record that is submitted for admission into a thesaurus or other vocabulary by an agency or individual outside the group responsible for maintaining the vocabulary; contributions are typically made by users of the vocabulary. See also candidate term.

controlled field
In the context of this book, a field in a record that is not free text, meaning it is specially formatted and often linked to controlled vocabularies (authorities) or controlled lists to allow for successful retrieval. See also free-text field.

controlled format
Rules applied to the field regarding the types of values that may be included (e.g., a controlled measurement’s value field would allow only numbers). Fields may have controlled format in addition to being linked to controlled vocabulary, or the controlled format may exist in the absence of any finite controlled list of valid values.

controlled list
A simple list of terms used to control terminology. In a well-constructed controlled list, terms should be unique, members of the same class, not overlapping in meaning, equal in granularity/specificity, and arranged alphabetically or in another logical order. A type of controlled vocabulary.

controlled vocabulary
An organized arrangement of words and phrases used to index content and/or to retrieve content through browsing or searching. A controlled vocabulary typically includes preferred and variant terms and has a limited scope or describes a specific domain.

co-occurrence mapping
Also called co-occurrence clustering. An automated method of compiling groups of terms that tend to occur together in certain contexts and are therefore presumed to be related in some way; the resulting groups of terms are considered to be loosely related and may be used to automatically broaden a user’s search or to suggest alternative search terms to users in order to improve search results. See also automatic indexing.

core fields
Also called core elements. In the context of this book, the set of fields representing the fundamental or most important information required for a minimal record, whether the record is a work record or a vocabulary record. See also required fields.

corporate body
In the context of vocabularies discussed in this book, an organized, identifiable group of individuals working together in a particular place and within a defined period of time, whether or not they are legally incorporated (e.g., architectural firms, artist studios, and art repositories).

criteria
In the context of this book, a specific set of limiting conditions used to create a query or select a subset of entries (e.g., a WHERE clause in SQL). See also variable.

cross-database searching
See federated searching.

cross-reference links
See syndetic structure.

crosswalk
A chart or table (visual or virtual) that represents the semantic or technical mapping of fields or data elements in one database, metadata framework, standard, or schema to fields or data elements that have a
similar function or meaning in one or more other databases, frameworks, standards, or schemas (e.g., the artist element in one standard may map to the creator element in another). See also mapping.

cultural heritage
The total corpus of activities and the artifacts of activities that provide a record of the life of a culture. See also material culture.

cultural works
In the context of this book, art and architectural works and other artifacts of cultural significance, including both physical objects and performance art. In related disciplines, the scope could be broader, also including the performing arts.

data
In common usage in computer science, this term is used as a singular noun to refer to information that exists in a form that may be used by a computer, excluding the program code. In other uses, datum is the singular and data is the plural, referring to facts or numbers in a general sense.

database
A structured set of data held in computer storage, especially one that incorporates software to make it accessible in a variety of ways. A database is used to store, query, and retrieve information. It typically comprises a logical collection of interrelated information that is managed as a unit, stored in machine-readable form, and organized and structured as records that are presented in a standardized format in order to allow rapid search and retrieval by a computer. See also system.

database field
Also called a data field. A placeholder for a set of one or more adjacent characters comprising a unit of information in a database, forming one of the searchable items in that database. It is a portion of a structured record, especially a machine-readable record, containing a particular category of information (e.g., term and scope note would be fields included in a vocabulary record). See also field.

database index
Also called a data index. A particular type of data structure that improves the speed of operations in a table by allowing the quick location of particular records based on key column values. Indexes are essential for good database performance. The concept is distinguished from indexing (human indexing) and automatic indexing.

database normalization
See normalization.

database record
See record.

data content
The organization and formatting of the words or terms that form data values.

data elements
The specific categories or types of information that are collected and aggregated in a database.

data preprocessing
See preprocessing.

data processing
See processing.

data structure
A given organization of data, particularly the data elements, the logical relationships between data elements, and the storage allocations for the data.

data table
Sets of data that are organized in a grid or matrix comprising rows and columns.

data values
In the context of this book, the terms, words, or numbers used to populate fields in a work or vocabulary record. See also data content.

decoordination
In the context of a thesaurus, the splitting of a compound term into its component words to stand as individual terms. This would typically happen if a compound term
had been added to the thesaurus but was later determined not to be a bound term.

deep Web
See hidden Web.

derivation
Also called modeling. In the context of this book, the process of building a new vocabulary based on an existing vocabulary. In this approach, an appropriate controlled vocabulary is selected as a model for developing controlled terminology for local use, so that the local terms will be interoperable with the larger original vocabulary. See also local authority and microcontrolled vocabulary.

descendant
Also often spelled descendent in the disciplines of computer science and thesaurus construction. In a hierarchy, any record that is a narrower context for the record at hand, including children, grandchildren, and all other narrower contexts at all lower levels; any node in the succession of child nodes on a path all the way down to the tips (leaves) of the hierarchies. See also ancestor.

descending order
In the context of a string of hierarchical parents, the display of parents from broadest to narrowest (e.g., Columbus (United States, Indiana, Bartholomew county)). See also ascending order.

descriptive data
In the context of cataloging art, data intended to describe and identify a work, as contrasted to information necessary for administrative, technical, or accounting purposes. See also administrative data.

descriptor (D)
In a thesaurus, the term recommended to represent the concept in displays and indexing. Also called the main term, postable term, or preferred term in a monolingual thesaurus. A multilingual thesaurus may have multiple descriptors (one in each language represented) but may possibly have only one preferred term for use as a default in displays. In thesauri, the relationship indicator for this type of term is D.

diacritics
Also called diacritical marks. Signs or accent marks found over, under, or through alphabetic letters in many languages (e.g., the umlaut in German, München), used to indicate emphasis or pronunciation, often to distinguish different sounds or values of the same letter or character without the diacritical mark.

digital asset management system (DAMS)
A type of system for organizing digital media assets, such as digital images or video clips, for storage and retrieval. Digital asset management systems sometimes incorporate a descriptive data cataloging component, but they tend to focus on managing workflow for creating digital assets and for managing asset rights, requests, and permissions.

direct mapping
In the context of interoperability of vocabularies, refers to the matching of terms one-to-one in two controlled vocabularies. While the vocabularies need not be the same size or cover exactly the same content, where overlap exists, there should be the same meaning and level of specificity between the two terms in each controlled vocabulary. See also switching.

direct query
See ad hoc query.

disambiguation
In the context of creating and displaying a vocabulary, the use of qualifiers, headings, or other methods to clarify and remove ambiguity between homographs (e.g., Smith, John (English printmaker, 1654–1742) and Smith, John (English architect, 1781–1852)). See also word sense disambiguation.

discrete concept
In the context of a generic concept vocabulary, a discrete thing or idea as opposed to
a subject heading, which often concatenates multiple terms or concepts together in a string. See also concept.

displayed index
An index that is visible and available to end users for browsing. See also nondisplayed index.

display field
In the context of this book, a field intended for viewing by the end user, typically showing data in natural language that is easily read and understood and that can convey nuance and ambiguity. Display information may, in some cases, be concatenated from controlled fields; in other cases, this information is best recorded in free-text fields. See also indexing.

document
In the context of search and retrieval, the combination of a defined, primarily self-contained, machine-readable text or other information and the format in which it is housed.

dominant language
In the context of multilingual vocabularies, the more prominent or original language to which terms in other languages are mapped and in which other fields in the record (e.g., scope notes or date notes) are written. In a purely multilingual vocabulary, no language is dominant, but in a rich and complex vocabulary (e.g., the AAT), a dominant language may be required for practical purposes.

download
See load.

editorial rules
In the context of this book, written rules and guidelines for creators or editors of vocabulary records that dictate how to populate fields and choose or interpret data. They should include which fields are required, how to choose appropriate values for various fields (e.g., how to choose a preferred term), how to choose hierarchical positions, the format and syntax for each field, authorized sources, etc. Analogous rules for catalogers of works are called cataloging rules.

end user
In the context of this book, usually the searcher, client, or patron who retrieves, views, and uses the data in a vocabulary or work record, as distinguished from the editors or catalogers. In the context of systems design, the term refers to any client for whom a database system is designed and used; from that perspective, it could include the editors or catalogers for whom an editorial or cataloging system has been designed.

end-user thesaurus
A thesaurus designed for direct access by searchers rather than for use by indexers. Instead of controlling the terminology, the purpose of an end-user thesaurus is to help searchers find useful terminology for improving, narrowing, and broadening their queries. See also indexer thesaurus.

entity
In the context of computer science, a self-contained piece of data that can be referenced as a unit. In a more general sense, the term is used in this book to refer to a distinct person, place, or concept in a vocabulary.

entity-relationship model
A type of conceptual data model that represents structured data in terms of entities and relationships. An entity-relationship diagram can be used to visually represent information objects and their relationships. Because the constructs used in the entity-relationship model can easily be transformed into relational tables, this type of model is often used in database design.

entry array
A type of display, often used for headings, in which any two or more entries that have the same broader heading (e.g., Religious art--Ancient Egyptian, Religious art--Christian, Religious art--Hindu, etc.) are grouped together vertically under the broader heading. While this is not a true hierarchical display, it may resemble a hierarchical display through use of indentation.

equivalence relationship extension vocabulary


In a thesaurus, the relationship between A thesaurus that is created with the inten-
synonymous terms or names for the tion of, or is later adapted for, linking to
same concept, typically distinguishing another vocabulary that is larger, broader,
preferred terms (descriptors) and nonpreferred terms (variants or UFs). See also associative relationship and hierarchical relationship.

or more generic; the extension vocabulary is typically linked through node linking, rather than being integrated at many points in the original vocabulary. See also microcontrolled vocabulary, node linking, and satellite vocabulary.

equivalent term
A term that is considered an equivalent in search-and-retrieval, including not only true synonyms but possibly also near-synonyms and any other terms that are considered closely enough related to be useful in broadening a query; to narrow a query, exact equivalents could be used instead.

exact equivalence
The relationship between synonyms in one language and terms in different languages that have the same usage and meaning. See also inexact equivalence and nonequivalence.

exact match
Electronic term-matching that produces a result that precisely matches the user’s query term and does not implement automatic Boolean operators, truncation, proximity ranges, or stemming. In a strictly applied exact match, normalization is not used, so that differences in punctuation, spacing, and diacritics are maintained in the match. See also best match.

exhaustivity
In the context of cataloging and indexing, the degree of depth and breadth that the cataloger uses in assigning indexing terms or writing a description. Measures of greater exhaustivity include the use of a greater number of optional fields and the assignment of a greater number of indexing terms for each field. See also specificity.

expansion
See query expansion.

explode a hierarchy
To retrieve and display all the descendants of any given node, typically in a graphic display.

external node
See leaf node.

facet
Also called a faceted display. A fundamental, homogeneous, and mutually exclusive category of information in a thesaurus (e.g., the AAT has seven facets: Associated Concepts, Physical Attributes, Styles and Periods, Agents, Materials, Activities, and Objects).

facet indicator
A node label that designates a facet.

false hit
Also called a false drop. In search and retrieval, an entry in a list of results that does not comply with the user’s intended results.

federated searching
Also called broadcast searching, cross-database searching, metasearching, and parallel searching. Performing queries simultaneously across resources that are in different domains and created by different communities. Federated searching may involve searching across multiple databases, different platforms, and varying protocols, thus requiring the application of interoperability between resources and vocabularies.

field
In the context of this book, an area (often mapping to a metadata element in a metadata element set) in the user interface of a system where a discrete unit of information is displayed or the cataloger can enter information. Note: In this context, field is not necessarily equivalent to a database field.
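
As an illustration of the explode a hierarchy entry above, the following minimal Python sketch retrieves every descendant of a node and renders the result with graduated indentation. The `children` mapping, the sample terms, and the function names are hypothetical and chosen only for this example; an actual vocabulary system would read these links from its own records.

```python
# Hypothetical parent -> children links for a tiny sample hierarchy.
children = {
    "containers": ["vessels"],
    "vessels": ["amphorae", "rhyta"],
    "amphorae": [],
    "rhyta": [],
}


def explode(node, links):
    """Return (depth, term) pairs for every descendant of node, depth-first."""
    result = []
    stack = [(child, 1) for child in reversed(links.get(node, []))]
    while stack:
        term, depth = stack.pop()
        result.append((depth, term))
        # Push children in reverse so they are popped in their listed order.
        stack.extend((c, depth + 1) for c in reversed(links.get(term, [])))
    return result


def render(node, links):
    """Format the node and all of its descendants as an indented display."""
    lines = [node]
    lines += ["  " * depth + term for depth, term in explode(node, links)]
    return "\n".join(lines)


print(render("containers", children))
```

A leaf node has no descendants, so exploding it returns an empty list.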
220 Introduction to Controlled Vocabularies

filing rules
A set of guidelines that determine how letters, numbers, spaces, and special characters should be processed when assembling an alphabetical or other listing. See also sorting.

first name
Also called a given name. In Western tradition, the name of a person that identifies that individual, typically unique in the immediate family and used with a last name (e.g., Richard in Richard Meier). See also last name and middle name.

flat-file database
A database with a data model designed around a single table, often a single file containing many records that all have exactly the same fields. It is a simpler model than the more highly structured relational and object-oriented models.

flat format
In the context of a thesaurus, an alphabetical display in which only one level of broader contexts and one level of narrower contexts are displayed for each focus record. See also generic structure.

focus
Also known as a head noun for terms and a trunk name for proper names. In the context of a compound term, the noun component that identifies the class of concepts to which the term as a whole refers (e.g., buttresses in the term flying buttresses). In the context of a modified name such as a place name, the part of the name that is not a modifier (e.g., Etna in Mount Etna). See also modifier.

folksonomy
A neologism referring to an assemblage of concepts, which are represented by terms and names (called tags) that are compiled through social tagging, generally on the Web. A folksonomy differs from a taxonomy in that it is not structured hierarchically, and the authors of the folksonomy are typically the casual users of the content rather than professional indexers following standard protocols and using standardized controlled vocabularies.

format
Used in two senses in this book. In the context of cataloging art, the configuration of a work, including technical formats, or the conventional designation for the dimensions or proportion of a work (e.g., cabinet photograph or IMAX). In the context of computer science, the physical layout of a data storage device or the logical structure or composition of a file.

format control
See controlled format.

free-text field
A field that may contain data entered without any vocabulary control or system-defined structure. It may be used to express ambiguity, uncertainty, and nuance in a note. See also controlled field and text.

generic concept
In the context of this book, a concept in a vocabulary that is described by terms other than proper nouns or names (e.g., the type of artwork, such as amphora, or a material, such as terracotta). Generic concepts do not include proper names of persons, organizations, geographic places, named subjects, or named events.

generic posting
In controlled vocabularies, the use of narrower terms as used for terms for a descriptor that is really a broader term in the same vocabulary record. A generic posting is typically used as a time-saving strategy rather than making separate records for all the terms and linking them hierarchically. See also up-posting.

generic structure
A display format for a thesaurus in which all hierarchical levels are displayed by using indentation, codes, or punctuation marks. See also flat format.

genus/species relationship
Also called a generic relationship. A hierarchical relationship in which all
children must be a kind of, type of, or manifestation of the parent. The genus/species relationship is the most common hierarchical relationship in thesauri and taxonomies, because it is applicable to a wide range of topics. See also instance relationship and whole/part relationship.

given name
See first name.

gloss
See qualifier.

grandparent
In a thesaurus, the level immediately above the parent of the focus record (e.g., in the following example, Indiana is the grandparent of Columbus: Columbus, Bartholomew county, Indiana, United States).

granularity
See specificity.

group
Also called an archival group or record group. In the context of cataloging works, refers to an aggregate of items that share a common provenance. See also component and item.

group-level cataloging
Describing and assigning indexing terms for a group of works as a whole, typically focusing on the most important or most frequently occurring characteristics in the items of the group. See also item-level cataloging.

guide term
A node label that is not a facet, but is created as a hierarchical level to provide order and structure to thesauri by grouping narrower terms according to a given logic. Guide terms are not used for indexing and are often enclosed in angled brackets or otherwise distinguished from other terms in displays (e.g., <photographs by form>).

hardware
The physical components of a computer system, including those that are mechanical, electronic, magnetic, and electrical such as disks, disk drives, chips, electronic circuitry, keyboards, monitors, modems, and printers. See also software.

harmonization
In the context of vocabularies and standards, the process of preventing, minimizing, or eliminating technical and content differences and contradictions between standards or vocabularies that have the same or similar scope or that must work interchangeably or in concert.

heading
Also called a label. A string of words comprising a term combined with other information that serves to modify, disambiguate, amplify, or create a context for the main term in displays. Examples include the listing of qualifiers and/or broader contexts for terms (e.g., rhyta (<vessels for serving and consuming food>, containers)), place types and administrative broader contexts for place names (e.g., Dayr al-Bahri (deserted settlement) (Qinā governorate, Egypt)), or biographical information for people’s names (e.g., Francesco Aliunno (Italian calligrapher, active 15th century)). See also authority heading, name authority, and subject heading list.

head noun
See focus.

hidden Web
Also called the deep Web or invisible Web. The sum of the Web pages that are not accessible to Web crawlers or robots, usually because they are either dynamically generated by a user querying a database or are password protected or subscription based.

hierarchical display
Also called a classified display or systematic display. In a thesaurus, a graphic arrangement of terms showing broader/narrower relationships through the use of indentation, codes, or another method.

hierarchical relationship
The broader and narrower (parent/child) relationship between two entities
in a thesaurus, namely whole/part (e.g., Montréal is part of Québec), genus/species (e.g., bronze is a type of metal), or instance relationships (e.g., Montréal is an instance of a city). It is the basic structure that creates a hierarchy.

hierarchy
An organization of records related by levels of superordination and subordination. Each record in the hierarchy, except the root, is a narrower context of the record above it. See also monohierarchy, polyhierarchy, and subfacet.

historical term
Also called a historical name. In the context of the vocabularies discussed in this book, a term or name that was used to refer to a person, place, subject, or concept in the past, but in current usage has been replaced with a different term or name (e.g., historical names for St. Petersburg, Russia, are Leningrad and Petrograd).

hits
See results list.

homograph
A term that is spelled the same as another term, but the meanings of the terms are different (e.g., drums can have at least three meanings: components of columns, membranophones, or walls that support a dome). Homographs exist whether or not the terms are pronounced alike. Terms are generally considered homographs despite differences in capitalization, punctuation, or diacritics. See also qualifier.

homophone
A term that is pronounced like another term but spelled differently (e.g., bows and boughs). Homophones are not typically labeled in traditional controlled vocabularies.

human indexing
See indexing.

hyperlink
Also called a hypertext link. In the context of online information, an embedded link that connects different parts of an online document or data set to other parts of the document or to other documents. It is usually indicated by color or other emphasis applied to a word, phrase, icon, or symbol.

hypertext database
A dataset that resides as a collection of online documents with links joining various parts to each other, with access provided via an interactive browser.

Hypertext Markup Language (HTML)
A markup language used to create the layout and presentation of documents for World Wide Web applications.

image
In the context of cataloging art, a visual representation of a work, usually existing in a photomechanical, photographic, or digital format. In a typical visual resources collection, an image is a slide, photograph, or digital file.

indentation
Also called indention. In the context of printing or other displays of typed words or texts, refers to the white or blank space of a fixed width on a row along the right or left margin of a display, as commonly used to indicate the first line in a new paragraph of text. Graduated indentation is used to indicate relationships between parents and their descendants in hierarchical displays of thesauri.

indexer
A person who assigns indexing terms for a work or image, typically the same person as the cataloger. See also cataloger.

indexer thesaurus
A thesaurus designed to control terminology and guide indexers in the choice of terms. See also end-user thesaurus.

indexing
Also called human indexing and manual indexing. In the context of this book, the process of evaluating information and designating indexing terms by using controlled vocabulary that aids in finding and accessing the cultural work record. Refers to indexing done by human
labor, not to the automatic parsing of data into a database index (automatic indexing), which is used by a system to speed up search and retrieval.

inexact equivalence
The relationship between synonyms in one language or terms in different languages that have similar or overlapping meaning and usage but are not true synonyms (e.g., floating and flying). See also exact equivalence, nonequivalence, and partial equivalence.

information object
A digital unit or group of units, regardless of type or format, that a computer can address or manipulate as a single discrete object. See also content object.

information processing
See processing.

information retrieval database
Also called an IR database. Any database designed primarily for discovering and retrieving information. The systems that work with IR databases provide the following: a search interface to permit users to compose queries, methods for searching through the target data, viewable or behind-the-scenes indexes, and results displays.

initialism
A set of initials that stand for the full form of a name (e.g., MFA, for Museum of Fine Arts). See also abbreviation and acronym.

instance relationship
A hierarchical relationship in which all children must be an example of a broader context, most commonly seen in vocabularies where proper names are organized by general categories of things or events (e.g., if the proper names of mountains and rivers are organized under the general categories mountains and rivers). See also genus/species relationship and whole/part relationship.

interactive processing
See processing.

internal node
See nonleaf node.

interoperability
In the context of controlled vocabularies, the ability of two or more vocabularies and their systems or components of their systems to map to each other’s data, with the goal of exchanging information or enhancing discovery.

inverse document frequency (IDF)
An automatic ranking method often used in a formula with term frequency in information retrieval and text mining to estimate how important a term is to a set of data and how useful it will be in retrieval.

inverted form
Also called an inverted index. In the context of a controlled vocabulary, the indexing form of a multiple-word name or term, where the last name or trunk portion of the term is listed first, followed by a comma and the descriptive word (e.g., Wren, Christopher, or buttresses, flying). See also natural order form and permuted index.

invisible Web
See hidden Web.

ISO (International Organization for Standardization)
A worldwide voluntary, nontreaty network of national standards institutes of approximately 160 countries. The standards bodies work in partnership with international organizations, governments, industry, business, and consumer representatives to reach consensus, set standards, and promote their use with the goal of facilitating trade and meeting the broader needs of society.

item
In the context of cataloging art, an individual object or work. See also component and group.

item-level cataloging
Describing and assigning indexing terms for individual items in a collection of works. See also group-level cataloging.
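
To make the inverse document frequency (IDF) entry above more concrete, here is a minimal Python sketch of IDF used together with term frequency. The glossary does not give a formula; the sketch assumes one common formulation, idf = log(N / df), where N is the number of documents and df is the number of documents containing the term. The sample documents and function names are hypothetical.

```python
import math

# Hypothetical sample data: each document is a list of indexing terms.
docs = [
    ["baroque", "cathedral", "dome"],
    ["gothic", "cathedral", "buttress"],
    ["baroque", "painting"],
]


def idf(term, documents):
    """log(N / df): terms that are rarer across the collection score higher."""
    df = sum(1 for d in documents if term in d)
    if df == 0:
        return 0.0  # term absent from the collection; a convention chosen here
    return math.log(len(documents) / df)


def tf_idf(term, document, documents):
    """Term frequency within one document, weighted by the term's IDF."""
    tf = document.count(term) / len(document)
    return tf * idf(term, documents)


# "dome" occurs in one document and "cathedral" in two,
# so "dome" receives the higher weight.
print(idf("dome", docs), idf("cathedral", docs))
```

Other weighting variants exist (for example, smoothed forms that avoid division by zero); this sketch shows only the basic idea that rarer terms discriminate better in retrieval.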

jargon
A characteristic terminology of a particular group or discipline that is typically not understood by a more general audience.

keyword
In the context of vocabularies, a verbal unit or word of a term that may be used in a search expression (e.g., for the place name Sena Julia, Sena is one keyword and Julia is another). In the broader context of online retrieval, any significant word or phrase in the title, subject headings, or text associated with an information object.

Keyword in Context (KWIC)
A type of automatic indexing in which each word in a text, title, subject heading, string of words, or term becomes an entry word in the index, with the exception of words in stop lists. Variations on KWICs are KWOCs (Keyword Out of Context) and KWACs (Keyword Alongside Context).

keyword index
An index based on individual words (keywords) found in a vocabulary term, text, or other content object.

label
See heading.

language model
A type of automatic indexing based on term weighting and relevance prediction that attempts to predict probable query search terms based on term frequencies within documents and the inverse document frequency of terms across the target data. It is similar to the probabilistic model.

last name
Also called a surname. In Western tradition, the family name used with a first name to identify a person (e.g., Meier in Richard Meier). See also first name and middle name.

latent semantic indexing (LSI)
A form of automatic indexing based on the co-occurrence clustering of terms in combination with content that is associated with these clusters; it attempts to partially address the problem of the variety of terms that can be used to express similar concepts.

Latin 1
A character set (consisting of 191 characters) that is part of a series of ASCII-based character encodings defined in ISO/IEC 8859-1:1998: 8-Bit Single-Byte Coded Graphic Character Sets, Part 1.

latinization
See romanization.

lead-in term
See used for term.

leaf linking
See node linking.

leaf node
Also called an external node. In a thesaurus, a node that has no children, as with the ends or tips of hierarchical trees.

lexeme
A fundamental unit of the words of a language, around which may be clustered a set of words that are different forms of the same word (e.g., paint is the lexeme for paints, painted).

lexical unit
Also called a lexical item. One or more words that refer to a single concept (e.g., flying buttresses or bills of sale). See also bound term and compound term.

lexical variant
A term that is a different word form for another term, caused by spelling differences, grammatical variation, or abbreviations (e.g., watercolor and water-colour). Lexical variants are considered as and grouped with synonyms in a vocabulary record, but they technically differ from synonyms in that synonyms are different terms for the same concept. See also synonym.

link
In the context of this book, any relationship between two vocabulary records, two works, a work and image, or a work or image and an authority. Compare to hyperlink.
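
The Keyword in Context (KWIC) entry above can be sketched in a few lines of Python. In this minimal example, every word of a title that is not on the stop list becomes an entry word, and the words on either side are kept as its context. The stop list, the sample title, and the function name `kwic` are hypothetical; production systems use much longer stop lists and more careful tokenization.

```python
# Hypothetical stop list; real systems use much longer ones.
STOP_WORDS = {"a", "an", "and", "in", "of", "the"}


def kwic(title, stop_words=STOP_WORDS):
    """Return (keyword, left context, right context) entries sorted by keyword."""
    words = title.split()
    entries = []
    for i, word in enumerate(words):
        key = word.lower()
        if key in stop_words:
            continue  # words in the stop list do not become entry words
        entries.append((key, " ".join(words[:i]), " ".join(words[i + 1:])))
    return sorted(entries)


# Print each entry word in capitals, aligned between its left and right context.
for key, left, right in kwic("The Architecture of Baroque Cathedrals"):
    print(f"{left:>30}  {key.upper()}  {right}")
```

Sorting by the entry word is what lets a reader scan the index alphabetically while still seeing each keyword in its original context.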

literary warrant
Justification for the inclusion of a term in a vocabulary based on published evidence that is sufficient to prove that the form, spelling, usage, and meaning of the term are widely agreed upon in authoritative sources. See also organizational warrant, source, and user warrant.

load
The process of moving or transferring files or software from one disk, computer, or server to another disk, computer, or server. To upload means to transfer from a local computer to a remote computer; to download means to transfer from a remote computer to a local one.

loan word
In the context of a given language, a word that is taken directly from another language (e.g., sotto in su, an Italian phrase used in English to mean painted in correct perspective as if viewed from below).

local authority
An authority developed for local use. Although often compiled from one or more standard authoritative published vocabularies, a local authority enforces preferences and usage pertinent for the local setting. See also authority file and derivation.

locator
In a bibliographic index, the part of an index entry that indicates the location of the book, page, or other resource. In an online index, it may be a hyperlink to the source.

logical data model
A data model that includes all entities and the relationships among them based on the structures identified in a conceptual data model, and that specifies all attributes for each entity. The data is described in as much detail as possible, without regard to how it will be implemented in a specific database. See also conceptual data model.

logical record
See record.

main term
See descriptor.

manual indexing
See indexing.

mapping
A set of correspondences between terms, fields, or element names used for translating data from one standard or vocabulary into another, or as a means of combining terms or data for search and retrieval. See also crosswalk.

markup language
A formal way of annotating a document or collection of digital data using embedded encoding tags to indicate the structure of the document or data file and the contents of its data elements. This markup also provides a computer with information about how to process and display marked-up documents. HTML, XML, and SGML are examples of standardized markup languages.

material culture
A term referring to art together with the broad realm of physical objects and edifices produced by a culture. See also cultural heritage.

metadata
A structured set of descriptive elements used to describe a definable entity. This data may include one or more pieces of information, which can exist as separate physical forms. In the context of art information, metadata includes data associated with information about the creation, physical characteristics, history, location, administration, or preservation of the work.

Metaphone
A phonetic algorithm for matching terms and names by sound, as pronounced in English, by translating words into a standard code or representation. It was developed by Lawrence Philips to address the perceived deficiencies in the Soundex algorithm. Metaphone and its later improvements are available as built-in operators in a number of systems. See also Soundex.

metasearching
See federated searching.

microcontrolled vocabulary
Also called a microthesaurus. A controlled vocabulary that is limited in the range of topics covered but fits within the domain of a larger, broader, or more generic controlled vocabulary. It typically contains highly specialized terms that are not necessarily in the broader controlled vocabulary but that map to the hierarchical structure of the broader controlled vocabulary. See also derivation, extension vocabulary, and satellite vocabulary.

middle name
In Western tradition, any name for a person placed before the last name (surname) but after the first name (e.g., Alan in Richard Alan Meier). See also first name and last name.

minimal description
In the context of cataloging art, a record containing the minimum amount of information in the minimum number of fields or metadata elements.

modeling
See derivation.

modifier
In a compound term or name, the adjectival component that modifies the noun (e.g., flying in flying buttresses; Mount in Mount Etna). See also focus.

monohierarchy
A hierarchy in which each child has only one immediate parent. Distinguished from a polyhierarchy.

monolingual
Expressed in a single language, as distinguished from multilingual. In a monolingual thesaurus, the terms and names are expressed in only one language.

movable work
In the context of cataloging art, any tangible object capable of being moved or conveyed from one place to another, as opposed to real estate or other buildings. Distinguished from built work.

multilingual
Expressed in more than one language, as distinguished from monolingual. In a multilingual thesaurus, terms and other information may be expressed in more than one language.

name authority
An authority containing proper names, most often personal names. See also subject heading list.

narrower term (NT)
Also called narrower context or child. A record to which another record or multiple records are superordinate in a hierarchy (e.g., Brewster chair is a narrower term to armchair). In thesauri, the relationship indicator for this type of term is NT. Variations on the notation include NTG (narrower term generic), NTP (narrower term partitive), NTI (narrower term instance), NT1 (narrower term level 1), NT2 (narrower term level 2), etc.

narrow results
To adjust criteria in a search in order to retrieve a smaller number of more precise results that better match the intention of the searcher. See also broaden results.

natural language
Spoken or written texts, as distinguished from fielded data and controlled vocabulary.

natural order form
In the context of a controlled vocabulary, the form of a multiple-word name or term, where the name or term appears in the form that would be used in speech or a written text (e.g., Christopher Wren or flying buttresses), rather than inverted (as may be appropriate for an index). See also inverted form.

navigation
In the context of search and retrieval, the facility that allows users to move through a controlled vocabulary or other content
object by using preestablished links or relationships.

near synonymy
Also called quasi-synonymy. The characteristic of a term with meaning that is regarded as different from another term, but both the terms are treated as equivalents for the purposes of broadening retrieval. See also synonym and true synonymy.

neologism
A term that has been newly invented, or an existing term to which a new meaning is applied, often arising in the professional literature of a discipline.

nickname
A familiar, affectionate, derogatory, or humorous name that is used to refer to a person, place, or corporate body as a replacement for, or in addition to, the real or official name (e.g., Masaccio, meaning “big Tom,” is a nickname for the painter Tommaso Guidi). (In the case of Masaccio, in the ULAN it is the preferred name based on literary warrant.) See also pseudonym.

NISO (National Information Standards Organization)
A nonprofit association that is accredited by the American National Standards Institute (ANSI) and identifies, develops, maintains, and publishes technical standards to manage information.

node
In the context of a thesaurus, any point or record in the hierarchy that is a location at which a branch or individual record (leaf) is attached; thus, the basic conceptual unit used to build hierarchies.

node label
A word or phrase inserted into a hierarchy to indicate the logical classification of the terms beneath it. See also facet indicator and guide term.

node linking
Also called leaf linking. In the context of combining multiple vocabularies, a method that uses various nodes in the hierarchical structure of a source controlled vocabulary to link to more detailed controlled vocabularies that are applicable to a single node of the parent hierarchy. The vocabulary linked to a broader vocabulary in this way is often called an extension vocabulary.

nondisplayed index
A machine-readable index that is not displayed for browsing or other direct access of end users, but is used behind the scenes to improve accuracy or speed in search and retrieval. Such indexes may be created beforehand or on the fly at the time of the query. See also displayed index.

nonequivalence
In mapping one vocabulary to another, the situation where there is no exact match, no term in the second language has partial or inexact equivalence, and there is no combination of descriptors in the second language that would approximate a match. See also exact equivalence and inexact equivalence.

nonleaf node
Also called an internal node. In a hierarchy, a node that links to one or more narrower contexts. See also leaf node.

nonpreferred parent
In a polyhierarchical thesaurus, any parent that is not flagged as preferred for use as a default in displays. See also preferred parent.

nonpreferred term
Also called a nonpreferred name. Any term in a vocabulary record that is not the preferred term, which is the term flagged as preferred for use as default in displays.

normalization
In the context of vocabulary retrieval, normalizing terms through a process of converting a term to its simplest form by removing case sensitivity, spaces, punctuation, and diacritics. It differs from database normalization, which is the process of reducing a complex data structure into its simplest structure, a technique used to eliminate data redundancy by
converting Unicode text into a standardized form, among other things.

notation
For a thesaurus, the alphabetic code used to express term types (D, AD, UF), associative relationship (RT), hierarchical relationships (BT, NT, BTG, NTG, BTP, NTP, BTI, NTI, BT1, BT2, NT1, NT2), and scope notes (SN), among others. See also classification notation.

object
See work.

object-oriented database
A data model where the universe is divided into a framework of classes, with each class containing instances or members (called “objects”). Classes can contain subclasses, members of which inherit the properties of the parent or “superclass.” Rules and algorithms for processing the data are integrated with the data.

online catalog
In the context of art information, a type of system used by end users to search for and view data and images.

ontology
A formal, machine-readable specification of a conceptual model, in which concepts, properties, relationships, functions, constraints, and axioms are all explicitly defined. While an ontology is not technically a controlled vocabulary, it uses one or more controlled vocabularies for a defined domain and expresses the vocabulary in a representative language that has a grammar for using vocabulary terms in an automated way to express something meaningful.

operating system
Also called an operating system program. A software program that runs a computer, as distinguished from an application program, which is designed to accomplish a task for an end user (e.g., word processing).

operational specificity
Also called postings specificity. An automated method that attempts to predict the specificity of terms in a domain based on the number of postings or links to that term in a content object (e.g., a term that is linked to very few content objects is predicted to be highly specific).

organizational warrant
Justification for the inclusion of a term in a vocabulary based on the specialized requirements or jargon of the group or organization that is creating or sponsoring the vocabulary. See also literary warrant and user warrant.

orphan term
In a thesaurus, a record that has no associative or hierarchical relationship to any other term in the thesaurus.

orthography
Correct or proper spelling and form of a word or words, including capitalization, diacritics, and punctuation, based on standard usage or convention.

paradigmatic relationship
Also called a semantic relationship. A relationship between terms or concepts that is permanent and based on a known definition.

parallel searching
See federated searching.

parent
See broader term (BT).

parenthetical qualifier
A qualifier placed in parentheses for display.

parent string
The display of hierarchical parents in a horizontal string, as distinguished from vertical indented displays or displays using notation.

parsing
In processing data, a process where data is broken or filtered into smaller, more distinct units.

partial equivalence
The relationship between terms in two vocabularies where one term has a broader
scope but is partially synonymous with the polyseme


other term. See also exact equivalence and inexact equivalence.

A word or lexical unit (e.g., a compound term) with multiple meanings; known as a homograph in written language and a homophone in spoken language.

partitive relationship
See whole/part relationship.

patronymic
Also called a patronym. A word or words used with a given name to identify a person; common in early Western personal names when last names were uncommon (e.g., Bartolo di Fredi means “Bartolo, son of Fredi”); may also refer to a surname derived from a paternal ancestor (e.g., Robinson means “son of Robin”).

permuted index
A type of index where individual words of a term are rotated to bring each word of the term into alphabetical order in the term list. See also inverted form.

phonetic matching
A process by which terms are matched to other terms that are presumed to sound like the original term, in an attempt to compensate for users’ misspellings or general variation in spelling of names or terms (e.g., Meier and Meyer are pronounced alike). Phonetic algorithms, such as Soundex and Metaphone, are used for indexing words by their pronunciation.

physical feature
In the context of geographic information, a characteristic of the earth’s surface that has been shaped by natural forces, including continents, mountains, forests, rivers, and oceans. See also administrative entity.

pick list
A user interface feature that allows the user to select from a preset list of terms and is typically used to control vocabulary for indexing or to provide options in a query. A pick list is generally populated with a controlled list.

polyhierarchy
A thesaurus in which any record may be linked to multiple parent records. See also hierarchy.

postable term
See descriptor.

postcoordination
The process of combining two or more terms at the time of retrieval rather than at the indexing stage; usually uses the Boolean operators AND, OR, or NOT (e.g., Baroque AND cathedral) in formulating a query. See also precoordination.

posting
In the context of indexing, any instance of a given indexing term having been assigned to records, documents, or other content objects. Formulas used for predicting the usefulness of terms or methods of retrieval may count the number of postings relative to the target content objects or use the numbers of postings in other statistics.

postings specificity
See operational specificity.

precision
A measure of a search system’s effectiveness in terms of retrieving only relevant results; expressed as the ratio of relevant records or documents retrieved from a database to the total number retrieved in response to the query. A high-precision search means that most of the results retrieved will be relevant; however, a high-precision search will not necessarily retrieve all relevant results. Recall and precision tend to be inversely related (when one goes up, the other usually goes down). See also recall.

precoordination
The formulation of a compound term or multiword heading at the time of indexing, rather than at the time of retrieval. An example of a precoordinated term is Baroque cathedrals; an example of a precoordinated heading is United States—History—Civil War, 1861–1865. See also postcoordination.
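
As an illustration of the precision entry above, the following minimal Python sketch computes precision together with its companion measure, recall (the share of all relevant records that the search retrieved). The function name, the record identifiers, and the two sample searches are hypothetical and chosen only for this example; they do not come from any particular retrieval system.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    retrieved: identifiers of the records the search returned.
    relevant:  identifiers of the records judged relevant to the query.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant records that were actually retrieved
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


relevant = {1, 2, 3, 4}

# A narrow search: everything returned is relevant, but half is missed.
print(precision_recall({1, 2}, relevant))       # (1.0, 0.5)

# A broad search: nothing relevant is missed, but half the results are noise.
print(precision_recall(range(1, 9), relevant))  # (0.5, 1.0)
```

The two sample searches show the usual trade-off: widening the search raises recall and lowers precision. That pattern is a tendency rather than a rule, since a better query can improve both at once.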
230 Introduction to Controlled Vocabularies

predefined report
A report for which the query and the output have been written and made available for repeated use by users; users may be allowed to enter variables that are plugged into the report. See also ad hoc query.

preferred flag
A designation indicating that a term or other data instance is preferred over others of the same type in a record. In addition to a preferred term for the record overall, there may be a preferred indexing name flag for the inverted order version of the term, a preferred display name for the natural order form of the name, a preferred role or preferred place type flagged among a list of roles or place types, and so on.

preferred parent
In a polyhierarchical thesaurus, the broader context that is chosen as conceptually preferred or chosen to serve as the default in hierarchical displays. See also nonpreferred parent.

preferred term
Also called a preferred name. The term designated among all synonyms or lexical variants for a concept to be used as the default term to represent the concept in displays and other situations. In a monolingual thesaurus, the preferred term is also the only descriptor in the record. In a multilingual thesaurus, there may be a descriptor for every language, but there is often only one preferred term for the record as a whole. See also descriptor.

preprocessing
Also called data preprocessing. Preliminary processing or transformation of data in order to facilitate further processing, parsing, etc.

probabilistic model
An automatic relevance and weighting method in which terms in a text or other content object are modeled as random variables so that term frequency and distribution are used to predict the probability of relevance. See also language model.

procedure
Also called a subprogram or subroutine. A relatively independent portion of computer code within a larger computer program that performs a specific task in a series of steps.

processing
Also called data processing or information processing. The manipulation or transformation of data through a series of operations. In batch processing, the operations are grouped together in batches and performed automatically; in interactive processing, the operations are prompted by input from a human programmer or user. See also computer program.

program
See computer program.

programming language
A formal language defined by syntactic and semantic rules and used to write instructions that can be translated into machine language and then executed by a computer (e.g., SQL, C++, C#, Java, Perl).

provisional term
See candidate term.

pseudonym
A false or fictitious name, especially one assumed by an artist, author, or other person to maintain anonymity or to designate an identity for a particular activity, among other reasons (e.g., Le Corbusier is a pseudonym assumed by the architect Charles Édouard Jeanneret). See also nickname.

punctuation
In the context of vocabulary terms, the marks from standard written communication used to clarify, organize, or indicate how a word or words should be read (e.g., hyphen, comma, period, quotation marks, parentheses).

qualifier
A word or phrase used to distinguish a term in a vocabulary from otherwise identical terms that have different meanings. A
qualifier is separated from the term, usually by parentheses. It is also called a gloss, although, strictly speaking, a qualifier should be used only with homographs, and a gloss has a more general meaning in the field of linguistics. See also homograph.

quasi-synonymy
See near synonymy.

query
Also called a search. In the context of retrieval, a command to look in a database and find records or other information that meet a specified set of criteria (e.g., select subject_id from term where normalized_term like 'A%' and historic_flag = 'H';). The most precise queries are those that return the fewest false hits.

query expansion (QE)
Reformulating a query in order to return a broader or more comprehensive set of results (e.g., adding synonyms to the user’s search term).

recall
A measure of a search system’s effectiveness in terms of retrieving all results that are possibly relevant, expressed as the ratio of the number of relevant records or documents retrieved to the total number of relevant records or documents. A high-recall search retrieves a comprehensive set of relevant results; however, it also increases the likelihood that marginally relevant content objects will be retrieved. Recall and precision tend to be inversely related. See also precision.

reciprocity
In reference to vocabulary records, the characteristic of a two-way relationship in which both entities have mutual dependence, action, or influence on each other. Semantic relationships in controlled vocabularies must be reciprocal, meaning each relationship from one record to another must also be represented by a reciprocal relationship in the other direction. Reciprocal relationships may be symmetric (e.g., RT/RT) or asymmetric (e.g., BT/NT).

record
Also called a logical record. In the context of this book, a conceptual arrangement of fields referring to a vocabulary concept or a work. This is different from a database record, which is one row in a database table or another set of related, contiguous data. See also concept record.

record group
See group.

related term (RT)
A concept that is associatively (not hierarchically) linked to another concept in a thesaurus. In thesauri, the relationship indicator for this type of term is RT. See also associative relationship.

relational table database
Also called a relational database. A database in which data is organized into columns and rows according to specific defined relationships (e.g., in a vocabulary database, a table of terms may be linked to a table for languages).

relationship
In the context of this book, a link between two types of data, records, files, or any two entities of the same or different types in a system or network. See also link.

relationship indicator
A word, code, or other device used in thesauri to identify the semantic relationship between terms (e.g., UF), other fields (e.g., SN), or records (e.g., BT).

relevance
The extent to which information retrieved in a search is judged by the user to meet the criteria of the query.

relevance ranking
Ranking and sorting of query results, typically estimated by an algorithm that calculates the number and weight of occurrences of the search term in the targeted data.

report
An organized set of data presented in a format suitable for viewing or printing,
typically produced by a preestablished query that may or may not have variables that are manipulated by the user.

repository
In the context of art and related disciplines, refers to an institution, agency, or individual that has physical or administrative responsibility for an art object, work of architecture, or cultural object.

required fields
Fields or data elements that are required to meet a standard or the requirements of a system’s operations. See also core fields.

reserved characters
Letters, numbers, or symbols that have special uses or meanings in a programming or querying language.

results list
The records or other data retrieved in response to a query and presented online or in a system in an organized display.

retrieval
In the context of this book, the activity of using a search or other method to find records or other data in a database. See also query.

romanization
Also called latinization. The conversion of a character or word expressed in a non-Roman alphabet or writing system (e.g., Cyrillic or Korean) into the Roman alphabet by means of transcription, transliteration, or a combination of the two methods.

root
Also called root node or top term. The highest level of the hierarchy, from which all branches descend.

rotated listing
See permuted index.

satellite vocabulary
A thesaurus that is created with the intention of, or is later adapted for, linking to another vocabulary that is larger, broader, or more generic; it may be integrated at many points in the original vocabulary. See also extension vocabulary, microcontrolled vocabulary, and node linking.

schema
Also called a scheme. In the context of this book, the organization, structure, and rules for a set of data (e.g., the set of tables, views, indexes, and descriptions for columns in a database, or the organization and description of an XML document).

scope note (SN)
A note explaining the coverage, specialized usage, and meaning of terms. In thesauri, the relationship indicator for this note is SN.

search
See query.

searching
Operations or algorithms intended to determine if one or more data items meet defined criteria or possess a specified property.

see also reference
A type of cross-reference, usually in a printed index, directing the reader to a related term or entry. A see also reference differs from a see reference in that the see also reference is not made between synonyms, but between terms or headings that are more peripherally related.

see reference
A type of cross-reference, usually in a printed index, directing the reader from a nonpreferred term or subject heading to the preferred term or subject heading for the same concept. The term or subject heading at the see reference is a synonym for the preferred term or heading.

semantic linking
A method of linking terms in a vocabulary or larger database according to the meaning of the terms and relationships between terms.

semantic relationship
See paradigmatic relationship.

SGML (Standard Generalized Markup Language)
International Organization for Standardization (ISO) standard ISO 8879:1986; a markup
language first used by the publishing industry, for defining, specifying, and creating digital documents that can be delivered, displayed, linked, and manipulated in a system-independent manner. XML and HTML are derived from SGML.

sibling
A concept that shares the same immediate broader context (one level higher) as other concepts. Siblings are subordinate to the same broader concept and are at the same hierarchical level.

single-to-multiple term equivalence
In the context of mapping terms from different vocabularies to each other, the situation that occurs when a term in one vocabulary has no direct match in the second vocabulary, but instead must be mapped to a combination of terms.

social tagging
The decentralized practice and method by which individuals and groups create, manage, and share tags (terms, names, etc.) to annotate and categorize digital resources in an online “social” environment. See also folksonomy.

software
The components of a computer system that are not physical, including programs, procedures, algorithms, and documentation pertaining to the operation of a system and the performance of specific tasks, such as word processing, Web browsers, photo editing, and art cataloging or vocabulary editing. See also hardware.

sorting
In the context of this book, the automated process of organizing a results list, data elements in a record, or other data in a particular sequence based on established criteria or attributes of the data (for example, alphabetically, by parent string, or by an associated date). There may be primary sort criteria and secondary sort criteria (e.g., an algorithm can be formulated to first sort place names in a results list alphabetically, and then, for homographs in the list, to sort by the parent string). See also filing rules.

Soundex
A phonetic algorithm for matching terms and names by sound, as pronounced in English, by translating words into a standard code or representation. It was developed by Robert Russell and Margaret Odell and patented in 1918 and 1922. The National Archives and Records Administration (NARA) maintains the current rule set for the official implementation of Soundex used by the U.S. Government. See also Metaphone.

source
In the context of building vocabularies, a citable reference to a term in the literature that helps establish its form, spelling, usage, and meaning. See also literary warrant.

source authority
In the context of this book, a bibliographic authority file used to control the citations providing warrant for terms in a vocabulary or information in a work record.

source language
In the context of translating or mapping one vocabulary to a vocabulary in another language, the language of the original vocabulary. See also target language.

specialized vocabulary
See microcontrolled vocabulary.

specifications
In the context of designing an information system, the formal, detailed description of user and technical requirements, including specific descriptions of procedures, functions, screens, reports, materials, other features, and hardware. See also user requirements.

specificity
In the context of indexing, the degree of precision or granularity used in assigning terms. Measures of greater specificity include the use of the narrowest applicable indexing term rather than a broader, more generic term. See also exhaustivity.
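
To make the Soundex entry above more concrete, here is a minimal Python sketch of the commonly published American Soundex rules: keep the first letter, code the remaining consonants as digits, collapse adjacent letters that share a code, and pad or cut the result to four characters. The function name and the sample names are chosen for this illustration, and the sketch has not been checked against the NARA rule set, so it should be read as an approximation rather than the official implementation.

```python
# Letter-to-digit table of the commonly published American Soundex rules.
CODES = {}
for digit, letters in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                       "4": "l", "5": "mn", "6": "r"}.items():
    for letter in letters:
        CODES[letter] = digit


def soundex(name):
    """Return a four-character Soundex code, or "" if name has no letters."""
    letters = [ch for ch in name.lower() if ch.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    previous = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w are skipped and do not separate equal codes
        digit = CODES.get(ch, "")  # vowels and y have no code
        if digit and digit != previous:
            code += digit
        previous = digit  # a vowel resets this, so a repeated code counts again
    return (code + "000")[:4]


print(soundex("Meier"), soundex("Meyer"))    # M600 M600
print(soundex("Robert"), soundex("Rupert"))  # R163 R163
```

Names that receive the same code, such as Meier and Meyer, would be treated as phonetic matches for one another.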

SQL (Structured Query Language)
A standard command language used with relational databases to perform queries and other tasks.

standard
A vocabulary, set of rules, code of practice, or description of characteristics and parameters that is documented, established by experts, or approved by an authoritative body and widely recognized or employed as an authoritative exemplar of correctness or best practice; used within a discipline or domain in order to promote interoperability and efficiency.

statistical specificity
See operational specificity.

stemming
In the context of mapping terms for search and retrieval, the alteration of a term by automatically truncating or removing common suffixes, word endings, or prefixes in order to find a match, usually applied to sets of related words that are derived from a common root and appear in a variety of grammatical forms (e.g., paint, painting, painted).

stop list
In the context of search and retrieval, words in a vocabulary or target data that are ignored in searching or matching because they occur too frequently or are otherwise of little value in retrieval for a given domain. Common stop lists for a text contain articles, conjunctions, and prepositions, although these words are typically not included in a stop list for a vocabulary.

string syntax
Also called string indexing. The creation of headings by computer algorithm, characterized by headings that are more consistent than the typically idiosyncratic headings created by hand (e.g., the automatic concatenation of a parent string in a heading for a geographic place, such as San Gimignano (Siena province, Tuscany, Italy)).

structure
See data structure.

subfacet
A major conceptual division of a thesaurus that is located near the top of the tree but under a facet. Also called a hierarchy in the AAT, although hierarchy has a more general meaning as well.

subject
In the context of this book, the focus concept of a vocabulary record (e.g., the subject of a ULAN record is a person). Also used to refer to the subject matter (often iconographical content) of what is depicted in or by a work of art or the content of a text.

subject heading list
An alphabetical list of words or phrases used to indicate the content of a text or other thing; characterized by precoordination of terminology, meaning that several unique concepts are combined in a string (e.g., Archaeology and art—China—History—20th century). A type of controlled vocabulary. See also authority heading and heading.

subject indexing
A term typically used in the context of bibliographic cataloging but also applicable to cataloging art; refers to the application of indexing terms to the content of the document, as contrasted to a description of its physical characteristics.

subprogram
See procedure.

subroutine
See procedure.

surface Web
See visible Web.

surname
See last name.

switching
In the context of mapping one vocabulary to another, refers to the use of a third vocabulary (a switching vocabulary) that itself can link to terms in each of the two original controlled vocabularies; useful when the original two vocabularies do not map well
directly to each other. See also direct mapping.

symmetric relationship
In the context of a thesaurus, a reciprocal relationship that is the same in both directions (e.g., RT/RT). See also asymmetric relationship and reciprocity.

syndetic structure
Also called cross-reference links. In the context of a vocabulary, refers to the linking of equivalent, broader, narrower, and other related terms so that they can be used as cross-references to each other and to related headings for the purpose of access.

synonym
A term having a different form but exactly or very nearly the same meaning as another term. See also near synonymy and true synonymy. Compare lexical variant.

synonym ring list
A type of controlled vocabulary containing terms that are considered equivalent for the purposes of retrieval but do not necessarily have true synonymy.

synonymy
A type of semantic relation in which two words or terms have the same or very similar meaning. See also near synonymy and true synonymy.

syntax
In the context of this book, the structure of elements in a compound term or name (e.g., last name first, comma, first name, middle initial) or heading; also used to refer to the structure of elements in a search query (e.g., rules for the placement of the Boolean operators OR, AND, or NOT between terms); and analogous to the linguistic structure of elements in a sentence.

synthesis note
A brief preliminary finding, example, or recommendation. This expression was used in the original print publication of the AAT to refer to bottom-of-page notes throughout each subfacet (or hierarchy) that suggested ways in which descriptors from that subfacet could be combined in postcoordination with other descriptors (these recommendations are now found in the AAT Editorial Manual).

system
Also called a computer system. A number of interrelated hardware and software components that work together to store and convert data into information by using electronic processing. In the context of this book, a system for building and maintaining vocabularies, cataloging art, or performing search and retrieval. See also database.

systematic display
See hierarchical display.

table
See data table.

target language
In the context of translating or mapping one vocabulary to a vocabulary in another language, the language into which the original vocabulary is being translated. See also source language.

taxonomy
A classification organized into a hierarchical structure and applicable to a defined domain. Often used to refer to the classification of living organisms according to physical characteristics, but the term and principles can be applied to classification in any discipline. Unlike thesauri, taxonomies do not typically include synonyms and associative relationships. See also folksonomy.

term
A word or group of words representing a single concept; a vocabulary record comprises terms and other information, including relationships, scope notes, sources, etc. Additionally, in the jargon of thesaurus construction, the word term is often used as shorthand to refer to the concept that is represented by that term (e.g., BT and NT actually refer to the relationships between concepts). The distinction between a term in the strict sense and term meaning a record must often be inferred from the context of the discussion.
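
The stemming entry earlier on this page can be illustrated with a deliberately naive Python sketch that strips a few common English suffixes so that paint, painting, and painted all reduce to the same stem. The suffix list, the minimum stem length, and the function name are assumptions made for this example; this is not the Porter stemmer or any other production algorithm, and it will mis-stem many real words.

```python
# Suffixes are tried in this order; longer ones first so "ing" wins over "s".
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(word, min_stem=3):
    """Strip one common suffix, keeping at least min_stem characters."""
    lowered = word.lower()
    for suffix in SUFFIXES:
        if lowered.endswith(suffix) and len(lowered) - len(suffix) >= min_stem:
            return lowered[: -len(suffix)]
    return lowered


for form in ("paint", "painting", "painted", "paints"):
    print(form, "->", naive_stem(form))  # every form maps to "paint"
```

In a retrieval setting, both the indexed terms and the user's search term would be passed through the same function before matching, so that any grammatical form of the word finds the others.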

term frequency (TF)
An automatic ranking method often used in a formula with inverse document frequency in information retrieval and text mining to measure how important a term is to a set of data and how useful it will be in retrieval.

term record
In the jargon of thesaurus construction, the collection of information associated with a descriptor, including the history of the term, its relationships to other terms and records, etc. In this book, it is referred to as a record (or a concept record) in order to distinguish it from the information that is actually associated only with the term table in a relational database model (e.g., language of the term, contributor of the term).

text
In the context of this book, data that is not vocabulary controlled and generally unstructured beyond the common structure of standard language expressions of characters, words, sentences, or paragraphs. See also free-text field.

thesaurus
A controlled vocabulary arranged in a specific order and characterized by three relationships: equivalence, hierarchical, and associative. Thesauri may be monolingual or multilingual. Their purposes are to promote consistency in the indexing of content and to facilitate searching and browsing.

top term (TT)
See root. In thesauri, the relationship indicator for this type of term is TT.

transcription
In the context of cataloging art, the process of recording a term or text word-for-word and letter-for-letter, including accurately copying capitalization, punctuation, spacing, line breaks, illegible passages, and all other possible aspects of the original (e.g., to accurately express the nuances of an artist’s signature or an ancient architectural inscription). Transcriptions in this context are typically semidiplomatic or seminormalized transcriptions, meaning both substantive and accidental features of the original are retained, but abbreviations are spelled out using brackets or other punctuation to distinguish the original from the editorial content.

translation
The process of changing a term or text from one language into another by interpreting the meaning of the original (source) term and expressing it as an equivalent term in the second (target) language (e.g., copper mines in English is translated as mines de cuivre in French).

transliteration
The process of rendering the letters or characters of one alphabet or writing system into the corresponding letters or characters of another alphabet or writing system, generally based on phonetic equivalencies. While a common noun will often be translated, a proper name in a non-Roman alphabet is more often transliterated. There are often multiple standards for transliterating from one writing system to another, thus producing multiple variant names.

tree structure
A controlled vocabulary display format in which the complete hierarchy of records is shown or accessible by clicking. The tree structure may be constructed by assigning a tree number or line number to each record, or by another method. See also hierarchical display.

true synonymy
The characteristic of terms or names that have meanings that are identical or as nearly identical as is possible with language. The purpose of enforcing true synonymy in a vocabulary is to increase precision in indexing and retrieval. See also near synonymy and synonym.

truncation
In searching and matching, the action of cutting off characters in a search term in order to find all terms with a certain common string of characters; typically involves the user employing a wildcard
symbol to search for a string of characters no matter what other characters follow (or sometimes precede) that string (e.g., searching for arch* will retrieve arch, arches, architrave, architecture, architectural history, etc.).

trunk name
See focus.

typography
The font style and size, and arrangement, appearance, and layout of words and texts on a page; in the context of this book, one of the critical elements in designing an end-user display of vocabulary records.

Unicode
A character encoding standard for representing letters, characters, and diacritical marks in most of the world’s modern scripts. It was originally designed as a 16-bit scheme and has since been extended beyond 16 bits.

unique identifier
A number or other string that is associated with a record or piece of data, exists only once in a database, and is used to uniquely identify and disambiguate that record or piece of data from all others in the database.

upload
See load.

up-posting
Also known as autoposting. The automatic generation of search terms or indexing terms by adding broader terms to the specific term requested by a searcher or used by the indexer. See also generic posting.

used for term
Also called a UF. In thesaurus jargon, a term that is not a descriptor and not an alternate descriptor. If the thesaurus is being used as an authority, a used for term is not authorized for indexing. Used for terms typically comprise spelling or grammatical variants of the descriptor or have true synonymy with the descriptor.

user
See end user.

user interface (UI)
The portion of the design and functionality of a cataloging, editorial, search and retrieval, or other system or Web site with which end users interact, including the arrangement of displays, menus, clickable text or images, pagination, etc. A user interface that is easy for users to utilize is called user friendly.

user requirements
In system design, the initial formal explanation of functionalities, displays, and reports expressed from the point of view of the users’ needs and expectations. See also specifications.

user warrant
Justification for a term in a controlled vocabulary based on the frequency of user queries that employ the term. User warrant may be used for terms intended for retrieval but is typically not sufficient warrant for posting a term in a thesaurus used for indexing. See also literary warrant and organizational warrant.

variable
In a query, criteria or factors that may be changed to produce different results (e.g., as may be expressed in a where clause, as the relationship type code in this query: select distinct subjecta_id from associative_rels where rel_type_code = '2110';). See also criteria.

variant term
In a vocabulary, a term that is not the preferred term but refers to the same concept, including used for terms and alternate descriptors.

vector-space model
A method of automatic weighting in retrieval where an algebraic model is used for term frequency and distribution, creating representative vectors in multidimensional space; when compared to the vectors of an incoming query, the relevance of results may be predicted.

verbal units (VU)
In linguistics and computer science, the phonemic, morphemic, or grammatical
clauses or units of language or texts, corresponding in part to syllables, letters, or words.

visible Web
The subset of the World Wide Web that is visible to Web browsers and can be indexed by search engines’ Web crawlers or robots, in contrast to pages that are impenetrable by search engines or to data that is generated dynamically.

visual arts
See art.

vocabulary
See controlled vocabulary.

vocabulary control
The process of enforcing the use of certain terminology with the goal of providing consistency and improving retrieval.

warrant
In the context of vocabularies, sources that provide justification for the spelling and usage of a term to refer to a particular concept, including warrant of publications, common usage by experts of a discipline, or other sources.

Web browser
A software application that enables users to view and interact with information and media files on the Web (e.g., Internet Explorer, Mozilla Firefox, and Safari).

Web site
A collection of related electronic pages (Web pages), generally formatted in HTML and found at a single address where the server computer is identified by a given host name.

weighted term ranking
See best match.

whole/part relationship
Also called a partitive relationship. A hierarchical relationship between a larger entity and a part or component. In the context of cataloging art, it typically refers to a relationship between two work records or two records in a thesaurus (e.g., Florence is part of Tuscany). See also genus/species relationship and instance relationship.

wildcard
Also called a wildcard character or wildcard symbol. In searching, a character or symbol, such as an asterisk or percent sign, that is used to represent any other character or characters in a Boolean query or other string (e.g., the asterisk in Buonar*).

word sense disambiguation (WSD)
In automatic search and retrieval, the problem of determining in which sense a homograph is intended in a given data set or text. See also disambiguation.

work
In the context of this book, a creative product, including architecture; artworks such as paintings, drawings, graphic arts, sculpture, decorative arts, and photographs that are considered to be art; and other cultural artifacts. A work may be a single item or may be made up of many physical parts.

XML (Extensible Markup Language)
A simple, flexible markup language derived from SGML. Originally designed for large-scale electronic publishing, but now playing an increasingly important role in the publication and exchange of a wide variety of data on the Web.
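
The truncation and wildcard entries above can be sketched in a few lines of Python using the standard library's fnmatch module, in which an asterisk stands for any run of characters, including none. The term list and the function name are hypothetical and exist only for this illustration; a real vocabulary system would more likely run such a search inside its database.

```python
from fnmatch import fnmatchcase

# A tiny, made-up term list standing in for a vocabulary.
TERMS = ["arch", "arches", "architrave", "architecture",
         "architectural history", "monarch", "Buonarroti"]


def truncation_search(pattern, terms):
    """Return the terms that match pattern, ignoring upper and lower case."""
    return [term for term in terms
            if fnmatchcase(term.lower(), pattern.lower())]


print(truncation_search("arch*", TERMS))    # right truncation
print(truncation_search("*arch", TERMS))    # left truncation
print(truncation_search("Buonar*", TERMS))  # the example from the wildcard entry
```

With this list, arch* retrieves the five terms beginning with arch, while *arch retrieves arch and monarch. Note that fnmatch also gives special meaning to the question mark and to square brackets, so a pattern containing those characters would need extra handling.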
