
Metadata: An Introduction
Lecture 4
Metadata
• The term "meta" comes from a Greek word that denotes something of a higher or more fundamental nature. Metadata, then, is data about other data.
• The term refers to any data used to aid the identification, description and location of networked electronic resources.
• Metadata summarizes basic information about data, which can make finding and working with particular instances of data easier.
• For example, author, date created, date modified and file size are very basic document metadata.
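As a rough illustration of basic document metadata, the sketch below reads the size and modification date that the file system keeps for a file. The function name `basic_file_metadata` and the throwaway sample file are assumptions made for this example. Author is left out because it is usually recorded inside the document format rather than by the file system, and creation date is omitted because how it is reported varies by platform.

```python
import os
import tempfile
from datetime import datetime, timezone


def basic_file_metadata(path):
    """Return a few basic metadata fields the file system keeps for a file."""
    info = os.stat(path)
    return {
        "name": os.path.basename(path),
        "size_bytes": info.st_size,
        "date_modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
    }


# Create a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write("hello metadata")
    sample_path = handle.name

sample_metadata = basic_file_metadata(sample_path)
print(sample_metadata["size_bytes"])  # 14, the number of characters written above
os.remove(sample_path)
```

Note that the dictionary describes the file without containing any of its text: that is the distinction between metadata and data.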
The Purposes Metadata Serves

Metadata is also an effective means of organizing electronic resources, which is an important use given the growth in Web-based resources.

Using metadata to describe resources enables them to be understood by humans as well as machines.

The main difference between data and metadata:

Data: a description of something, such as readings, measurements, observations, reports, or anything else.

Metadata: describes the relevant information about that data, and is considered processed data.
Where is metadata stored?

Metadata can be stored in a variety of places.

• Often in tables and fields within the database.
• In a data dictionary or metadata repository.
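One way to see metadata held inside a database is to ask the database to describe its own tables. The sketch below uses SQLite from the Python standard library; the `documents` table and its columns are hypothetical and exist only for this example.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT, author TEXT)"
)

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = connection.execute("PRAGMA table_info(documents)").fetchall()
column_types = {row[1]: row[2] for row in columns}
print(column_types)

# sqlite_master is SQLite's built-in catalog table listing schema objects.
table_names = [
    row[0]
    for row in connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    )
]
print(table_names)
connection.close()
```

Other database systems expose similar catalog information through their own mechanisms; the SQLite calls here are just one concrete instance.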
Note 1: In the computer field, metadata makes searching for and storing information easier.

Note 2: A file has metadata from the moment it is created, but that metadata can be manipulated or deleted with specific software.
Metadata serves as
data description
data browsing
data transfer.

Note: Metadata can also be used in planning for a new system.

How?
Phone Metadata
• Call metadata is a general term for different types of data collected about phone calls, including:
• Time
• Duration
• Caller's phone number
• Call recipient's phone number
• Call metadata gives marketers and company executives helpful information.
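The four fields listed above can be sketched as a small record type. The class name `CallRecord`, the helper `total_talk_time`, and the sample numbers are all hypothetical; the point is only that useful summaries can be computed from call metadata without any access to what was said on the call.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class CallRecord:
    """Metadata about one phone call; the conversation itself is not stored."""
    started_at: datetime
    duration: timedelta
    caller_number: str
    recipient_number: str


def total_talk_time(records, caller_number):
    """Sum the duration of all calls placed by one caller."""
    durations = (r.duration for r in records if r.caller_number == caller_number)
    return sum(durations, timedelta())


# Made-up sample records for illustration only.
calls = [
    CallRecord(datetime(2024, 1, 5, 9, 30), timedelta(minutes=4), "555-0100", "555-0199"),
    CallRecord(datetime(2024, 1, 5, 14, 0), timedelta(minutes=11), "555-0100", "555-0142"),
    CallRecord(datetime(2024, 1, 6, 8, 15), timedelta(minutes=2), "555-0142", "555-0100"),
]

print(total_talk_time(calls, "555-0100"))  # 0:15:00
```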


Website Metadata

• Throughout the Web, metadata is used to describe individual pages on a website, allowing search engines to understand what each page represents.
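Page-level metadata is commonly carried in HTML `<meta>` tags. The sketch below collects them with the standard-library `html.parser` module; the class name `MetaTagCollector` and the sample page are assumptions for this example, and real search engines do far more than this.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collect name/content pairs from <meta> tags in an HTML page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            self.meta[name.lower()] = content


# A made-up page used only for illustration.
SAMPLE_PAGE = """
<html><head>
<title>Lecture 4</title>
<meta name="description" content="An introduction to metadata">
<meta name="keywords" content="metadata, controlled vocabulary">
</head><body></body></html>
"""

collector = MetaTagCollector()
collector.feed(SAMPLE_PAGE)
print(collector.meta["description"])
```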
Metadata Answers…
Who created the data?
Who maintains them?
When were the data collected?
When were they published?
Where were they collected (geographic location)?
What is the content of the data?
Why were the data created?
How were they produced/analyzed?
Critical Roles of Metadata
• Data Discovery
   • To be able to identify important data sets
• Data Retrieval
   • To know how and where to access data
• Data Use
   • To know enough details about how the data were collected and stored
• Data Archiving
   • Data can grow more valuable with time, but only if the critical information required to retrieve and interpret the data remains available
Developing Metadata
• Identify the purpose of the metadata model
• Level of specificity of the elements
• Identify resources
• Infrastructure - who will supply it?
• What type of information package is it?
• Who will use the metadata?
• Existing metadata models
Why bother?
• To improve retrieval, i.e., to get an optimum balance of precision and recall
• Precision – How many of the retrieved records are relevant?
• Recall – How many of the relevant records did you retrieve?
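The two questions above translate directly into ratios: precision is the share of retrieved records that are relevant, and recall is the share of relevant records that were retrieved. A minimal sketch, with a hypothetical function name and made-up record IDs:

```python
def precision_and_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved record IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # records that are both retrieved and relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Made-up record IDs for illustration.
retrieved_ids = {1, 2, 3, 4}
relevant_ids = {2, 4, 5, 6, 7}

precision, recall = precision_and_recall(retrieved_ids, relevant_ids)
print(precision, recall)  # 0.5 0.4
```

Here two of the four retrieved records are relevant (precision 0.5), but only two of the five relevant records were found (recall 0.4).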
Types of Controlled Vocabularies

• Lists
• Synonym Rings
• Taxonomy
• Thesaurus
• Ontology

Lists
A list is a simple group of terms
Example:
Alabama
Alaska
Arkansas
California
Colorado

Frequently used in Web site pick lists and pull-down menus

Synonym Rings

• Synonym rings are used to expand queries for content objects
• A synonym ring is a cluster of terms treated as equivalent: if a user enters any one of the terms as a query to the system, all items are retrieved that contain any of the terms in the cluster
• Synonym rings are often used in systems where the underlying content objects are left in their unstructured natural language format
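Query expansion with a synonym ring can be sketched as follows. The ring contents, the function names, and the sample documents are all made up for illustration, and matching is done on whole lowercase words only.

```python
# Each ring is a set of terms treated as equivalent for retrieval.
SYNONYM_RINGS = [
    {"car", "automobile", "motorcar"},
]


def expand_query(term):
    """Return every term in the same ring as the query term."""
    term = term.lower()
    for ring in SYNONYM_RINGS:
        if term in ring:
            return set(ring)
    return {term}


def search(term, documents):
    """Return documents containing any term from the expanded query."""
    terms = expand_query(term)
    return [
        doc for doc in documents
        if any(word in terms for word in doc.lower().split())
    ]


DOCUMENTS = [
    "Automobile sales rose last year",
    "The motorcar was a luxury item",
    "Bicycle lanes were extended",
]

results = search("car", DOCUMENTS)
print(results)
```

Neither matching document contains the word "car" itself; they are found only because the ring links the three terms.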

Taxonomies

A taxonomy is a set of preferred terms, all connected by a hierarchy or polyhierarchy
Example:
Chemistry
   Organic chemistry
      Polymer chemistry
Frequently used in web navigation systems
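A simple hierarchy can be stored as a mapping from each term to its parent, which is enough to build the breadcrumb trails used in web navigation. This sketch assumes the three example terms are nested in the order listed, and it allows only one parent per term; a polyhierarchy would need a list of parents instead. The names `PARENT` and `breadcrumb` are hypothetical.

```python
# Each term maps to its single parent; the top term has no parent.
PARENT = {
    "Chemistry": None,
    "Organic chemistry": "Chemistry",
    "Polymer chemistry": "Organic chemistry",
}


def breadcrumb(term):
    """Return the path from the top of the hierarchy down to the term."""
    path = []
    while term is not None:
        path.append(term)
        term = PARENT[term]
    return list(reversed(path))


trail = " > ".join(breadcrumb("Polymer chemistry"))
print(trail)  # Chemistry > Organic chemistry > Polymer chemistry
```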

Thesauri

A thesaurus is a controlled vocabulary with multiple types of relationships
Example:
Rice
   BT Cereals
   BT Plant products
(BT = broader term)
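Thesaurus relationships are reciprocal: every broader-term (BT) link implies a narrower-term (NT) link in the other direction. The sketch below reads the example as Rice having two broader terms, which is an assumption about the slide's layout, and derives the NT links from the BT links. The names `BROADER` and `narrower_terms` are hypothetical.

```python
from collections import defaultdict

# BT links taken from the example, read as two broader terms for "Rice".
BROADER = {
    "Rice": ["Cereals", "Plant products"],
}


def narrower_terms(broader):
    """Invert BT links to obtain the reciprocal NT (narrower term) links."""
    narrower = defaultdict(list)
    for term, parents in broader.items():
        for parent in parents:
            narrower[parent].append(term)
    return dict(narrower)


NARROWER = narrower_terms(BROADER)
print(NARROWER)  # {'Cereals': ['Rice'], 'Plant products': ['Rice']}
```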

Ontology

“An arrangement of concepts and relations based on an underlying model of reality.”
• Ex.: Organs, symptoms, and diseases in medicine

• No real agreement on definition; every community uses the term in a slightly different way.

How is metadata created?
1. By software tools
   • Indexing robots, web crawlers
   • From resource content, from server info
2. By human agents
   • Description by resource creator/owner
   • Description by third-party services
• Creating (and maintaining) good-quality metadata is not cheap

Steps to Create Quality Metadata
1. Organize your information
2. Write your metadata using a metadata tool
3. Review for accuracy and completeness
4. Have someone else read your record
5. Revise the record, based on comments from your reviewer
6. Review once more before you publish
How is metadata shared?

Sharing metadata depends on agreement on conventions:
• Consensus on syntax
   • use of XML
• Consensus on semantics of terms
   • meaning of elements/attributes (uniquely named through XML namespaces)
• Consensus on meaning of structure
   • use of a community-standard XML DTD/Schema
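To make the role of XML namespaces concrete, the sketch below parses a small record whose elements are named through the Dublin Core element namespace. Dublin Core is not mentioned in the lecture and is used here only as an illustrative community standard; the `<record>` wrapper element and the field values are made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace URI of the Dublin Core element set (illustrative choice).
DC_NS = "http://purl.org/dc/elements/1.1/"

# A made-up record; only the dc: element names come from a shared standard.
RECORD = """<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Metadata: An Introduction</dc:title>
  <dc:creator>Example Lecturer</dc:creator>
  <dc:date>2024-01-05</dc:date>
</record>"""

root = ET.fromstring(RECORD)
namespaces = {"dc": DC_NS}

# The prefix is local shorthand; the namespace URI is what identifies the term.
title = root.find("dc:title", namespaces).text
creator = root.findtext("dc:creator", namespaces=namespaces)
print(title, "|", creator)
```

Because the namespace URI, not the `dc` prefix, identifies each element, two systems can agree on what "title" means even if they choose different prefixes.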

Metadata Value for Data Developers
• Avoid data duplication
• Share reliable information
• Publicize efforts – promote the work of a scientist and his/her contributions to a field of study
• Metadata reuse saves time and resources in the long run



Metadata Value for Data Users

• Search, retrieve, and evaluate dataset information from both inside and outside an organization
• Find data: Determine what data exists for a geographic location and/or topic
• Determine applicability: Decide if a dataset meets a particular need
• Discover how to acquire the dataset identified; process and use the dataset
• Understand the dataset, including definitions of column names, or expected numerical ranges found in the data
Metadata Value for Organizations
• Metadata helps ensure an organization’s investment in data:
   • Documentation of data processing steps, quality control, definitions, data uses, and restrictions
   • Ability to use data after initial intended purpose
   • Allows organization to track data use and facilitates publication
• Transcends people and time:
   • Offers data permanence
   • Creates institutional memory
• Advertises an organization’s research:
   • Creates possible new partnerships and collaborations through data sharing
What Makes a Good Metadata Record?

• Identification Information

• Entities & Attributes

• Data Quality

• Access, Use & Liability Constraints

• Distribution

• Spatial References
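The six sections above can serve as a simple completeness checklist. In the sketch below, the dictionary keys are hypothetical names chosen to mirror those sections, and the draft record is made up; an actual metadata standard defines its own element names and required fields.

```python
# Hypothetical keys mirroring the six sections listed above.
REQUIRED_SECTIONS = [
    "identification_information",
    "entities_and_attributes",
    "data_quality",
    "constraints",
    "distribution",
    "spatial_references",
]


def missing_sections(record):
    """Return the required sections that are absent or empty in a record."""
    return [section for section in REQUIRED_SECTIONS if not record.get(section)]


# A made-up draft record with only two sections filled in.
draft = {
    "identification_information": {"title": "Sample dataset"},
    "data_quality": {"notes": "Values checked against field sheets"},
}

gaps = missing_sections(draft)
print(gaps)
```

A check like this only confirms that each section is present; judging whether the content is accurate still needs a human reviewer.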
Metadata Facts to Remember
• Metadata relates to more than the description of an object.
• Metadata can come from a variety of sources.
• Metadata continues to accrue during the life of an information object or system.
• One information object's metadata can simultaneously be another information object's data.
Summary
• A metadata record captures critical information about the content of
a dataset

• Metadata allows data to be discovered, accessed, and re-used

• Standards and tools vary – select according to defined criteria such as data type, organizational guidance, and available resources

• Metadata is of critical importance to data developers, data users, and organizations

• Metadata completes a dataset.


Thank you
For attending the class
