12/11/21, 7:06 PM EDIT Literature - Web Archiving - Library Guides at University of Washington Libraries

Web Archiving: Literature

This guide describes what web archiving is, the common tools used, and related resources.
Last Updated: Nov 9, 2021 11:48 AM
Type/Group: General Purpose/UW Libraries

URL: https://guides.lib.uw.edu/c.php?g=1190997

Subjects: [none]
Tags: web archiving

Additional Resources

Below you can find further literature on the expansive topics of web archiving and metadata, and on ongoing projects documenting new processes.

Literature

Web Archiving

Boss, K., & Broussard, M. (2017). Challenges of archiving and preserving born-digital news applications. IFLA Journal, 43(2), 150-157.
Ess, Charles M, Dutton, William H, & Brügger, Niels. (2013). Web historiography and Internet Studies: Challenges and perspectives. New Media & Society, 15(5), 752-76.
International Internet Preservation Consortium. "The WARC Format 1.0"

Web Archiving Metadata

Dooley, J. (5 April 2017). "Best Practices for Web Archiving Metadata: Watch This Space!" OCLC.
OCLC. (2018). Descriptive Metadata for Web Archiving

Web Crawling

Castillo, C. (2004). Effective Web Crawling. Dept. of Computer Science, University of Chile.
Olston, C. and Najork, M. (2010). "Web crawling." Foundations and Trends in Information Retrieval, Vol 4, No. 3. p. 175-246. DOI: 10.1561/1500000017
NT, B. (18 September 2018). "Top 50 open source web crawlers for data mining." Big Data Made Simple.

Selected Blogs on Web Archiving

Stanford Library Web Archiving Blog
Archive-It Blog

Find E-Journals

Search for E-Journal Titles
Search for electronic access to publications. Can't find it? Use UW Libraries Search to see if we have it in print or request it through Interlibrary Loan.

Glossary

Archive:

The process of capturing and storing digital information in a repository for storage, preservation, and access.

Crawl:

A web archiving operation conducted by an automated process called a crawler. Based on the seed URLs that you input, crawls identify materials on the webpage that belong in your collections.

Repository:

The physical storage location and medium for one or more digital archives. A repository may contain an active copy of an archive (i.e. one that is accessed by end users) or a mirror copy of an archive for disaster recovery.

Scope:

What the crawler will capture and what it won't. Scoping refers to options for telling the crawler how much or how little of a seed URL to capture. Archive-It options include seed-level and collection-level scoping.
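
To make the Scope entry concrete, here is a minimal Python sketch of one possible scoping rule: a URL counts as in scope only if it is on the same host as the seed and sits under the seed's path. The function name `in_scope` and the `example.org` URLs are hypothetical, and this is not how Archive-It itself implements scoping; it only illustrates the idea of telling a crawler how much of a seed URL to capture.

```python
from urllib.parse import urlsplit


def in_scope(url: str, seed_url: str) -> bool:
    """Return True if url is on the seed's host and at or below the seed's path.

    Hypothetical rule for illustration only; real crawlers offer richer options.
    """
    seed = urlsplit(seed_url)
    candidate = urlsplit(url)

    # A different host (including a different sub-domain) is out of scope.
    if candidate.hostname != seed.hostname:
        return False

    # Compare on a path-segment boundary so "/news" does not match "/newsletter".
    seed_path = seed.path if seed.path.endswith("/") else seed.path + "/"
    candidate_path = candidate.path or "/"
    return candidate_path == seed.path or candidate_path.startswith(seed_path)
```

With the seed `https://example.org/news/`, the page `https://example.org/news/2021/story.html` would be captured, while `https://example.org/about/` and anything on `other.example.org` would be skipped.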

Seed:

An item in Archive-It with a unique ID number. The Seed URL tells the crawler where to go on the live web and acts as an access point to archived content.

Seed URL:

The starting point URL for a crawler and access point to archived collections.
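
The role of a seed URL as a crawler's starting point can be sketched in a few lines of Python. The example below is an assumption-laden toy: it walks an in-memory link map (`LINKS`, invented for illustration) rather than fetching live pages, and the `crawl` function and its `max_hops` parameter are hypothetical names, not part of any real crawler.

```python
from collections import deque


def crawl(seed_urls, links, max_hops=1):
    """Breadth-first walk from the seeds; returns URLs in the order visited.

    links maps each URL to the URLs it links to (a stand-in for fetching pages).
    max_hops limits how many links away from a seed the walk may go.
    """
    seeds = list(dict.fromkeys(seed_urls))  # drop duplicate seeds, keep order
    seen = set(seeds)
    frontier = deque((url, 0) for url in seeds)
    visited = []

    while frontier:
        url, hops = frontier.popleft()
        visited.append(url)
        if hops == max_hops:
            continue  # do not follow links beyond the hop limit
        for target in links.get(url, []):
            if target not in seen:
                seen.add(target)
                frontier.append((target, hops + 1))
    return visited


# Hypothetical link map standing in for the live web.
LINKS = {
    "https://example.org/": ["https://example.org/a", "https://example.org/b"],
    "https://example.org/a": ["https://example.org/c"],
}
```

Calling `crawl(["https://example.org/"], LINKS)` visits the seed and the two pages it links to; raising `max_hops` to 2 also reaches `https://example.org/c`.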

Sub-domain:

A directory named before the root web address, for example crawler.archive in which crawler is the sub-domain.
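
As a small illustration of the sub-domain entry, the Python sketch below splits a hostname into its sub-domain and root address. The function `split_subdomain` and the host `crawler.example.org` are hypothetical, and the sketch assumes the root address is always the last two labels of the hostname; that assumption fails for suffixes such as `co.uk`, where a maintained list of public suffixes would be needed.

```python
from urllib.parse import urlsplit


def split_subdomain(url: str, root_labels: int = 2):
    """Split a URL's hostname into (sub-domain, root address).

    Assumes the root address is the last `root_labels` labels of the hostname,
    which is a simplification for illustration.
    """
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    if len(labels) <= root_labels:
        return "", host  # no sub-domain present
    return ".".join(labels[:-root_labels]), ".".join(labels[-root_labels:])
```

For `https://crawler.example.org/page` this returns `("crawler", "example.org")`, and for `https://example.org/` it returns `("", "example.org")`.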
