The Evolution of Library Discovery Systems in the Web Environment

by Mark Dahl
Associate Director for Digital Initiatives and Collection Management, Aubrey R. Watzek Library, Lewis and Clark College

In December 2008, the Orbis Cascade Alliance, a consortium of academic libraries in Oregon and Washington, launched a new union catalog on OCLC’s WorldCat.org platform. This change resulted in an updated Web interface, better keyword searching, and faceted results. However, we also lost some features that worked well in our old system. But the larger significance of this change might not be obvious. A shift has taken place, one that moves us into a new paradigm for the systems that support discovery of resources in libraries. The Summit catalog is now part of a great global organism known as WorldCat, and that organism is poised to be more dynamic and more ubiquitous than any of our old local catalogs could have ever been. How did we get here? I will attempt to answer that question through my personal account of library search and discovery as a librarian and technologist since the mid-1990s.

I entered library school in 1996. As the Web emerged, I developed a growing curiosity for it and delved into HTML coding, Web programming, and Web server administration. In those early days, the library community was just digesting the obvious advantages that the Web had over previous technologies like Gopher and Telnet: mouse click hyperlinking and richer graphics. The underlying discovery systems libraries used continued much as they had in the past with prettier Web-based interfaces on top.

By the late 1990s some transformative changes began to take shape in the online library world and on the Web. In the library world, full text databases and services like JSTOR arrived on the scene, putting large amounts of actual content, not just indexing, online. The general online full text database became the bread and butter of our online offerings at Central Oregon Community College, which we were positioning to support distance education. On the Web more broadly, e-commerce gained ground and people got used to shopping experiences that involved search, discovery, and fulfillment.

In 1998 Google was founded, and by the early 2000s it was the most popular search engine on the Internet. Google’s clever PageRank algorithm harnessed the collective intelligence of the Web by using hyperlinks to help determine relevancy. It was a system that benefited enormously from the sheer scale of Google’s computing power. More importantly, it got smarter as more people used it. Google proved that a Web scale enterprise could achieve things that small- and medium-sized players could not. In a similar way, dot-com crash survivors like eBay and Amazon established that in certain markets there was only room for a few large players on the Web.

While Google was growing its search business, libraries mostly ignored search and worked on the problem of organizing a growing array of full text resources. Libraries were acquiring access to electronic journals by the bucketful, but it was hard to find out if a given library had access to a particular journal. By 2001, I had moved to Watzek Library at Lewis and Clark College, and one of my first tasks was to develop a way to search our electronic and print journals by title. In response I created a database that mixed together data from our ILS and Serials Solutions and would later support an OpenURL resolver.

In the early to mid-2000s, library catalogs began to adopt more of the trappings of mainstream e-commerce sites by incorporating cover art, external links, and fancier Web design. They remained weak in search functionality. In 2005, major figures in the library technology community like Andrew Pace and Roy Tennant began asking rather loudly why OPAC search left so much to be desired when compared with commercial Web search (Pace 2005; Tennant 2005).

Projects emerged that attempted to significantly improve search functionality in Web OPACs. They included North Carolina State University Library’s catalog based on the Endeca search engine and Casey Bisson’s WPopac (now Scriblio), an OPAC based on the modular WordPress blogging software.

In the early 2000s libraries also began to break important new ground with digital collections mounted on systems such as ContentDM and DSpace. These were the first Web-based discovery systems managed by libraries that harnessed the Web’s global reach. Library catalogs largely contain references to books held by hundreds of libraries and are typically closed to search engines because of the redundancy of their data. By contrast, digital collections contain unique materials and are generally open to search engines, allowing people anywhere on the globe to find and use their content.

In late 2005 and early 2006 I co-authored a book,
Digital Libraries: Integrating Content and Systems, with Kyle Banerjee and Mike Spalti. We started work on the book with a loosely-conceived thesis: that integration of disparate content and systems with Web technologies could create exceptional online services for libraries. We argued that library systems, including discovery systems, would be many dis-integrated units tied together by standards and clever Web programming. Modular digital library tools like OpenURL resolvers, electronic resource management software, and digital asset management software, the trend toward OPACs running atop ILSs, and federated searching systems that relied on new standards like SRU/W (search/retrieve via URL or Web service) all seemed to confirm this thesis.

But as we researched the book in late 2005, it became clear that this model did not explain it all. More and more, users were beginning to encounter library resources on the Web outside the “walled garden” context of library-managed discovery systems. People might discover books on Amazon or articles on Google Scholar and then acquire the content via a library’s physical or virtual gateway. Moreover, Web 2.0 sites like Flickr, del.icio.us and YouTube allowed users to contribute and organize digital assets in a collective fashion. Like Google, these Web 2.0 sites got better as more people used them and aspired to a Web-wide audience.

In April 2006, I heard Lorcan Dempsey of OCLC give a presentation to the Orbis Cascade Alliance Council on “Moving to the Network Level: Libraries, Readers, and Applications.” Dempsey discussed the shift from vertically integrating services within a single institution to “collaboratively sourcing” services in concert with external players. The Alliance’s own union catalog, which aggregates supply and demand for books among 30+ academic libraries, served as a strong example of regional collaboration. Dempsey encouraged the group to broaden its thinking to resource sharing that would involve “multi-level” collaboration between individual libraries, regional
consortia and global players like OCLC, JSTOR, and Google. He challenged the group to think about “painful” activities being done at the local or regional level that could be more effectively done by higher level organizations and systems.

In some respects, the idea of outsourcing library systems to larger-scale players went against my instincts. I’d always enjoyed managing my own servers and writing my own Web applications. There was something inspiring about being able to load Linux on an old PC and run my very own Web presence from that little box humming away in the closet.

Nonetheless, I couldn’t get the “moving to the network level” phrase out of my head. In late 2006 and 2007, I discovered that the idea related to the various Web applications that I began using at work and in my personal life. Gmail revolutionized my productivity at work. I benefited from its great search and organization features, powered by Google’s huge infrastructure far away from my PC. At Watzek Library, we began using Basecamp and Google Docs for project management and collaboration. At a time when I supported a collection of digital images for teaching on MDID digital collections software, I was impressed with how much better Flickr managed digital assets. Meanwhile, buzz around the concept of cloud computing grew, especially with the publication of Nicholas Carr’s The Big Switch in early 2008, which explains how computing power in far-away data centers is revolutionizing both personal computing and back-end IT infrastructure.

In 2008, our library began implementing two network level discovery services. In winter 2007/2008, the Alliance struck a deal with OCLC to create a union catalog solution based on the WorldCat.org platform. WorldCat Navigator is a consortial version of WorldCat Local that provides a catalog with the wide scope of WorldCat.org but with discovery and delivery features tailored to the needs of the Alliance.

Given the growing shift in my thinking, I saw several advantages in the Alliance move to WorldCat. The interface is more modern than the old Summit and offers conventions from the consumer Web such as narrowing searches by facets and creating user accounts for favorites. More compelling, however, is the broader concept of having a catalog that is a part of a larger organic whole. The WorldCat database is a dynamic, ever evolving thing, updated by a global community of catalogers. Unlike our local catalogs, where we download records and they remain mostly unchanged like a card in a card catalog, WorldCat operates like a Web 2.0 site: a community of people can cooperatively add metadata to improve digital objects, albeit in a much more regulated, library-world way. WorldCat’s global, ever changing holdings information allows WorldCat.org to have an unparalleled relevance ranking of books, not unlike Google’s PageRank concept. The WorldCat.org platform also supports user-contributed content like ratings and reviews, a service that will be progressively more useful as more libraries and users come on board.

[Figure: The new Summit catalog has relevancy ranking based in part on library holdings as well as next generation catalog features like facets.]

Moreover, with WorldCat.org, OCLC takes a lesson from Google and Amazon and understands that Web scale matters. In order for library content to be noticed on the Web, it needs to be presented by a global player, not in a diluted fashion from thousands of separately managed library catalogs. Unlike local library catalogs, WorldCat.org provides a place to reference a book that is useful for anyone on the Web and maintains relationships with commercial search vendors so that its records will appear in search engine results. Furthermore, it provides a catalog with common conventions for searching and viewing records not unlike Google providing a certain consistency in its interface across the Web.

OREGON LIBRARY ASSOCIATION
Vol 15 No 1 • Spring 2009
