1/5/11 9:20 AM  Did I say “theoretical”? Openness and Google Books digitization « John Wilkin’s blog  Page 1 of 7  http://scholarlypublishing.org/jpwilkin/archives/12
John Wilkin’s blog 
John’s blog on libraries, library technology, and pizza
Did I say “theoretical”? Openness and Google Books digitization
I was recently quoted in an AP article (published here in Salon) as saying that Brewster Kahle’s position with regard to the openness of Google-digitized public domain content is “theoretical.” Well, I sure thought I said “polemical,” but them’s the breaks. Brewster argues that Google’s work in digitizing the public domain essentially locks it up (puts it behind a wall and makes it their own) and that this is a loss in a world that loves openness. The contrast here is meant to be with the work of the Open Content Alliance, where the same public domain work might be shared freely, transferred to anyone, anywhere, and used for any purpose. I don’t want to get into the quibble here about the constraints on that apparently open-ended set of permissions (i.e., that an OCA contributor may end up putting constraints on materials that look worse than Google’s constraints). What’s key here for me, though, is the real practical part of openness: what most people want and what’s possible through what Michigan puts online.

I think all of this debate begs us to ask the question “what is open”? For the longest time (since the mid-1990s), Michigan digitized public domain content and made it freely viewable, searchable and printable. Anyone, anywhere could come to a collection like Making of America and read, search and print to his heart’s delight. If the same user wanted to download the OCR, that too was made possible and, in fact, the Distributed Proofreaders project has made good use of this and other MOA functionality. We didn’t make it possible for anyone to get a collection of our source files because we were actively involved in setting up Print-on-Demand (POD): POD typically has up-front, per-title costs, and making the source files available would have cost us some sales that might otherwise pay for that initial investment. As we moved into the agreement with Google, we made clear our intention to do the same “open” thing with the Google-digitized content, and to throw in our lot with a (then) yet-to-be-defined multi-institutional “Shared Digital Repository.” In fact, now we have hundreds of thousands of public domain works online, all of which are readable, searchable and printable by anyone in the world in much the same way.
 
This entry was posted on Friday, April 25th, 2008 at 10:01 am and is filed under digitization.
So, what’s the beef? The OCA FAQ states that for them this openness means that “textual material will be free to read, and in most cases, available for saving or printing using formats such as PDF.” By all means! I hope it’s clear from what I wrote above that this is an utterly accurate description of what happens when Google digitizes a volume from Michigan’s collection and Michigan puts it online. It’s also, incidentally, what Google makes possible, but even if Google didn’t, Michigan could and would be rushing in to fill that breach. The challenges to Google’s openness always seem to ignore what’s actually possible through our copies at Michigan. This sort of polarizing rhetoric seems to be about making a point that’s not accurate in the service of an attack on Google’s primacy in this space: we don’t want them to dominate the landscape, so let’s characterize their Bad version as being the opposite of our Good version. This notion that what Google does is closed is not an accurate description of Google’s version of these books, and even less so a description of Michigan’s.

Could the Google books be more open? Absolutely. Along with Carl Malamud, for example, I would love to see all of the government documents that have been digitized by Google available for transfer to other entities so that the content could be improved and integrated into a wide variety of systems, thus opening up our government as well as our libraries. I believe that will happen, in fact, and that Google will one day (after they’ve had a chance to gain some competitive advantage) open up far more. In the meantime, however, when we talk about “open,” let’s mean it the way that the OCA FAQ means it. Let’s mean it in the same way that the bulk of our audience means it. Let’s talk about the ability to read, cite and search the contents of these books, and let’s call the Google Books project and particularly Michigan’s copies Open. Let’s stop being theoretical, er, I mean polemical.
 
20 general comments
 
Kathleen
says: This article from CNN says you said “theoretical” so it must be true. Of course, the article also says we’re doing scanning on the 2nd floor of our book-shelving department, so what do they know!
Carl Malamud 
says:

> for saving or printing using formats such as PDF

John, pardon me if I don’t grok Mirlyn, but I pulled up a public domain document (a congressional hearing). I was able to pull the text up and page through, but there didn’t appear to be an easy way to save a single page, let alone the entire hearing. Perhaps that function is available to Michigan students, but I suspect the rest of the citizens of Michigan are in the same boat as the rest of us, using the crippled interface.

I think it is fine that Michigan and Google have their arrangement, but it is disturbing when we see a state-funded institution like U. of Michigan putting up artificial barriers to access.

Your Mirlyn site is ok as far as web sites go, but letting a thousand flowers bloom always leads to more innovation. It would be great if any grad student in Ann Arbor (or anyplace else) could download your govdocs docs and come up with a better user interface.

(In addition to more innovation, that policy would lead to a more informed citizenry, which is generally considered an important part of democracy and I suspect is part of your state-sponsored mandate.)

Carl
 jpwilkin
says: Gosh, Carl, I think the best way I can respond is not only to say that I whole-heartedly agree with your call for more vigorous sharing, but to point to my fourth paragraph, where I point to your work and urge the same thing. Look, my point is that while this is good, and we are fighting for deeper sharing, this sort of thing is a fairly narrow piece of the openness issue.

On your point about the functionality and the opaqueness of getting PDFs, we’ll take that into account in our usability. It’s there, and we can do better. I should note that for us larger PDF chunks are also a resource issue, but that we’re very close to releasing a new version that gives you 10 pages at a time. Personally, I like the screen resolution PNG files and very much dislike PDF as a format, but that’s a usability position and not a philosophical one.
Carl Malamud 
says: At the risk of having too many positions dancing on the head of a comment, being able to download/save a copy of the full doc is a pretty key usability concern.
 Brewster Kahle
says: John, while it may not be appropriate to start this in a comment, I am quite taken aback by your seeming implication that “open” includes what Google is doing and what UMich is doing.

“Open” started to be widely used in the Internet community in association with certain software. Richard Stallman calls it “free”, but “open” has come to be used as well. Let’s start with that.

“Open Source” in that community means the source code can be downloaded in bulk, read, analyzed, modified, and reused.

“Open Content” has followed much the same trajectory. Creative Commons evolved a set of licenses to help the widespread downloading of creative works, or “content”. Downloading, and downloading in bulk, is part of this overall approach as we see it at the Internet Archive.