LIS 551 Final Project: Del.icio.us Explanation and Thesaurus

Jim Staskowski
12-2-05

Overview: To see how I organized the “orgoinfo” sites submitted to the del.icio.us site, go to http://del.icio.us/jimstasko/project.

Notation: In trying to develop a controlled vocabulary while also representing on the del.icio.us site some of the structure I had imposed upon the resources gathered, I was led by the affordances of the del.icio.us set-up to adopt the following rules of notation/syntax.

Dashes separate a class from its sub-class. Underscores join words that are separated by spaces in natural language.
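The two rules above can be sketched as a small parser. This is only an illustration of the notation; the function names are hypothetical and are not part of del.icio.us.

```python
def parse_bundle(label: str) -> list[str]:
    """Split a bundle label into its class path, most general class first."""
    # Dashes separate classes; underscores stand in for spaces within a class.
    return [part.replace("_", " ") for part in label.split("-")]


def format_bundle(classes: list[str]) -> str:
    """Inverse of parse_bundle: build a bundle label from a class path."""
    return "-".join(name.replace(" ", "_") for name in classes)


print(parse_bundle("Internet-blog-blog_news"))
# ['Internet', 'blog', 'blog news']
```

Because a slash is neither a dash nor an underscore, a label such as “cataloging/metadata-examples_of_catalogs” still splits into exactly two classes.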

Thus, the bundle “Internet-blog-blog_news” reads:

Main class: Internet
Class: blog
Class: blog news

I wish to emphasize that the del.icio.us system seems to me quite flexible, allowing you to do quite a bit in terms of displaying and arranging. For instance, late in the game, I began to experiment with whether one could create “see also” bundles. Namely, I was worried about my Cataloging/metadata bundle. What if somebody was looking for “metadata” as a subject and did not think to associate this subject with “cataloging”? I wondered what would become of these users if they simply proceeded alphabetically through the bundles as arranged by the del.icio.us system. Without some type of “see also” tag at the place where a metadata bundle would be, they might not find the metadata resources. Thus, I ended up creating a dummy bundle called “metadata/cataloging” by using the tags “project,” “catalog/metadata,” and the dummy tag “see_also_cataloging/metadata.” This seemed satisfactory.

However, there were features of the system that made it impossible to display my arrangement as precisely as I might have wished. I tried to overcome some of these “shortcomings” with odd little devices. For instance, I wanted to put the Resource and Definition bundles at the beginning of my list of bundles as formatted on the del.icio.us interface. I wanted to do this simply because I thought that these bundles would not naturally fit into any possible hierarchy that I might impose upon the other resources. However, since del.icio.us displays bundles alphabetically, this appeared not to be possible. I ended up overcoming the system’s automatic alphabetization of bundles by capitalizing the first letter of my Resource and Definition bundles. Thus, my resulting list of bundles began with the Definition bundles followed by the Resource bundles. This took care of the problem up to a point.

Then, I began to think that while the Resource bundles would not fit with the rest of the bundles in any possible hierarchical arrangement, this was not necessarily true of the Definition bundles. I began to think that one could see the Definition bundles as a quick reference guide to the topics covered in greater depth by the remaining “topic” bundles. In practice, this was not really the case. There are too many gaps in the Definition bundles, occurring where a topic is thoroughly explored by a topic bundle but where there is no corresponding Definition or quick reference bundle. However, assuming that this might be corrected in time by future submissions, I did think that the Definition bundles should be grouped after the Resource bundles and just prior to the topic bundles. The only way I could effect this was by placing the numeral one before the Definition bundles’ labels.
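The display order described above (Resource bundles first, then Definition bundles, then topic bundles alphabetically) can be written down as an explicit sort key. This is a sketch of the intended order only, not of how del.icio.us itself collates labels; the function name is hypothetical, and “Resource-general” is a made-up label used for illustration.

```python
def display_rank(label: str) -> tuple[int, str]:
    """Sort key: Resource bundles, then Definition bundles, then topics."""
    # Drop a leading sorting device such as the "1-" prefix before classifying.
    core = label.lstrip("0123456789-")
    main_class = core.split("-")[0].lower()
    if main_class == "resource":
        group = 0
    elif main_class == "definition":
        group = 1
    else:
        group = 2
    # Within a group, fall back to case-insensitive alphabetical order.
    return (group, core.lower())


bundles = [
    "copyright",
    "1-Definition-concept-XML/HTML",
    "internet",
    "Resource-general",  # hypothetical label
    "cataloging/metadata",
]
ordered = sorted(bundles, key=display_rank)
print(ordered)
```

Stating the order as a key makes the intent independent of capitalization or numeral prefixes, which were workarounds for a system that offers only one built-in ordering.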
Two principles guiding organization

In determining the categories and relations within my organization of the orgoinfo resources, I tried to pay close attention to the needs and interests of what I assumed to be the potential, primary user of such resources: LIS students. In envisioning that user, I relied on a couple of practices. One, I always tried to pay attention to what my classmates seemed to be interested in. I tracked this not only by paying careful attention in class, but also by letting what was submitted to the site determine the shape of the resources.

In the first few weeks, I must admit that I grew frustrated by the sometimes chaotic nature of the submissions and the way that topics seemed to be somewhat unevenly covered. For several weeks, I genuinely worried that, if allowed, I would end up jettisoning a large number of the sites that classmates had submitted. Early on, I was stubborn in my definition of the resources, and frustrated accordingly. It seemed to me that a large number of sites submitted had very little relation to what I had presupposed the “topic” of the orgoinfo site was “supposed” to be. I rather quickly changed my thinking on this, realizing that what I took to be a handicap, or hindrance, was in reality the nature of the type of cataloging situation that this exercise was designed to simulate. I had to recall that LIS students were among the primary potential users of the type of organization I was trying to create, and if the interests and needs of an actual body of LIS students seemed messy or disorganized or chaotic, then very likely the needs and interests of LIS students as a whole were messy, disorganized, and chaotic.

I tried to reimagine my situation as a luxury of sorts. Many catalogers are forced to imagine their user with very little information to go on. In addition to being an LIS student myself, I was lucky enough to have ongoing contact with a range of LIS students while doing this project and did not have to fall back upon my imagination. However, it is interesting to note that in cataloging these resources to suit this user, I often wished that I might have had the luxury of relying upon the free will of my own imagination.

In trying to guess how a user might think, I relied upon sources other than the twenty students in UW SLIS 551 who submitted sites. Number one, I turned to myself, although always with caution. I wanted to avoid privileging my voice. I also tried to imagine students in future years.
Thus, while there are resources clearly related to our particular class, I kept these separate. I tried to keep these types of bundles to a minimum and to label them as quite clearly specific to our class. In truth, this wasn’t really a big problem, but I can easily imagine an organization suffering because the organizer pays too close attention to a particular group of users close at hand at the expense of a wider group of users.

However, as I neared the end of this project, it became clear that I had privileged to some extent the interests and habits of those twenty or so people who had submitted sites. This is reflected in the fact that the Resource bundles, categories which are quite specific to this small coterie of users, are placed right at the top of the list of bundles. Moreover, I did not and ultimately would not second-guess this group and add topics in the interest of rounding out what might seem to me or to Arlene Taylor a “standard” body of topics related to information management. Likewise, I did not and would not eliminate categories either, even though there are a number of almost orphaned classifications only tangentially connected to the larger groupings. These little, frail categories prevented me from being able to create a nice, symmetrical, and logical-seeming hierarchy of categories. So it is.

In the end, I really can’t imagine too many potential users of this site who might use it as frequently as those original twenty submitters might. Early on in the class, you mentioned a variant of the eighty/twenty rule which encouraged designers of systems to focus on small groups of users who are likely to use a system quite frequently, even if doing so will potentially inconvenience or turn off a larger number of users who will likely use the system less frequently. I didn’t consciously set out with that rule in mind, but it did unconsciously shape my organization. Moreover, if I were starting over, I would pay even more attention to it.

Finally, in regard to pleasing the user, I imagined people might want to use these resources later, once they’ve entered the profession. This may be a stretch. However, I thought that if they did so, some of the material would be much more useful and/or relevant than other parts of the material. Thus, I separated out bundles like “library-design,” “librarianship-funding,” and “librarianship-news,” i.e., sites focused on issues surrounding the running of libraries on a daily basis, even though these bundles were often meager and might easily have been folded into a larger bundle like “librarianship.” Also, one of the reasons I kept and separated out the Resource bundles was that I thought they might be useful to a librarian working a reference desk.

My second guiding principle in organizing the collection was enhancing serendipity. One of the greatest potential handicaps of digital collections is that they often don’t provide the same possibilities for serendipity arising from collocation that traditional, books-on-shelves collections do. I don’t think that this has to be the case. To enhance collocation within my organization of the materials, I tried a couple of things. First, with the larger, root or parent categories, I would often leave them quite large, often grouping resources within them that could also be found within sub-class bundles below them in the hierarchy. I wasn’t sure about doing this in all cases. Sometimes, as in the “internet” bundle, the resulting bundle simply was too large, and the resources gathered too diverse. This didn’t create collocation but clutter. Throughout the course of arranging the materials, I often changed my mind, cleaning up and streamlining the parent or root bundles, then restocking them with everything in the subclasses below them in the hierarchy, and then a week later, streamlining them again. As a result, as of today, there is no uniformity to the root or parent bundles. Some have all the sites in the subclass bundles below them, and some do not. Again, I had a hard time determining what was useful collocation and what was clutter.
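The two policies described above, a streamlined parent bundle that holds only its own sites versus a restocked one that repeats everything in its sub-classes, can be sketched using the dash notation. This is a minimal sketch: the function name is hypothetical and the site names are made-up placeholders.

```python
def rolled_up(bundles: dict[str, set[str]]) -> dict[str, set[str]]:
    """Return bundles in which each parent also holds its descendants' sites."""
    result = {label: set(sites) for label, sites in bundles.items()}
    for label, sites in bundles.items():
        parts = label.split("-")
        # Every proper prefix of the class path names an ancestor bundle.
        for depth in range(1, len(parts)):
            ancestor = "-".join(parts[:depth])
            result.setdefault(ancestor, set()).update(sites)
    return result


streamlined = {
    "internet": {"site_a"},
    "internet-blog": {"site_b"},
    "internet-blog-blog_news": {"site_c"},
}
restocked = rolled_up(streamlined)
print(sorted(restocked["internet"]))
# ['site_a', 'site_b', 'site_c']
```

Comparing the size of a bundle before and after the roll-up gives a rough sense of when restocking a parent stops aiding collocation and starts producing clutter.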

History/explanation of organizing process

As stated earlier, I began to catalog the items submitted from early on. In developing categories, and in settling upon the terms or labels for those categories, a variant of literary warrant guided me: I tried to and was often forced to follow the lead of the tags applied by the person submitting the item. I did so because I had no other clear idea of what the future shape of the entire body of information might look like. I fully expected that the majority of sites submitted would largely be related to the class, or centered on information management issues largely pertinent to the field of librarianship. I also expected people to take advantage of the forums provided to contest submissions that seemed irrelevant. If my memory serves, I originally started out with four overarching bundles: “Librarianship,” “Information management,” “Newsreader/RSS,” and “Technology.” These categories seemed to accommodate the first few weeks of submissions.

Thus set up, the first problem that I encountered was the rather large number of submissions that seemed entirely unrelated both to my early/provisional categories and to what I had supposed to be the “LIS/Information Management” domain of the collection. There were not only more “oddball” sites than I expected, but they continued to be submitted throughout the course of the project. However, these sites that I saw as oddball for not seeming to be LIS related did share one of two common features. One group of these oddballs were sites that I suspected people found amusing or interesting. These amusing/interesting oddball sites were often related to the submitter’s hobbies or political bents. Informally, I termed these “share” sites, as in, “let me share something of myself with you.” For example, one individual submitted the Chicago Reader site and articles profiling Ellen Goodman and Molly Ivins, while another submitted a site titled “Bush Nephew Arrested- September 16, 2005.”

The second collection of oddballs were sites that were wonderful omnibus gatherings of information, such as Wikipedia or the Project Gutenberg site. At first, I thought that they were submitted because the submitter thought they were good examples of information management. However, although some were, I believe most of these sites were simply being shared. The person submitting them found them useful in their daily lives/work and wanted to make sure others knew of these sites. Some of these were commercial sites that many people in the class, and I suspect many in the profession, turn to on a frequent basis, such as the Barnes and Noble main page or the Google Print site. Regardless, my first impulse was “to rule” both types of share sites out of the classification, and I began by tagging all of them “Question.”

The second problem that I soon encountered was my bundle “Technology.” It didn’t take many weeks of submissions before it became clear that this was too broad a term, and that the items I had grouped within it were too diverse. Within this category I had sites dealing with XML, DVD players, internet jargon, and articles related to Google. In addition to the problem “technology” bundle, I realized that my “Newsreader/RSS” bundle seemed awfully small and odd compared to the rest of my groupings. I had assumed that this topic would feature more prominently in the class, and the first few weeks of submissions did not persuade me otherwise. However, it soon became clear that it would not really stand as a main class. Still, I was reluctant to dump one more oddball into the mishmash that was my technology category. In some respects, I never quite resolved this problem. Moreover, while I whittled technology down to basically mean “personal information/entertainment devices,” I ended up resorting to categories like “internet” which are easily just as broad.
I eventually resolved my problem with the “share” sites (i.e., the “let me share myself” and the “let me share this site” sites) by creating two categories. The first was question mark, and here I began to place all submissions that clearly did not seem to fit with what I had supposed (perhaps falsely and narrow-mindedly) to be the overall topic of the collection, LIS and/or information management. I was reluctant to do so since I wasn’t certain whether it was within the rules of the assignment. However, I also did not know how to incorporate a site on the Shih-Tzu into any possible classification without creating large numbers of orphan categories.

My second solution was to create a category that was originally titled “Online Product Resource” for the oddball submissions that were rich in information. I thought that this one general category would serve to bundle all the odd “hidden treasure” sites, figuring that people would eventually tire of submitting these sites once they realized that many people in the class were already familiar with them. The Oxford English Dictionary site is wonderful, but it is hardly a secret site, especially in an LIS class. However, I was wrong in this. Moreover, I began to develop an appreciation for these sites as reference sites. They were not sites directly related to information management, but they were sites that I could imagine a librarian turning to on a regular basis. The more I developed an appreciation for how they might be used, the more I began to feel that these sites deserved more organization than I’d originally intended. After all, convenience sites should be convenient. So, I began to break them down into subject categories like art, art-film, music, and general. I perhaps went too far in breaking them down by adding cooking and travel, but doing so helped me to avoid simply discarding these sites into my “question” category. In general, I wanted to “rule” on matters very little, and was most reluctant to place things in the “question” category. Also, since I re-checked my question category on a regular basis to see whether in the course of time categories had developed which might accommodate some of the oddball items, I was wary of just dropping a lot of items in there and thus making this rechecking process into a chore.

With time, I altered and added to my original four categories. As more sites were submitted relating to search engines, I broadened my “Newsreader/RSS” category into a “browser” category that included the newsreader material as well as the search engine material.
Eventually, “browser” became “browser/search engine” with a sub-category of “Newsreader/RSS.” In the end, after reconsideration, I realized that a newsreader wasn’t really a search engine, and I ended up moving it into the internet category I created rather late in the process.

Originally, when cataloging and metadata resources began to be submitted in large numbers, I placed these in the “information management” category that I had created. However, I was dissatisfied with this. It seemed that the resources gathered in this category had grown too diverse and unrelated. Moreover, within this category, a large number of cataloging and metadata resources “dwarfed and hid” the other resources gathered. So, I separated them out into their own categories. I think that this was wise. I believe that people looking for such resources are likely to find them more quickly when they are bundled apart into specific “cataloging” and “metadata” categories. However, there are problems with this arrangement. When I removed the metadata and cataloging items, it left my “information management” category a rather bare and odd assortment of items. Moreover, “cataloging” and “metadata” ended up as root or main categories, yet it didn’t seem that I could logically relate them to the other root or main categories I was working with. I think of this as a problem of asymmetry and will comment upon it a bit further on.

Early on, I began the process of creating the “Definition bundles.” These came into being through an assessment of my own needs as an LIS student taking a course in information management. In the first few weeks of the course, I often encountered terms, especially IT terms, that made it hard for me to determine precisely what the content was of some of the sites submitted. In trying to get a quick, thumbnail definition of such terms and to properly catalog the sites, I often had to hunt around the web to find other sites that provided elementary introductions to the topics or terms in question. I didn’t often have to hunt around for long. As another classmate remarked, Wikipedia is an invaluable aid in providing often clear, simple, and quick definitions for complex IT and LIS terms. This gave me the idea of creating within my organization bundles of sites that provided such quick definitions for LIS/IT topics covered in greater depth in sites bundled under my topic categories. These can be found right after the “Resource” bundles in the listing of bundles at http://del.icio.us/jimstasko/project. I think that the definition bundles were a good idea that did not quite come off.
Unfortunately, too often, there were areas where there were simply no sites submitted that provided a quick explanation for a given topic, and thus there are strange discrepancies between the topic arrangement within the definition bundles and the topic arrangement as a whole. There are all sorts of resources devoted to search engines, but just two sites within the definition bundle on this topic. Also, there are four XML/HTML sites within the “1-Definition-concept-XML/HTML” bundle, but there is not a corresponding topic bundle. To correct such “asymmetries,” I thought of going out and finding sites. I decided against this, and it was not simply out of laziness. I decided against doing so because I thought that to do so was a “violation” of my desire to let the arrangement take shape according to the desires and inclinations of the twenty users submitting sites to the collection. Thus, more definition resources might give the collection a more logical symmetry; however, if the twenty users submitting sites did not see the need for such sites, I was hesitant to disagree. Of course, I too was a user as well as being the organizer of the resources, but I tried to avoid falling into the unexamined habit of having my voice be the equivalent of the other twenty voices shaping the collection.

At this time, I began to add topic categories whenever there seemed to be more than a handful of submissions related to a topic. Thus came into being the “copyright” bundle, the “google” bundle, the “privacy” bundle, a “systems design” bundle (with a “systems design-interface” sub-class bundle), and a “digital library” bundle. If a topic interested me, there did not even have to be many submissions before I created a bundle. Thus, in hope, I began an ill-fated “information overload” bundle that never gained more than its original, single submission.

Early on, I began to worry that I was creating an organization that would end up too broad and shallow (see Rosenfeld and Morville, Information Architecture for the World Wide Web, Chapter 3, p. 39). In the hopes of adding depth, I began to look towards combining and relating what I feared was an overly large collection of topic bundles. Sometimes, it was quite easy to see the connection between topic bundles. Thus, “digital library” eventually became a plausibly logical sub-class of “librarianship.” However, the relationship between other topics that I ended up combining did give me pause. When I was eager to combine bundles, I came across Taylor’s rather strong argument for seeing cataloging and metadata as closely related processes (see Taylor, The Organization of Information, Chapter 6, pp. 144-145). Thus prompted, I decided to bundle my “cataloging” and my “metadata” bundles into one “cataloging/metadata” bundle.
In the end, I’m not entirely sure that this was a wise choice. The biggest problem in bundling the two concepts is that many users might not think of the two as being related. I think this is especially true of users who are most familiar with metadata. This problem was further compounded by the fact that I had labeled the category “cataloging/metadata” and not “metadata/cataloging.” If I had more time, I might again break this category down into two. As it is, I tried to create cross-references within my del.icio.us bundle list so that whether a user began by looking for cataloging resources or by looking for metadata resources, they would eventually come to the “cataloging/metadata” bundle. I still think that there are a lot of great possibilities for serendipity that are created by bundling the two concepts within one category.

Then, there were categories that I eliminated because it wasn’t clear that I could devise a scope note to cover the resources, or because the resources under a bundle were more logically bundled under a number of other bundles. The “system design” bundle was discontinued for these two reasons. Unfortunately, this “system design” bundle did have one significant grouping of closely related resources within it that were rather important to an area that was often discussed in our class and class readings: interfaces. I ended up having a hard time knowing what to do with this essential “child” category after I had folded up its “parent” category. I tried to see if any of the other categories might adopt it, but could not quite see a logical fit. Both the “browser/searchengine” bundle and the “cataloging/metadata-examples_of_catalogs” bundle could make equally compelling cases for adoption.

In general, I feel that there is too much asymmetry in the organization I arrived at. As of the writing of this paper, my arrangement seems not only to be too broad and shallow, but also to have radically different depth levels along the course of its breadth. Yet, I’m not quite sure to what degree this is a problem and to what degree it needs to be or can be fixed. Rosenfeld and Morville advise, “For new web sites and intranets that are expected to grow, you should lean towards a broad and shallow rather than narrow and deep hierarchy” (Ibid., p. 38). And although the comment is directed at the design of web sites, I also take comfort from their reminder that “the heterogeneous nature of web sites makes it difficult to impose highly structured organization systems on the content” (Ibid., p. 25).
It was also difficult to impose a highly structured organization on the resources given the rather loosely defined/understood collection development policy that guided the “acquisitions” process.

Alternative ways the resources might be organized

As it is now, my arrangement is a hybrid of classification by subject and classification by application (see Zins, “Models for Classifying Internet Resources,” Knowledge Organization, 29(1), p. 23). I basically arranged my resources according to subject, while separating out those that made good, quick reference sites. I never did this as comprehensively as I might have, especially if I had felt free to go out and add sites for the purposes of rounding out my idea of how the resources should be organized. Within that general subject/application model, I notice that I also at points separated out items by source format, as with “blogs_news_and_views” and “librarianship-news.” Although I did this a bit unthinkingly, I actually like the idea in that it separates out items that are topical and covered at newspaper/magazine depth. It makes things easy both for users looking for such resources and for users looking to avoid such resources.

In some ways, some of the features that I used suggested to me that the resources might be more usefully organized according to user. Within such a scheme, three large categories would serve: Librarian-to-be, Librarian-at-library, and Librarian-at-home. Thus, theoretically, the resources now under the schooling bundles, the definition bundles, and the more technical and theoretical sites grouped in bundles like “cataloging/metadata” and “copyright,” along with all the “interface” and “catalog examples” sites, would be found under the main class “Librarian-to-be.” Sites devoted to the operation of libraries, sites within the present “cataloging/metadata” or “blogs” bundles that offered reference or applications in regard to these topics, and sites devoted to searching databases and the internet might be grouped under “Librarian-at-library.” Lastly, the numerous amusing sites, the hobby sites, the general interest sites, and the sites now under question could then be arranged under the main class “Librarian-at-home.”

There would be overlap, with quite a few sites falling under two main classes. Thus, there are blog sites that would naturally fall under likely bundles like “Librarian to be-blogs” as well as “Librarian at library-blogs.” If there is one flaw in this scheme, it is that too many of the resources could and should be placed under more than one of the main classes. Consequently, it might be that you would be largely reproducing the same sub-class structure and content under two main classes.

Definition Topics Browser/search engine Cataloging/metadata Copyright Internet Social software

Resources

Librarianship
