Making sense of weblogs in the intranet

What they are, why people are using them, making them useful for knowledge management
Usability Professionals Association 16 September 2003

Michael Angeles michael@studioid.com http://studioid.com

Thank you. Today I’m going to talk about weblogs inside my company, their use in knowledge management, and how my organization hopes to make them usable for enterprise knowledge work if the number of blogs in the company increases significantly. I’ll talk briefly about our company and the types of people involved in various forms of web publishing on the intranet. Then I’ll look more closely at what weblogs are, how people use them, and how we might develop information systems that make the data published from these weblogs usable.

Disclaimer

But, first... a disclaimer. I like it when documents such as functional specifications start out with a disclaimer -- a discussion of what it doesn’t cover as well as what it does. I’d like to try to introduce this presentation in the same way so you know that not everything I’m talking about has been implemented. Much of this talk has to do with strategy and positioning.

What this is
A discussion about weblogging for knowledge management within corporations
A discussion of my organization’s role -- how we view ourselves in terms of providing weblogging support
A look at our long view -- how we’re planning to support webloggers

So here’s what this presentation is going to be...
This is going to be a discussion of a phenomenon occurring within corporations.
* Namely the proliferation of intranet weblogs for knowledge management.
* I’m also going to talk a little about how weblogs affect corporate intelligence and IT.
This is also a discussion of how my organization has analyzed and is planning to deal with weblogs.
* I’m going to talk a little about how we’re supporting bloggers presently.
* And I’m going to talk about how we, as the company’s information management organization, are positioning ourselves to deal with any information growth as a result of blogging.
The disclaimer part is that we have NOT implemented all of our ideas yet, though we have the technology and resources to implement them. The technical implementation, as you will see, is trivial compared to the strategy and resources required to actually pull off some of the ideas we’ve kicked about.

A history of web publishing in my intranet
Before we get into the nitty gritty of weblogs ... a very brief and incomplete history of Lucent intranet web publishing
How web publishing has evolved
Whose needs are being met by web-based publishing
Let’s start with a timeline

But before we get into the nitty gritty of what weblogs are, and before I start throwing out buzzwords, I want to give you some idea of how web publishing has evolved in our intranet. Then I want to look briefly at the different people involved in publishing corporate information on the intranet and why they need to do it.

First there was the command line
Technologies
Internet protocols (Archie, FTP, telnet)
NCSA Mosaic (11/1993)

IIS Milestones
LINUS (Client-server)

Timeline axis: Pre-web, 1994 through 2003

Web browser timeline: http://www.blooberry.com/indexdot/history/browsers.htm

From a thousand miles up and with the benefit of hindsight, we can see where the company has gone with web publishing on the intranet. In pre-web days the library organization’s electronic resources were accessed using LINUS, a shell interface that you reached by telnetting into our UNIX server. This was a hierarchical menu interface that dumped you into our databases, which used command-line search syntax identical to Dialog, a large database aggregator popular with researchers.

Then came pictures
Technologies
Internet protocols (Archie, FTP, telnet)
NCSA Mosaic (11/1993)
Netscape Navigator 1 (12/1994)

Company Milestones
Simple sites proliferate; hand-edited and FTPed; Front Page webmasters (1995-96)

IIS Milestones
LINUS (Client-server)
InfoView Digital Library (1994-1995)
ISG created; produces customized db-driven BU intranet sites (7/96)

Timeline axis: Pre-web, 1994 through 2003

Then Tim Berners-Lee wrote the specifications that became HTTP and HTML, and the web was born. Most web pages at this early stage of our intranet were all text, and almost all sites were probably marked up by hand in vi or emacs. Later people started to use WYSIWYG editors like Front Page. In 1996 my organization began to hire staff to produce web interfaces for customer databases and web sites, and we got more heavily involved in building db-driven web-based information systems for business units.

Then useful data competed for screen space
Technologies
Internet protocols (Archie, FTP, telnet)
NCSA Mosaic (11/1993)
Netscape Navigator 1 (12/1994)

Company Milestones
Simple sites proliferate; hand-edited and FTPed; Front Page webmasters (1995-96)
ONSource, first BU portal (1/1999)

IIS Milestones
LINUS (Client-server)
InfoView Digital Library (1994-1995)
ISG created; produces customized db-driven BU intranet sites (7/96)
IIS indexing process introduced (1998)
Business taxonomy development (1998)
ISG supports BU portals with indexed content (1999)

Timeline axis: Pre-web, 1994 through 2003

Then useful data started to crowd and compete for screen space when the first business unit portals arrived. ONSource is probably the most successful large-scale web site implementation I’ve seen in the company. It was the result of an Optical Networking Group team that worked with analysts who came into the organization to interview Optical Networking knowledge workers and find out what they looked for to do their jobs, where they looked, and how much time they spent looking. After an extensive report was created describing the prospective users and estimating the amount of money spent per person searching for information, the functional specifications for this portal started to come together. Our organization was brought in to develop a metadata schema, including an Optical Networking subject taxonomy and a company taxonomy, which was then expanded to include all of the product and research areas at Lucent. IIS then began to modify its applications and indexing processes to incorporate these subject taxonomy terms, and classified data going through our organization began to feed the portal.

The bubble bursts and standards are born
Technologies
Internet protocols (Archie, FTP, telnet)
NCSA Mosaic (11/1993)
Netscape Navigator 1 (12/1994)

Company Milestones
Simple sites proliferate; hand-edited and FTPed; Front Page webmasters (1995-96)
ONSource, first BU portal (1/1999)
Portals close; subdomains removed; migration to MyLucent begins (2000-2001)
MyLucent, Company portal (7/2001)

IIS Milestones
LINUS (Client-server)
InfoView Digital Library (1994-1995)
ISG created; produces customized db-driven BU intranet sites (7/96)
IIS indexing process introduced (1998)
Business taxonomy development (1998)
ISG supports BU portals with indexed content (1999)
ISG ceases to produce custom sites (2000)

Timeline axis: Pre-web, 1994 through 2003

Then a weird thing happened. The bottom fell out when the dot com bubble burst. Telecom was hard hit, and every executive and senior manager was looking for ways to cut costs. So corporate standards were discussed for a long time, and we began getting involved with an initiative to migrate all of the company’s separate intranet sites into one company portal. I remember hearing about the long meetings that seemed to go on for months around this topic. In the end, the Oracle Portal server was selected and is now running the corporate intranet. My group stopped doing custom information services involving new web site development.

Then the bottom really falls out
Technologies
Internet protocols (Archie, FTP, telnet)
NCSA Mosaic (11/1993)
Netscape Navigator 1 (12/1994)

Company Milestones
Simple sites proliferate; hand-edited and FTPed; Front Page webmasters (1995-96)
ONSource, first BU portal (1/1999)
Portals close; subdomains removed; migration to MyLucent begins (2000-2001)
MyLucent, Company portal (7/2001)
Much of CIO supporting MyLucent is laid off (2003)

IIS Milestones
LINUS (Client-server)
InfoView Digital Library (1994-1995)
ISG created; produces customized db-driven BU intranet sites (7/96)
IIS indexing process introduced (1998)
Business taxonomy development (1998)
ISG supports BU portals with indexed content (1999)
ISG ceases to produce custom sites (2000)

Timeline axis: Pre-web, 1994 through 2003

And then another weird thing happened -- the failing economy caught up with our CIO. The CIO organization has been decimated by forced management procedures (or layoffs) in the last year. So much of the hard core information systems / development work is returning to us in IIS again.

And everything old is new again
Technologies
Internet protocols (Archie, FTP, telnet)
NCSA Mosaic (11/1993)
Netscape Navigator 1 (12/1994)

Company Milestones
Simple sites proliferate; hand-edited and FTPed; Front Page webmasters (1995-96)
ONSource, first BU portal (1/1999)
Portals close; subdomains removed; migration to MyLucent begins (2000-2001)
MyLucent, Company portal (7/2001)
Much of CIO supporting MyLucent is laid off (2003)

IIS Milestones
LINUS (Client-server)
InfoView Digital Library (1994-1995)
ISG created; produces customized db-driven BU intranet sites (7/96)
IIS indexing process introduced (1998)
Business taxonomy development (1998)
ISG supports BU portals with indexed content (1999)
ISG ceases to produce custom sites (2000)

We are here
Blogs appear; Blog-related services (2002)

Timeline axis: Pre-web, 1994 through 2003

Which brings us back to where we started, really. We’re finding more people who need to create and share knowledge and who are using some form of lightweight web publishing to do it. But this time the technologies have matured, and some of the savvy people are picking up light CMS in the form of weblogging software.

Really seems like web-publishing chaos

From an IT perspective, however, this appears to be chaos. We see different processes and technologies serving different types of people. But this patchwork image really tells a good story... As an aside, the CIO reaction to this chaos has been to start up large projects requiring a good deal of spending and to mandate the use of standard processes and technologies in the enterprise. From my perspective, it seems that these processes and technologies have not always been coordinated with user processes and needs. As a result there might be a backlash, with users backtracking to simpler methods. We’re starting to see this in the re-emergence of personal publishing (such as with weblogs) and in an increase in requests for information services from my organization.

There’s a story to be told in that diversity/chaos
Our intranet story can best be explained in terms of people
Diverse set of user types
Diverse set of needs
Diverse set of technologies used to meet these needs

Looking at our timeline, I think our intranet story can best be explained in terms of the needs of the different user types within the company, and specifically by observing how these needs have been satisfied (or not satisfied) using various technologies over time. Looking back at that timeline from a high vantage point, it seems like IT infrastructure for web publishing is complete chaos. To some degree that's true. Until recently, all IT implementations within the company have been executed from within the individual business units rather than directed from above. Slowly that's changing again! We'll look more at that issue when we talk about information ecology. The issue that we're going to focus on first is this diversity of needs.

Who’s who in intranet web-publishing
Knowledge workers
Researchers, engineers, sales force

Communities of Practice (CoPs)
Communities organized around projects, products or topics (e.g. Mobility)

Chief Information Organization (CIO)
Enterprise Information Technology people

Executives
Officers, upper managers

We can generalize about who the major players are in the intranet web publishing picture, reducing those involved to a few key user types: knowledge workers, Communities of Practice, the Chief Information Organization, and executives.

Knowledge worker
Researchers, engineers, sales force
"I need to capture and distribute information about companies and contracts I'm working with. Email and the telephone are usually the fastest ways to do this, but I want to spread this information throughout the company."
Background: Knowledge workers are closest to our products. Depending on their role and Business Unit, typical knowledge workers might do research related to products or prepare contracts for customers. One common need is to document and communicate knowledge within their department.
Goals: Create and share personal or project-related knowledge.
Tools: HTML editors (Front Page, HomeSite, vi) and FTP, Weblog applications (Radio Userland, Movable Type), Email

Community of practice
Groups organized around specific topics or projects
"We increase our personal and group intelligence through collaboration and knowledge-share."
Background: Communities of practice are formed around topics, product lines, technologies, customers. CoPs are typically interested in collaboration and knowledge-share at a group level.
Goals: Increase awareness and knowledge on topic of interest to CoP for group or personal gain.
Tools: Document management systems, Groupware, BBS, Wiki, Email discussion groups

CIO manager
CIO directors and managers
"I satisfy my user demands using our standard toolbox of technologies. CIO works hard to establish the solutions that deliver the most bang for the buck."
Background: The CIO organization is interested in establishing technical standards within the company. They are looking for ways to efficiently meet information systems needs in a manner that reflects the budget and resources of the company.
Goals: Meet user needs with standardized processes and information technologies.
Tools: Enterprise portals, document management systems

Executive
Executive officers, vice presidents and upper management
"In this tight, competitive economy, we have to operate at world-class levels, but we have to be lean and cut costs where we can."
Background: Upper management is focussed on keeping the company profitable. One of the popular ways to do this is to control costs. Management is looking to reduce duplication and leverage resources in order to cut back on unnecessary costs.
Goals: Lead organization to profitability.

This matrix hopes to tell the story of the key players in intranet web publishing and what their needs are. These are the people who:
* need to create and share personally acquired knowledge --> these are the knowledge workers
* need to collaborate and share group or community knowledge (for example, project and client information, product and contract details) --> these are the Communities of Practice
* need to establish a standard way of doing things (for instance, determining standard architecture and processes) --> this is the CIO organization
* need to find ways of saving money --> this is upper management, especially company executives
The primary users of web publishing applications are the knowledge workers and communities of practice. These user needs have mapped to specific tools and technologies over our history. Describing their business goals and knowledge management needs would help the CIO provide a road map for IT standards, but for my purposes, this matrix helps us observe how the different user needs have resulted in our present information ecology.

Diversity is a good thing
Nardi & O'Day on information ecology
A system of people, practices, values, and technologies at work in a local environment. A healthy ecology is one that is dynamic (changing/evolving), diverse (made up of different types of people and technologies) and that allows for a diverse set of people and technologies to work in a complementary way.

The new knowledge management
Deloitte: bridging the gaps between people and systems depends on first creating the conditions that allow people to participate in KM locally rather than enforcing technology-based KM policies. These local activities are bridged in Knowledge Network systems.
Forrester: organizations have begun to move away from single-solution KM packages.

Diversity is a good thing. I began to realize over time that the perceived chaos didn’t represent a failure of IT or of management. The diversity, I think, simply represented a lot of people with information or knowledge management needs that were finding different ways of satisfying those needs. The implementations are as diverse as the people that make up the organization. I found in the writing of Bonnie Nardi and Vicki O’Day some justification for or validation of this diversity. Essentially they use the analogy of ecologies to describe the organization. They define an information ecology as a system of people, practices, values, and technologies at work in a local environment. They also say that a healthy ecology is one that is dynamic (changing/evolving), diverse (made up of different types of people and technologies) and that allows for a diverse set of people and technologies to work in a complementary way. I’ve used this opinion to rationalize our strategy of working with this diversity rather than imposing systems or processes from above. Some recent research by Deloitte Consulting and Forrester Research supports this.

Analysis of Lucent information ecology
Diversity remains the only constant
Our information ecology is diverse
Our needs are diverse
Our web publishing toolbox is diverse

Our web publishing tools
Database driven publishing -- CMS and document management
Home grown and low-cost server-based publishing tools
Desktop personal publishing tools
Groupware applications

We've used a diverse set of tools. We’ve seen a lot of database driven publishing -- home grown or commercial CMS and document management. A lot of these applications in the past have been front ends for databases using scripting languages (for instance Perl). Home grown and low-cost server-based applications such as weblogs are increasing in popularity. Desktop personal publishing tools remain popular. There are still a lot of people who maintain their own web pages by hand, using simple editors like vi or Notepad, or with WYSIWYG editors like Front Page. And finally we’ve seen groupware applications, such as
* Lotus Notes
* Wikis and bulletin boards for community collaboration

What do we do with this diversity?
Maybe these diverse needs and approaches should be accepted
The goal should be to work with this diverse set of user needs and technologies
To find some way to glue the data together to make it usable

The first thing is maybe to accept that in a large organization people will often want to do things their own way. I am not against standard processes and procedures, but, I think the goal should be to work with the diverse user needs and technologies that are expressed and find a way to make them work together.

Enter the weblog
Let's step back a bit and talk about weblogs. They're the new up and comers in web publishing on the intranet. Weblogs are growing in popularity and there are lot of inexpensive weblog tools to choose from today.

Before we go into details about how to make the data from these web-published resources usable, let's step back a bit and talk about weblogs because they're the most recent arrivals to web publishing on the intranet. A lot of applications for web publishing have emerged over the last few years. Weblogging applications in particular are growing in popularity and there are many inexpensive weblogging tools to choose from today.

A quick look at what weblogs are
A web site (usu. of personal/non-commercial origin) that is frequently updated with information and links to resources within a particular subject area. The published information is presented much like a journal on the web in reverse chronological order.
In 1999 Peter Merholz coins the term "Blog".
Rebecca Blood. “weblogs: a history and perspective”. Rebecca’s pocket. Essay discussing the emergence of weblogs. http://www.rebeccablood.net/essays/weblog_history.html

A weblog is a web site that is frequently updated with information and links to resources within a particular subject area. The published information is presented much like a journal on the web in reverse chronological order. Peter Merholz announced in early 1999 that he was going to pronounce it 'wee-blog' and inevitably this was shortened to 'blog', with the weblog editor referred to as a 'blogger.' You can read more about the history of weblogs in Rebecca Blood’s essay, “weblogs: a history and perspective.” She’s also written a book about weblogs titled “The weblog handbook”. She’s an example of one of the bloggers out there who maintain a personal weblog, hers in particular devoted to writing about the literature and movies she devours.

What do people blog?
Personal opinion
Industry & topic specific information & opinion
Very often meta discussion that revolves around specific web page content (URL) -- discussion about something someone else has written about

So now that we know what weblogs are, what is it that people blog and why should you care? Well, blogging started out as a form of personal journal writing that was just transferred from paper to the web. That still makes up the bulk of what bloggers blog. An example of this is Rebecca Blood’s blog, Rebecca’s Pocket. But a growing area of interest is in publishing opinion and commentary on an industry or subject area. Examples of this are John Rhodes’ WebWord and the IA community blog iaslash. These sites also become community discussion areas when the weblog allows people to leave their comments. And a lot of the time, these are sites that just maintain a list of current articles and web sites within a subject area. An example of this type of blog is Lawrence Lee’s Tomalak’s Realm.

Blogging also means sharing
Weblogs allow you to publish a news feed
A news feed can be a data file listing recent entries from a weblog
Or a data file listing recent news headlines from a commercial source.
Blog feed formats are in XML format (specifically RSS or RDF)
Look for these buttons:

Reading blogs in a news aggregator
Aside from being tools to publish and share, weblogs often offer a mechanism for reading other weblog data in XML feeds
News readers / aggregators
An application that retrieves and displays news feeds from multiple sources.
Client application -- runs on PC for individual use.
Server application -- runs on a web server for group use.
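As a rough illustration of what an aggregator does with these feeds, here is a minimal Python sketch. It assumes RSS 2.0 feeds; the two feed documents, their titles and the intranet.example URLs are made up for the example, and a real aggregator would fetch the XML over HTTP rather than read it from strings.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Two tiny, made-up RSS 2.0 feeds standing in for real weblog feeds.
FEED_A = """<rss version="2.0"><channel>
  <title>Example k-log A</title>
  <item><title>Taxonomy notes</title><link>http://intranet.example/a/1</link>
    <pubDate>Mon, 15 Sep 2003 09:00:00 GMT</pubDate></item>
  <item><title>Portal review</title><link>http://intranet.example/a/2</link>
    <pubDate>Sat, 13 Sep 2003 12:30:00 GMT</pubDate></item>
</channel></rss>"""

FEED_B = """<rss version="2.0"><channel>
  <title>Example k-log B</title>
  <item><title>RSS from search results</title><link>http://intranet.example/b/1</link>
    <pubDate>Sun, 14 Sep 2003 18:00:00 GMT</pubDate></item>
</channel></rss>"""


def parse_feed(xml_text):
    """Return a list of item dicts from one RSS 2.0 document."""
    channel = ET.fromstring(xml_text).find("channel")
    source = channel.findtext("title", default="")
    items = []
    for item in channel.findall("item"):
        items.append({
            "source": source,
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            # RSS 2.0 dates use the RFC 822 style that email.utils understands.
            "published": parsedate_to_datetime(item.findtext("pubDate")),
        })
    return items


def aggregate(feeds):
    """Merge items from several feeds, newest first, as a news reader would list them."""
    merged = [entry for xml_text in feeds for entry in parse_feed(xml_text)]
    return sorted(merged, key=lambda entry: entry["published"], reverse=True)


if __name__ == "__main__":
    for entry in aggregate([FEED_A, FEED_B]):
        print(entry["published"].date(), entry["source"], "-", entry["title"])
```

The same merge works whether it runs on a desktop for one reader or on a server for a group; only the place where the merged list is displayed changes.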

We’ll talk a bit more about this later

Creating and publishing a blog entry
Usually HTML form based interface for creating each blog entry
Really entering a simple database record
Enter title
Body of text
Category (optional)
Author (auto-entered)
Date (auto-entered)

Weblogging is easy. Most of the tools available are simply HTML form-based interfaces for creating database records. The blog records are usually quite simple. You enter the title and body of your text. Optionally you can enter a category from a list of categories you’ve entered in your tool. The author and date are usually auto-entered.
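To make the "simple database record" idea concrete, here is a small Python sketch of such a record. The class name, the field names, and the `create_entry` helper are hypothetical; actual weblog tools store entries in their own schemas.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class BlogEntry:
    """One blog entry, modeled as a simple record (hypothetical field names)."""
    title: str
    body: str
    author: str                      # auto-entered from the logged-in user
    category: Optional[str] = None   # optional, picked from the blogger's own list
    created: datetime = field(default_factory=lambda: datetime.now(timezone.utc))  # auto-entered


def create_entry(form, session_user):
    """Build an entry from submitted form fields; author and date are filled in automatically."""
    title = form.get("title", "").strip()
    body = form.get("body", "").strip()
    if not title or not body:
        raise ValueError("title and body are required")
    return BlogEntry(
        title=title,
        body=body,
        author=session_user,
        category=form.get("category") or None,
    )


if __name__ == "__main__":
    entry = create_entry({"title": "First post", "body": "Hello intranet."}, "jdoe")
    print(entry)
```

The blogger only types the title, the body, and possibly a category; everything else comes from the session and the clock.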

Blogging variations
Variations of the process -- URL based blogging
Since a lot of the time blog entries contain meta-discussion, the starting point is a pointer to an article someone else has written
Blog from an aggregator
Blog from a bookmarklet
Let’s see how it works...

There are some variations as well. The starting point isn’t always a blank blog entry screen. Since blog entries are often opinions about other people’s writing, the starting point in a blog session might be a URL on a remote site. In this case, if, for instance, you are reading someone else’s site, you can use a news reader to read their blog entry and then click a link to auto-fill that blog entry into your own blog so you can annotate and comment on what the other person is talking about. Sounds complicated, but I’ll show you in a minute how this works. So let’s take a look at a few tools to show you what a blogging session is like.

Movable Type
1. Enter title

2. Enter body of blog entry

3. Select category 4. Publish

This is the typical web-based blog entry screen. In this example we’re seeing Movable Type, a weblogging tool written in Perl.

Click here to read comments

This screen shows the published blog. One of the nice features of most blogging applications is that they allow readers to leave comments. This example shows a link that indicates the number of comments attached to a blog entry. If you click that link...

The reader comments usually appear on the screen or in a separate browser window. Another neat feature like this, using XML RPC[1], is called “TrackBack” -- this feature allows people to reference the URL for this blog entry on their own weblog, and then their trackback ping appears on my weblog.
[1] XML Remote Procedure Calling protocol. See Userland http://www.xmlrpc.com/

Enter the k-log
Sounds good, but why use them inside the intranet?
In the current economy, some individuals are breaking away from traditional KM to do KM on a Budget.
Low cost weblog tools are available to help with 2 core concerns of KM.
Knowledge creation (publishing)
Knowledge sharing (XML feed aggregators)

You might be thinking, “Sounds good, but why would I want to blog in the intranet?” The type of weblogs we’re seeing in the intranet are a special type called knowledge logs or k-logs. A few articles in the past year have discussed the advantage of using weblogging software to handle some aspects of knowledge management. The first advantage is that it offers a low cost alternative to doing knowledge management. I’ve read on discussion groups that a lot of people find weblog software appealing after seeing larger KM efforts fail. And weblogs help people to handle 2 core concerns of knowledge management: Knowledge creation and sharing

K-logging is about knowledge share
Bloggers are often subject area experts
“Free-loading” on these experts helps grow the knowledge of individuals and the organization
There’s advantage in belonging to a social network formed around research interests

Additionally, the people who are adopting this form of web publishing are savvy web users who read blogs outside of work and know that relying on other people whose expertise you trust is a great way of growing your own knowledge. Bloggers also, I think, find great advantage in being connected to others who share research interests. Social networks or societies of bloggers who read each other’s opinions often form, and in this way ideas are challenged and tested.

Fast, cheap and in total control
Fast (and easy): Set up is quick and doesn't require much expertise.
Cheap: Powerful personal publishing solutions at low cost.
In total control: The real power in weblogging is that it puts knowledge creation in your control and also allows you a standards-based mechanism for pushing/sharing that information.

The bottom line, I think, is that people on the intranet are using blogs for web publishing because 1) They’re quick and easy to setup. Most setup will cost you 15 to 30 minutes if your company has web space available for you. 2) They offer a great amount of functionality at very low cost. Some weblogging software is free. 3) And probably most importantly, weblogging puts knowledge creation and sharing in your hands. You don’t need to rely on the processes and technologies of anyone else to do this and the sharing mechanism uses standards based XML, which means that your data can be re-used elsewhere.

How we are supporting k-loggers
XML feeds of databases
News data
ABI/Inform
Technical documents
Almost any data set can be mapped to the standard RSS format
Email discussion groups, CRM, directory of new personnel

So with the appearance of new weblogs in the intranet, my organization has begun to discuss how to support k-loggers. One area has to do with supporting their knowledge creation. The first blog user came to us asking for news feeds on specific topics. What we have done is to provide database search results in RSS format so she can do any complex or simple search for a topic she has in mind and then have a URL that will serve as a news feed that she can feed into a news reader or aggregator of her choosing.

What our users do with the RSS

This example shows part of a database search result page. This particular database is our Selected News database, which pulls indexed content from published news sources via Factiva. In this search, I entered the terms “Classification, indexing and abstracting” as my query and the search results show a lot of records. Embedded within the search results is an option to view the results as XML (a dump of the search results contents with all fields) or as RSS (a dump of the results in a brief record format showing title, URL, and abstract for each record). Bloggers copy the URL for the RSS feed and can then use it in their own aggregators.
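The brief-record mapping can be sketched in a few lines of Python. This is only an illustration of the idea, not our production code: the record keys (`title`, `url`, `abstract`), the function name, and the example URLs are assumptions, and the output follows the basic RSS 2.0 channel/item layout.

```python
import xml.etree.ElementTree as ET


def results_to_rss(query, feed_url, records):
    """Serialize search result records as a minimal RSS 2.0 document (illustrative only)."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 expects a title, link and description on the channel itself.
    ET.SubElement(channel, "title").text = f"Search results: {query}"
    ET.SubElement(channel, "link").text = feed_url
    ET.SubElement(channel, "description").text = f"Records matching '{query}'"
    for record in records:
        # Brief record format: title, URL and abstract for each hit.
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = record["title"]
        ET.SubElement(item, "link").text = record["url"]
        ET.SubElement(item, "description").text = record["abstract"]
    return ET.tostring(rss, encoding="unicode")


# Made-up records standing in for rows returned by a database search.
RECORDS = [
    {"title": "Indexing at scale", "url": "http://intranet.example/news/101",
     "abstract": "How one library indexes news feeds."},
    {"title": "Abstracting basics", "url": "http://intranet.example/news/102",
     "abstract": "A short primer on writing abstracts."},
]

if __name__ == "__main__":
    print(results_to_rss("indexing", "http://intranet.example/search?q=indexing&format=rss", RECORDS))
```

Because the feed is just the saved search expressed as a URL, the blogger's aggregator re-runs the search each time it polls the feed.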

Adding the URL for the RSS feed to your news aggregator.

Here’s an example of how a user might follow a news feed. This example uses Radio Userland, a weblog publishing tool with an integrated feed reader. In the news aggregator, I have Hack the Planet as a website I’m following.

In news aggregator view, user sees a story they want to blog

Selecting the POST button copies that story's URL and title to a new blog entry in the editing form.

So where do we go from here?
Keep close watch of weblogs in the enterprise
As weblogs proliferate, work on strategy for making blog data usable
Talk to the bloggers who might benefit from sharing and finding

So now we’ve prepared our company to use the data we pull in daily from various news vendors and internal databases. Where do we go from here? I think already we’ve done more to support k-loggers than is expected, but we’re also hoping to support their efforts if weblogs start to proliferate in the company.

Consider your place in the ecology
The natural progression in an information ecology where k-loggers start to proliferate is to seek a system that pulls together the disparate k-log data. The role of the information services organization is to glue together the aggregate of produced k-log data for its users to consume.

The natural progression in an information ecology where k-loggers start to proliferate is to seek a system that pulls together the disparate k-log data. This is where the information services organization comes in. The XML feeds that k-loggers produce are almost always in one of a few standard RSS or RDF formats. This is the common element that allows your organization to glue together the aggregate of produced k-log data for its users to consume. But you can do more with that mass of data to make it findable.

Collecting blog data
How do we do that?
Use metadata to create rich bibliographic records of each entry (author, publisher, date, etc.)
Included in the process of recording metadata is to help people make sense of that data by classifying it -- by topic/subject.

Obviously you need to first begin collecting and aggregating that data. So you will need a technical strategy for doing that. But once you’ve got the data, the real work is in making it findable. The first step is to use metadata to create bibliographic records for each entry. You can rely on a standard such as the Dublin Core metadata elements to help structure your own metadata schema. The next step is to consider some form of classification or organizing the blog entries by topic or subject.
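As a sketch of what such a bibliographic record might look like, the Python below maps one blog entry onto a handful of Dublin Core elements. The choice of elements, the input keys, and the example values are assumptions for illustration; a real schema would be worked out with your indexers.

```python
def to_dublin_core(entry, publisher, subjects):
    """Map one blog entry onto a few Dublin Core elements (illustrative subset)."""
    return {
        "dc:title": entry["title"],
        "dc:creator": entry["author"],
        "dc:publisher": publisher,
        "dc:date": entry["date"],                # ISO 8601 date string
        "dc:identifier": entry["url"],           # the entry's permanent URL
        "dc:description": entry["body"][:200],   # short excerpt used as an abstract
        "dc:subject": sorted(subjects),          # topic terms from the classification step
        "dc:type": "Text",
        "dc:format": "text/html",
    }


if __name__ == "__main__":
    # A made-up entry; keys are hypothetical.
    entry = {
        "title": "Notes on taxonomy maintenance",
        "author": "jdoe",
        "date": "2003-09-16",
        "url": "http://intranet.example/klogs/jdoe/42",
        "body": "Some thoughts on keeping a subject taxonomy current.",
    }
    record = to_dublin_core(entry, "Example k-log", {"Weblogs", "Classification"})
    for element, value in record.items():
        print(element, "=", value)
```

The `dc:subject` values are where the classification work shows up, which is why the topic terms are passed in separately from the entry itself.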

Making blog data findable
* Make the aggregate of collected blog entries available by publishing it
* Make searching and browsing of indexed blog entries possible
* Our organization already does a lot of the text parsing, classification and republishing that is needed to make a blog aggregator fly
* Offer varying means of use and notification when new relevant data comes in (email alerts, etc.)

First you make the aggregate of blog entries available in a raw feed for re-use and also offer a reverse-chronologically sorted spool of recent blog entries. Next make it possible for people to search and browse the indexed blog entries in the collection. Our organization does a lot of text parsing, classification and republishing, and I’ll explain that process in the next slide. Finally offer other means of use and notification such as email alerts.
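The three access paths described here (a reverse-chronological spool, search, and browsing) can be sketched in a few lines. This is only a toy over an in-memory list: the entries are invented, the search is a plain substring match rather than a real search engine, and the function names are hypothetical.

```python
from datetime import date

# Hypothetical collection of already-classified blog entries.
ENTRIES = [
    {"title": "Taxonomy meeting notes", "date": date(2003, 9, 10),
     "subjects": ["taxonomy"], "text": "We agreed on top-level subject terms."},
    {"title": "New vendor feed", "date": date(2003, 9, 15),
     "subjects": ["news feeds"], "text": "The vendor now publishes XML."},
    {"title": "Indexing backlog", "date": date(2003, 9, 12),
     "subjects": ["taxonomy", "indexing"], "text": "Human indexers are catching up."},
]


def recent(entries, limit=10):
    """Reverse-chronological spool: newest entries first."""
    return sorted(entries, key=lambda e: e["date"], reverse=True)[:limit]


def search(entries, term):
    """Naive case-insensitive substring search over title and text."""
    term = term.lower()
    return [e for e in entries
            if term in e["title"].lower() or term in e["text"].lower()]


def browse_by_subject(entries):
    """Group entry titles under each subject term for browsing."""
    index = {}
    for e in entries:
        for subject in e["subjects"]:
            index.setdefault(subject, []).append(e["title"])
    return index


print([e["title"] for e in recent(ENTRIES, limit=2)])
print([e["title"] for e in search(ENTRIES, "xml")])
print(browse_by_subject(ENTRIES)["taxonomy"])
```

Browsing only works because each entry already carries subject terms, which is why the classification step matters as much as the search box.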

Our process
* Start: Raw feed from various sources (vendor data, internal databases, weblogs)
* Data from external feeds are loaded into server
* Machine algorithms do some clustering and auto-classification using subject taxonomy
* Human indexers review auto-classification and correct or add index terms
* Classified data is stored
* Finish: Classified data is served through web server requests

I’ve already had someone on a discussion group tell me to just throw Google at the data to make content findable. I don't think search engines are the answer to every problem. Yes, search is necessary, but are search engines the front end you want to use for all types of databases? We do, in fact, have search engines in-house that do cluster analysis and offer categorized and relevancy-ranked web site search results. But we aggregate a lot of data, most of it not from websites, and our process for indexing this data uses a combination of machine and human indexing. Computer algorithms have not proven capable of discerning some concepts as well as humans can. But anyway, this is a high-level overview of what we do.
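For readers who want to picture the hybrid process, here is a deliberately simplified sketch: a machine pass proposes subject terms from a taxonomy, a human pass adds or removes terms, and the result is stored. The keyword matching stands in for the clustering and auto-classification software, which it does not attempt to reproduce, and the taxonomy, the corrections format, and all names are assumptions for the example.

```python
# Hypothetical subject taxonomy: term -> keywords that suggest it.
TAXONOMY = {
    "knowledge management": ["knowledge", "km", "k-log"],
    "search": ["search", "index", "relevancy"],
}


def auto_classify(text, taxonomy=TAXONOMY):
    """Machine pass: propose every term whose keywords appear in the text."""
    words = set(text.lower().split())
    return sorted(term for term, keywords in taxonomy.items()
                  if words & set(keywords))


def human_review(auto_terms, add=(), remove=()):
    """Human pass: an indexer removes wrong terms and adds missing ones."""
    return sorted((set(auto_terms) - set(remove)) | set(add))


def process(raw_entries, corrections):
    """Run each entry through both passes and store the classified result."""
    store = {}
    for entry in raw_entries:
        terms = auto_classify(entry["text"])
        fix = corrections.get(entry["id"], {})
        terms = human_review(terms, fix.get("add", ()), fix.get("remove", ()))
        store[entry["id"]] = {"title": entry["title"], "subjects": terms}
    return store


# Hypothetical raw entries and one indexer correction.
RAW = [
    {"id": 1, "title": "Search tuning",
     "text": "notes on search relevancy for the intranet"},
    {"id": 2, "title": "Why k-log",
     "text": "why every team needs a k-log"},
]
CORRECTIONS = {1: {"add": ["intranet"]}}

store = process(RAW, CORRECTIONS)
for entry_id, rec in store.items():
    print(entry_id, rec["title"], rec["subjects"])
```

The point of the split is that the machine does the bulk of the work cheaply, while the human pass catches the concepts the algorithm cannot discern.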

One way you could do it
The brute force method
* Use an aggregator (e.g. Radio or Drupal)
* Get humans to blog and classify the blogs

Web-based aggregator (brute-force) example

This screen shows the web-based aggregator built into the Drupal application, which we use on the IA community blog, iaslash. The “Latest news” shows the most recent blog entries collected from various blogs that we watch. From this page, we can select the news that’s relevant to our community and enter it into our database.

Web-based aggregator (brute-force) example

This is a view of the home page, which shows how our database is displayed, with the classification shown below each entry.

Other ways you can do it
* Just use search software with automated classification
* Consider our hybrid approach

And you can always rely on software to do the classification of data for you automatically. Many search vendors offer some sort of module that allows for this kind of classification, but often some human intervention is needed to help guide or tweak the classification module. Finally, you can consider our hybrid, semi-automated approach. Our take is that there is a bigger return of quality indexing when you insert humans into the process.

Success
It’s important to note that the success of KM depends on the willingness of individuals to participate by using tools that will integrate seamlessly with the organization’s knowledge network. Deloitte suggests that while localized KM efforts may not require knowledge networks in small organizations, the advantage of knowledge networks becomes manifest when communities express the need to re-use that localized knowledge.

Sustainability
* Another important factor is sustainability
* If you plan on doing automated classification of data, human resources will be needed at some point to set up taxonomies
* If you plan on a hybrid machine and human-aided indexing process, full-time staff might be needed

Closing thoughts
* Weblogs are really not different as a technology, although they put control of publishing closer to users
* Classifying weblog data can be difficult and requires human resources, but some search applications can help
* Value diversity and, above all, support users’ needs
* Allow users to produce organizational knowledge using whatever tools they choose

While comments, trackbacks and XML feeds are useful, as a technology, weblogs are really not very different from other applications made for web publishing. What makes them different is how they’re used from the end-user perspective. They put control of web publishing in the hands of end users, who can decide what process they want to use for sharing knowledge and what technology to use. Some people view this amount of control and power to publish as a danger if bloggers record too much information. However, I think one of the ways to produce organizational knowledge is to record all sorts of tacit knowledge, including ephemeral communications as well as meeting afterthoughts and opinions.

Making the output from weblogs usable can be as simple as allowing your search engine to spider and index their content, but if your search application doesn’t allow for classification, browsing content by attributes such as subject, business unit, and product might not be possible. Classification allows the system to represent the knowledge contained in data more consistently. The caveat to doing classification is that neither an automated nor a manual process alone can give you the best results. So the investment in doing classification might require additional time and resources if you don’t already have indexing staff available. Some search vendors offer classification, however, so this may be a good route to pursue if human resources aren’t available.

Finally, I think it’s important to keep in mind the needs of the users who want or need to blog, and to encourage blogging if it results in knowledge sharing. While there are bound to be a lot of people who want to protect their intellectual capital, there are probably equal numbers of people who see value in sharing their knowledge, and if using a cheap and easy weblogging tool is how they find their way to doing that, that can’t be a bad thing.

Further reading on info. ecology and KM
* Bonnie Nardi and Vicki O’Day. “Information Ecologies: Using Technology with Heart.” First Monday. http://www.firstmonday.dk/issues/issue4_5/nardi_contents.html
* Robin Athey. “Collaborative Knowledge Networks: Driving Workforce Performance Through Web-enabled Communities.” Deloitte Consulting. http://www.dc.com/Insights/research/cross_ind/ckn_workforce.asp
* Joshua Walker. “The New Knowledge Management Landscape.” Forrester Research. http://www.forrester.com/ER/Research/Brief/Excerpt/0,1317,15338,00.html

Further reading about k-logs
* John Foley. “Are You Blogging Yet?” InformationWeek. Discusses the value of using weblogs in the enterprise. http://www.informationweek.com/story/IWK20020719S0001
* David Weinberger. “The 99 cent KM solution.” KM World. http://www.kmworld.com/publications/magazine/index.cfm?action=readarticle&Article_ID=1337&Publication_ID=76
* John Robb. “A simple approach to KM.” K-Logs discussion group. http://groups.yahoo.com/group/klogs/message/313
* K-Logs discussion group. Email discussion group that discusses klogging for KM. http://groups.yahoo.com/group/klogs
* A Klog Apart. Phil Wolff's klog about klogging. http://www.dijest.com/aka/

Where to get software
Weblog software
* “Weblog publishers.” Open Directory Project. http://dmoz.org/Computers/Internet/On_the_Web/Weblogs/Tools/Publishers/

RSS news reader client software
* “News Readers.” Open Directory Project. http://dmoz.org/Reference/Libraries/Library_and_Information_Science/Technical_Services/Cataloguing/Metadata/RDF/Applications/RSS/News_Readers/

Server-based news aggregator software
* AmphetaDesk. http://www.disobey.com/amphetadesk/
* blagg. http://www.oreillynet.com/~rael/lang/perl/blagg/
* Drupal. http://drupal.org
* Radio Userland. http://radio.userland.com/multiAuthorWeblogTool

Thank you :: The end
