Making sense of weblogs in the intranet

What they are, why people are using them, making them useful for
knowledge management

Usability Professionals Association


16 September 2003

Michael Angeles
michael@studioid.com
http://studioid.com

Thank you.

Today I’m going to talk about weblogs inside my company, their use in
knowledge management, and how my organization is hoping to make them
usable for enterprise knowledge work if the number of blogs in the company
increases significantly.

I’ll talk briefly about our company and the types of people involved in various
forms of web publishing on the intranet.

Then I’ll look more closely at what weblogs are, how people use them, and how
we might develop information systems to make the data that gets
published from these weblogs usable.
Disclaimer

But, first... a disclaimer.

I like it when documents such as functional specifications start out with a
disclaimer -- a discussion of what it doesn’t cover as well as what it does.

I’d like to try to introduce this presentation in the same way so you know that
not everything I’m talking about has been implemented.

Much of this talk has to do with strategy and positioning.


What this is
A discussion about weblogging for knowledge
management within corporations
A discussion of my organization’s role -- how
we view ourselves in terms of providing
weblogging support
A look at our long view -- how we’re planning
to support webloggers

So here’s what this presentation is going to be...

This is going to be a discussion of a phenomenon occurring within corporations


* Namely the proliferation of intranet weblogs for knowledge management.
* I’m also going to talk a little about how weblogs affect corporate intelligence and IT.

This is also a discussion of how my organization has analyzed and is planning to deal with weblogs.
* I’m going to talk a little about how we’re supporting bloggers presently.
* And I’m going to talk about how we, as the company’s information management organization, are
positioning ourselves to deal with any information growth as a result of blogging.

The disclaimer part is that we have NOT implemented all of our ideas yet, though we have the technology and
resources to implement them. The technical implementation, as you will see, is trivial when compared to the
strategy and resources required to actually pull off some of the ideas we’ve kicked about.
A history of web publishing in my intranet
Before we get into the nitty gritty of weblogs ...
a very brief and incomplete history of Lucent
intranet web publishing
How web publishing has evolved
Whose needs are being met by web-based
publishing
Let’s start with a timeline

But before we get into the nitty gritty of what weblogs are and before I start throwing
out buzzwords

I want to give you some idea of how web publishing has evolved in our intranet

And then I want to look briefly at the different people involved in publishing corporate
information on the intranet and why they need to do it
First there was the command line

[Timeline figure: Pre-web, 1994 to 2003]

Technologies
Internet protocols (Archie, FTP, telnet)
NCSA Mosaic (11/1993)

IIS Milestones
LINUS
(Client-server)

Web browser timeline: http://www.blooberry.com/indexdot/history/browsers.htm

From a thousand miles up and with the benefit of hindsight, we can see where the
company has gone with web publishing on the intranet.

In pre-web days the library organization’s electronic resources were accessed using
LINUS, a shell interface that you accessed by telnetting into our UNIX server. This was
a hierarchical menu interface that dumped you into our databases, which used
command-line search syntax identical to Dialog, a large database aggregator popular
with researchers.
Then came pictures

[Timeline figure: Pre-web, 1994 to 2003]

Technologies
Internet protocols (Archie, FTP, telnet)
NCSA Mosaic (11/1993)
Netscape Navigator 1 (12/1994)

Company Milestones
Simple sites proliferate; hand edited and FTPed; Front Page webmasters (1995-96)

IIS Milestones
LINUS
InfoView (client-server) (1994-1995)
ISG created; Digital Library produces customized db-driven BU intranet sites (7/96)

Then Tim Berners-Lee wrote the specifications that became HTTP and HTML and the
web was born. Most web pages at this early stage of our intranet are all text and
almost all sites are probably marked up by hand in vi or emacs. Later people start to
use WYSIWYG editors like Front Page.

In 1996 my organization begins to hire staff to produce web interfaces for customer
databases and web sites and we begin to get more heavily involved in doing db-driven
web-based information systems for business units.
Then useful data competed for screen space

[Timeline figure: Pre-web, 1994 to 2003]

Technologies
Internet protocols (Archie, FTP, telnet)
NCSA Mosaic (11/1993)
Netscape Navigator 1 (12/1994)

Company Milestones
Simple sites proliferate; hand edited and FTPed; Front Page webmasters (1995-96)
ONSource, first BU portal (1/1999)

IIS Milestones
LINUS
InfoView (client-server) (1994-1995)
ISG created; Digital Library produces customized db-driven BU intranet sites (7/96)
IIS indexing process introduced (1998)
Business taxonomy development (1998)
ISG supports BU portals with indexed content (1999)

Then useful data started to crowd and compete for screen space when the first business unit portals arrived.

ONSource is probably the most successful large-scale web site implementation I’ve seen in the company. It
was the result of an Optical Networking Group team that worked with analysts who came into the
organization to do interviews with Optical Networking knowledge workers to find out what they looked for to
do their jobs, where they looked and how much time they spent looking. After an extensive report was
created describing their prospective users and estimating the amount of money spent per person searching
for information, the functional specifications for this portal started to come together.

Our organization was brought in to develop a metadata schema including an Optical Networking subject
taxonomy, and company taxonomy which was then expanded to include all of the product and research
areas at Lucent.

IIS then began to modify its applications and indexing processes to incorporate these subject taxonomy terms
and classified data going through our organization began to feed the portal.
The bubble bursts and standards are born

[Timeline figure: Pre-web, 1994 to 2003]

Technologies
Internet protocols (Archie, FTP, telnet)
NCSA Mosaic (11/1993)
Netscape Navigator 1 (12/1994)

Company Milestones
Simple sites proliferate; hand edited and FTPed; Front Page webmasters (1995-96)
ONSource, first BU portal (1/1999)
Portals close; subdomains removed; migration to MyLucent begins (2000-2001)
MyLucent, company portal (7/2001)

IIS Milestones
LINUS
InfoView (client-server) (1994-1995)
ISG created; Digital Library produces customized db-driven BU intranet sites (7/96)
IIS indexing process introduced (1998)
Business taxonomy development (1998)
ISG supports BU portals with indexed content (1999)
ISG ceases to produce custom sites (2000)

Then a weird thing happened. The bottom fell out when the dot com bubble burst. Telecom was hard hit and
from up high every executive and senior manager was looking for ways to cut costs.

So corporate standards were discussed for a long time and we began getting involved with an initiative to
migrate all of the company’s separate intranet sites into one company portal. I remember hearing about the
long meetings that seemed to go on for months around this topic.

In the end, the Oracle Portal server was selected and is now running the corporate intranet. My group
stopped doing custom-information services involving new web site development.
Then the bottom really falls out

[Timeline figure: Pre-web, 1994 to 2003]

Technologies
Internet protocols (Archie, FTP, telnet)
NCSA Mosaic (11/1993)
Netscape Navigator 1 (12/1994)

Company Milestones
Simple sites proliferate; hand edited and FTPed; Front Page webmasters (1995-96)
ONSource, first BU portal (1/1999)
Portals close; subdomains removed; migration to MyLucent begins (2000-2001)
MyLucent, company portal (7/2001)
Much of CIO supporting MyLucent is laid off (2003)

IIS Milestones
LINUS
InfoView (client-server) (1994-1995)
ISG created; Digital Library produces customized db-driven BU intranet sites (7/96)
IIS indexing process introduced (1998)
Business taxonomy development (1998)
ISG supports BU portals with indexed content (1999)
ISG ceases to produce custom sites (2000)

And then another weird thing happened -- the failing economy caught up with our CIO.

The CIO organization has been decimated by forced management procedures (or
layoffs) in the last year. So much of the hard core information systems / development
work is returning to us in IIS again.
And everything old is new again

[Timeline figure: Pre-web, 1994 to 2003]

Technologies
Internet protocols (Archie, FTP, telnet)
NCSA Mosaic (11/1993)
Netscape Navigator 1 (12/1994)

Company Milestones
Simple sites proliferate; hand edited and FTPed; Front Page webmasters (1995-96)
ONSource, first BU portal (1/1999)
Portals close; subdomains removed; migration to MyLucent begins (2000-2001)
MyLucent, company portal (7/2001)
Much of CIO supporting MyLucent is laid off (2003)

IIS Milestones
LINUS
InfoView (client-server) (1994-1995)
ISG created; Digital Library produces customized db-driven BU intranet sites (7/96)
IIS indexing process introduced (1998)
Business taxonomy development (1998)
ISG supports BU portals with indexed content (1999)
ISG ceases to produce custom sites (2000)
Blogs appear; blog-related services (2002)
We are here

Which brings us back to where we started really. We’re finding more people needing to
create and share knowledge who are using some form of lightweight web publishing
to do it. But this time the technologies have matured and some of the savvy people
are picking up light CMS in the form of weblogging software.
Really seems like web-publishing chaos

From an IT perspective, however, this appears to be chaos.

We see different processes and technologies serving different types of people.

But this patchwork image really tells a good story...

As an aside, the CIO reaction to this chaos has been to start up large projects requiring a good deal of
spending and to mandate the use of standard processes and technologies in the enterprise. From my
perspective, it seems that these processes and technologies have not always been coordinated with
user processes and needs. As a result there might be a backlash, with users backtracking to simpler
methods. We’re starting to see this in the re-emergence of personal publishing (such as with weblogs) and
in an increase in requests for my organization’s information services.
There’s a story to be told in that diversity/chaos

Our intranet story can best be explained in
terms of people
Diverse set of user types
Diverse set of needs
Diverse set of technologies used to meet
these needs

Looking at our timeline, I think our Intranet story can best be explained in terms of the needs of the different
user types within the company, and specifically by observing how these needs have been satisfied (or not
satisfied) using various technologies over time.

Looking back at that timeline from a high vantage point, it seems like IT infrastructure for web publishing is
complete chaos. To some degree that's true. Until recently, all IT implementations within the company have
been executed from within the individual business units rather than directed from above. Slowly that's changing
again! We'll look more at that issue when we talk about information ecology.

The issue that we're going to focus on first is this diversity of needs.
Who’s who in intranet web-publishing
Knowledge workers
Researchers, engineers, sales force

Communities of Practice (CoPs)


Communities organized around projects, products or
topics (e.g. Mobility)

Chief Information Organization (CIO)


Enterprise Information Technology people

Executives
Officers, upper managers

We can generalize about who the major players in the Intranet web publishing picture are, reducing those involved
to a few key user types.

Knowledge workers
Communities of Practice
Chief Information Organization
Executives
Knowledge worker
Researchers, engineers, sales force
"I need to capture and distribute information about companies and contracts I'm working with. Email and the
telephone are usually the fastest way to do this, but I want to spread this information throughout the company."
Background: Knowledge workers are closest to our products. Depending on their role and Business Unit, typical
knowledge workers might do research related to products or prepare contracts for customers. One common need
is to document and communicate knowledge within their department.
Goals: Create and share personal or project-related knowledge.
Tools: HTML editors (Front Page, HomeSite, vi) and FTP, Weblog applications (Radio Userland, Movable Type),
Email

Community of practice
Groups organized around specific topics or projects
"We increase our personal and group intelligence through collaboration and knowledge-share."
Background: Communities of practice are formed around topics, product lines, technologies, customers. CoPs are
typically interested in collaboration and knowledge-share at a group level.
Goals: Increase awareness and knowledge on topic of interest to CoP for group or personal gain.
Tools: Document management systems, Groupware, BBS, Wiki, Email discussion groups

CIO manager
CIO directors and managers
"I satisfy my user demands using our standard toolbox of technologies. CIO works hard to establish the solutions
that deliver the most bang for the buck"
Background: The CIO organization is interested in establishing technical standards within the company. They are
looking for ways to efficiently meet information systems needs in a manner that reflects the budget and
resources of the company.
Goals: Meet user needs with standardized processes and information technologies.
Tools: Enterprise portals, document management systems

Executive
Executive officers, vice presidents and upper management
"In this tight, competitive economy, we have to operate at world-class levels, but we have to be lean and cut
costs where we can."
Background: Upper management is focussed on keeping the company profitable. One of the popular ways to do this
is to control costs. Management is looking to reduce duplication and leverage resources in order to cut back on
unnecessary costs.
Goals: Lead organization to profitability.

This matrix hopes to tell the story of the key players in intranet web publishing and what their needs are.

These are the people who:


* need to create and share personally acquired knowledge --> these are the knowledge workers
* who need to collaborate and share group or community knowledge (for example, project and client
information, product and contract details) --> these are the Communities of Practice
* who need to establish a standard way of doing things (for instance, determining standard architecture and
processes) --> this is the CIO organization
* who need to find ways of saving money --> this is upper management, especially company executives

The primary users of web publishing applications are the knowledge workers and communities of practice.

These user needs have mapped to specific tools and technologies over our history. Describing their business
goals and knowledge management needs would help the CIO to try to provide a road map for IT standards,
but for my purposes, this matrix helps so we can observe how the different user needs have resulted in our
present information ecology.
Diversity is a good thing
Nardi & O'Day on information ecology
A system of people, practices, values, and technologies at work in a local
environment.

A healthy ecology is one that is dynamic (changing/evolving), diverse (made
up of different types of people and technologies) and that allows for a
diverse set of people and technologies to work in a complementary way.

The new knowledge management
Deloitte: bridging the gaps between people and systems depends on first
creating the conditions that allow people to participate in KM locally rather
than enforcing technology-based KM policies. These local activities are
bridged in Knowledge Network systems

Forrester: organizations have begun to move away from single-solution KM
packages

Diversity is a good thing.

I began to realize over time that the perceived chaos didn’t represent a failure of IT or of management. The diversity, I think,
simply represented a lot of people with information or knowledge management needs that were finding different ways of
satisfying those needs. The implementations are as diverse as the people that make up the organization.

I found in the writing of Bonnie Nardi and Vicki O’Day some justification for or validation of this diversity. Essentially they use the
analogy of ecologies to describe the organization.

They define an information ecology as a system of people, practices, values, and technologies at work in a local environment.

They also say that a healthy ecology is one that is dynamic (changing/evolving), diverse (made up of different types of people
and technologies) and that allows for a diverse set of people and technologies to work in a complementary way. I’ve used this
opinion to rationalize our strategy of working with this diversity rather than imposing systems or processes from above.

Some recent research by Deloitte Consulting and Forrester Research supports this.
Analysis of Lucent information ecology

Diversity remains the only constant


Our information ecology is diverse
Our needs are diverse
Our web publishing toolbox is
diverse
Our web publishing tools
Database driven publishing -- CMS and
document management
Home grown and low-cost server-based
publishing tools
Desktop personal publishing tools
Groupware applications

We've used a diverse set of tools

We’ve seen a lot of Database driven publishing -- Home grown or commercial CMS and document
management. A lot of these applications in the past have been front ends for databases using scripting
languages (for instance perl).

Home grown and low-cost server-based applications such as weblogs are increasing in popularity.

Desktop personal publishing tools remain popular. There are still a lot of people that maintain their own
web pages by hand in simple editors like vi or Notepad, or with WYSIWYG editors like Front Page.

And finally we’ve seen Groupware Applications, such as
* Lotus Notes
* Wikis and bulletin boards for community collaboration
What do we do with this diversity?

Maybe these diverse needs and
approaches should be accepted
The goal should be to work with this
diverse set of user needs and
technologies
To find some way to glue the data
together to make it usable

The first thing is maybe to accept that in a large organization people will often want to do things their own
way.

I am not against standard processes and procedures, but, I think the goal should be to work with the diverse
user needs and technologies that are expressed and find a way to make them work together.
Enter the weblog
Let's step back a bit and talk about weblogs.
They're the new up and comers in web
publishing on the intranet.

Weblogs are growing in popularity and there
are a lot of inexpensive weblog tools to
choose from today.

Before we go into details about how to make the data from these web-published
resources usable, let's step back a bit and talk about weblogs because they're the
most recent arrivals to web publishing on the intranet.

A lot of applications for web publishing have emerged over the last few years.

Weblogging applications in particular are growing in popularity and there are many
inexpensive weblogging tools to choose from today.
A quick look at what weblogs are
A web site (usu. of personal/non-commercial origin)
that is frequently updated with information and links
to resources within a particular subject area.
The published information is presented much like a
journal on the web in reverse chronological order.
In 1999 Peter Merholz coins the term "Blog".
Rebecca Blood. “weblogs: a history and perspective”.
Rebecca’s pocket. Essay discussing the emergence
of weblogs. http://www.rebeccablood.net/essays/
weblog_history.html

A weblog is a web site that is frequently updated with information and links to resources within a particular
subject area.

The published information is presented much like a journal on the web in reverse chronological order.

Peter Merholz announced in early 1999 that he was going to pronounce it 'wee-blog' and inevitably
this was shortened to 'blog' with the weblog editor referred to as a 'blogger.'

You can read more about the history of weblogs by reading Rebecca Blood’s essay, “weblogs: a history and
perspective.” She’s also written a book on what weblogs are titled “The weblog handbook”. She’s an example
of one of the bloggers out there who maintain a personal weblog, hers in particular devoted to writing
about the literature and movies she devours.
What do people blog?

Personal opinion
Industry & topic specific information &
opinion
Very often meta discussion that revolves
around specific web page content
(URL) -- discussion about something
someone else has written

So now that we know what weblogs are, what is it that people blog and why should you care?

Well blogging started out as a form of personal journal writing that was just transferred from paper to the web.
That still makes up the bulk of what bloggers blog. An example of this is Rebecca Blood’s blog, Rebecca’s
Pocket.

But a growing area of interest is in publishing opinion and commentary on an industry or subject area. An
example of this is John Rhodes’ WebWord or the IA community blog iaslash. These sites also become
community discussion areas when the weblog allows people to leave their comments.

And a lot of the time, these are sites that just maintain a list of current articles and web sites within a subject
area. An example of this type of blog is Lawrence Lee’s Tomalak’s Realm.
Blogging also means sharing
Weblogs allow you to publish a news feed
A news feed can be a data file listing recent
entries from a weblog
Or a data file listing recent news headlines
from a commercial source.
Blog feed formats are in XML format
(specifically RSS or RDF)
Look for these buttons:
Reading blogs in a news aggregator
Aside from being tools to publish and share,
weblogs often offer a mechanism for reading
other weblog data in XML feeds
News readers / aggregators
An application that retrieves and displays
news feeds from multiple sources.
Client application -- runs on PC for
individual use.
Server application – runs on a web server
for group use.

We’ll talk a bit more about this later


Creating and publishing a blog entry
Usually HTML form based interface for
creating each blog entry
Really entering a simple database record
Enter title
Body of text
Category (optional)
Author (auto-entered)
Date (auto-entered)

Weblogging is easy. Most of the tools available are simply HTML form-based interfaces for creating database
records.

The blog records are usually quite simple.

You enter the title and body of your text. Optionally you can enter a category from a list of categories you’ve
entered in your tool. The author and date are usually auto-entered.
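To make the "simple database record" idea concrete, here is a minimal Python sketch of a blog entry with the fields listed above. The class name `BlogEntry`, the sample values, and the choice of UTC for the auto-entered date are assumptions for illustration only; real weblog tools store entries in their own formats.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class BlogEntry:
    """One weblog entry, modeled as a simple record (hypothetical layout)."""
    title: str                       # entered by the author
    body: str                        # entered by the author
    author: str                      # normally filled in from the login
    category: Optional[str] = None   # optional, picked from the author's list
    # The date is auto-entered when the record is created (UTC by assumption).
    date: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


# Hypothetical entry: only title, body and author are supplied.
entry = BlogEntry(
    title="Notes on taxonomy review",
    body="Met with the indexing team today.",
    author="example_user",
)
print(entry.category)  # None, because no category was chosen
```

The point of the sketch is how little the author has to supply: everything beyond title and body is either optional or filled in by the tool.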
Blogging variations
Variations of the process -- URL based
blogging
Since a lot of the time blog entries contain
meta-discussion, the starting point is a
pointer to an article someone else has
written
Blog from an aggregator
Blog from a bookmarklet
Let’s see how it works...

There are some variations as well. The starting point isn’t always a blank blog entry
screen.

Since blog entries are often opinions about other people’s writing, the starting point in a
blog session might be a URL on a remote site. For instance, you can use a news reader
to read someone else’s blog entry and then click a link to auto-fill that entry into
your own blog so you can annotate and comment on what the other person is talking
about. Sounds complicated, but I’ll show you in a minute how this works.

So let’s take a look at a few tools to show you what a blogging session is like.
Movable Type

1. Enter title

2. Enter body
of blog entry

3. Select category

4. Publish

This is the typical web-based blog entry screen. In this example we’re seeing Movable Type, a weblogging
tool written in Perl.
Click here to read comments

This screen shows the published blog.

One of the nice features of most blogging applications is that they allow readers to leave their comments.

This example shows a link that indicates the number of comments attached to a blog entry. If you click that
link...
The reader comments usually appear on the screen or in a separate browser window.

Another neat feature like this, using XML RPC[1], is called “TrackBack”. This feature allows people to reference
the URL for this blog entry on their own weblog, and then their trackback ping appears on my weblog.

[1] XML Remote Procedure Calling protocol. See Userland http://www.xmlrpc.com/


Enter the k-log
Sounds good, but why use them
inside the intranet?

In the current economy, some
individuals are breaking away
from traditional KM to do KM on
a Budget.

Low cost weblog tools are
available to help with 2 core
concerns of KM.

Knowledge creation
(publishing)

Knowledge sharing (XML
feed aggregators)

You might be thinking, “Sounds good, but why would I want to blog in the intranet?”

The type of weblogs we’re seeing in the intranet are a special type called knowledge logs or k-logs.

A few articles in the past year have discussed the advantage of using weblogging software to handle some
aspects of knowledge management.

The first advantage is that it offers a low cost alternative to doing knowledge management. I’ve read on
discussion groups that a lot of people find weblog software appealing after seeing larger KM efforts fail.

And weblogs help people to handle 2 core concerns of knowledge management:

Knowledge creation and sharing


K-logging is about knowledge share
Bloggers are often subject area experts
“Free-loading” on these experts helps grow
the knowledge of individuals and the
organization
There’s advantage in belonging to a social
network formed around research interests

Additionally, the people who are adopting this form of web publishing are savvy web users who read blogs
outside of work and know that relying on other people whose expertise you trust is a great way of growing
your own knowledge.

Additionally, bloggers, I think, find great advantage in being connected to others who share research interests.
Social networks or societies of bloggers who read each other’s opinion often form and in this way, ideas are
challenged and tested.
Fast, cheap and in total control
Fast (and easy): Set up is quick and doesn't
require much expertise.
Cheap: Powerful personal publishing solutions
at low cost.
In total control: The real power in weblogging
is that it puts knowledge creation in your
control and also allows you a standards-based
mechanism for pushing/sharing that
information.

The bottom line, I think, is that people on the intranet are using blogs for web publishing because

1) They’re quick and easy to set up. Most setup will cost you 15 to 30 minutes if your company has web space
available for you.

2) They offer a great amount of functionality at very low cost. Some weblogging software is free.

3) And probably most importantly, weblogging puts knowledge creation and sharing in your hands. You don’t
need to rely on the processes and technologies of anyone else to do this and the sharing mechanism uses
standards based XML, which means that your data can be re-used elsewhere.
How we are supporting k-loggers
XML feeds of databases
News data
ABI/Inform
Technical documents
Almost any data set can be mapped to the
standard RSS format
Email discussion groups, CRM, directory of
new personnel

So with the appearance of new weblogs in the intranet, my organization has begun to discuss how to support
k-loggers.

One area has to do with supporting their knowledge creation.

The first blog user came to us asking for news feeds on specific topics. What we have done is to provide
database search results in RSS format so she can do any complex or simple search for a topic she has in
mind and then have a URL that will serve as a news feed that she can feed into a news reader or aggregator
of her choosing.
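As a rough illustration of serving search results as a feed, here is a minimal Python sketch that turns a list of records into an RSS 2.0 document. The function name `results_to_rss`, the record fields, and the `intranet.example` URLs are assumptions made for the example; they do not describe the actual system behind this talk.

```python
import xml.etree.ElementTree as ET


def results_to_rss(query, feed_url, records):
    """Render search results as a minimal RSS 2.0 document (returned as str)."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 channels carry a title, link and description.
    ET.SubElement(channel, "title").text = f"Search results: {query}"
    ET.SubElement(channel, "link").text = feed_url
    ET.SubElement(channel, "description").text = f"Records matching '{query}'"
    for record in records:
        # Brief record format: title, URL and abstract for each hit.
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = record["title"]
        ET.SubElement(item, "link").text = record["url"]
        ET.SubElement(item, "description").text = record["abstract"]
    return ET.tostring(rss, encoding="unicode")


# Hypothetical records standing in for a database search result.
sample_records = [
    {"title": "Indexing at scale", "url": "http://intranet.example/doc/1",
     "abstract": "A short abstract."},
    {"title": "Classification schemes", "url": "http://intranet.example/doc/2",
     "abstract": "Another abstract."},
]
rss_text = results_to_rss(
    "classification",
    "http://intranet.example/search?q=classification&fmt=rss",
    sample_records,
)
print(rss_text)
```

Because the query lives in the URL, any saved search becomes a feed address that can be pasted into a news reader.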
What our users do with the RSS

This example shows part of a database search result page. This particular database is our Selected News
database which pulls indexed content from published news sources via Factiva.

In this search above, I entered terms “Classification, indexing and abstracting” as my query and the search
results show a lot of records.

Embedded within the search results is an option to view the results as XML (which is a dump of the search
results contents with all fields) or in RSS (a dump of the results in a brief record format showing title, URL, and
abstract for each record).

Bloggers copy the URL for the RSS feed and can then use it in their own aggregators.
Adding the URL for the RSS feed
to your news aggregator.

Here’s an example of how a user might follow a news feed. This example is using Radio
Userland, a weblog publishing tool with an integrated feedreader.

In the news aggregator, I have Hack the Planet as a website I’m following
In news aggregator view, user sees a story they want to blog

Selecting the POST button copies that story's URL and title to a new blog entry in the editing form.
So where do we go from here?

Keep close watch of weblogs in the
enterprise
As weblogs proliferate, work on
strategy for making blog data usable
Talk to the bloggers who might benefit
from sharing and finding

So now we’ve prepared our company to use the data we pull in daily from various news
vendors and internal databases. Where do we go from here?

I think already we’ve done more to support k-loggers than is expected, but we’re also
hoping to support their efforts if weblogs start to proliferate in the company.
Consider your place in the ecology
The natural progression in an information
ecology where k-loggers start to proliferate is
to seek a system that pulls together the
disparate k-log data.
The role of the information services
organization is to glue together the aggregate
of produced k-log data for its users to
consume.

The natural progression in an information ecology where k-loggers start to proliferate is
to seek a system that pulls together the disparate k-log data. This is where the
information services organization comes in.

The XML feeds that k-loggers produce are almost always in one of a few standard RSS
or RDF formats. This is the common element that allows your organization to glue
together the aggregate of produced k-log data for its users to consume.
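A minimal sketch of that gluing step, in Python, might look like the following. It assumes plain RSS 2.0 feeds whose items all carry a `pubDate`; RDF-based feeds use XML namespaces and would need extra handling. The function names and the sample feeds are hypothetical.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime


def parse_items(feed_xml, source):
    """Return the items of one RSS 2.0 feed as a list of dicts."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "source": source,
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            # RSS 2.0 dates are RFC 822 strings; a missing pubDate is not handled.
            "published": parsedate_to_datetime(item.findtext("pubDate")),
        })
    return items


def aggregate(feeds):
    """Merge items from several feeds (name -> XML text), newest first."""
    merged = []
    for source, feed_xml in feeds.items():
        merged.extend(parse_items(feed_xml, source))
    return sorted(merged, key=lambda entry: entry["published"], reverse=True)


# Two hypothetical k-log feeds, inlined so the sketch needs no network access.
FEED_A = """<rss version="2.0"><channel><title>A</title>
<item><title>Older post</title><link>http://a.example/1</link>
<pubDate>Mon, 01 Sep 2003 09:00:00 GMT</pubDate></item>
</channel></rss>"""
FEED_B = """<rss version="2.0"><channel><title>B</title>
<item><title>Newer post</title><link>http://b.example/1</link>
<pubDate>Tue, 02 Sep 2003 09:00:00 GMT</pubDate></item>
</channel></rss>"""

for entry in aggregate({"A": FEED_A, "B": FEED_B}):
    print(entry["published"].date(), entry["source"], entry["title"])
```

In practice the feed text would be fetched from each k-log's feed URL on a schedule; that part is left out here.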

But you can do more to that mass of data to make it findable.


Collecting blog data
How do we do that?
Use metadata to create rich bibliographic
records of each entry (author, publisher,
date, etc.)
Included in the process of recording
metadata is to help people make sense of
that data by classifying it -- by topic/subject.

Obviously you need to first begin collecting and aggregating that data. So you will
need a technical strategy for doing that. But once you’ve got the data, the real work is
in making it findable.

The first step is to use metadata to create bibliographic records for each entry. You can
rely on a standard such as the Dublin Core metadata elements to help structure your
own metadata schema.
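
As a rough illustration, a bibliographic record for one blog entry might look like the sketch below. The element names (title, creator, publisher, date, identifier, subject) come from the Dublin Core element set; the function name to_dublin_core, the entry fields and the sample values are assumptions made up for this example.

```python
def to_dublin_core(entry, publisher):
    """Map one aggregated blog entry onto Dublin Core style elements."""
    return {
        "dc:title": entry["title"],
        "dc:creator": entry["author"],
        "dc:publisher": publisher,
        "dc:date": entry["date"],
        "dc:identifier": entry["link"],
        "dc:subject": [],  # filled in later, during classification
    }

# Hypothetical entry as it might come out of an aggregator.
entry = {
    "title": "Notes from the search vendor call",
    "author": "A. Blogger",
    "date": "2003-09-16",
    "link": "http://intranet.example/klog/1",
}
record = to_dublin_core(entry, publisher="Information Services")
print(record)
```

Leaving dc:subject empty at this stage keeps the recording of basic metadata separate from the classification step.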

The next step is to consider some form of classification, organizing the blog entries by
topic or subject.
Making blog data findable
Make the aggregate of collected blog entries
available by publishing it
Make searching and browsing of indexed blog
entries possible
Our organization already does a lot of the text
parsing, classification and republishing that is
needed to make a blog aggregator fly
Offer varying means of use and notification when
new relevant data comes in. Email alerts, etc.

First you make the aggregate of blog entries available in a raw feed for re-use and also
offer a reverse-chronologically sorted spool of recent blog entries.
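
A reverse-chronological spool is simple to produce once entries from every feed share a date field. The sketch below assumes dates have already been normalized to ISO 8601 strings; build_spool and the sample entries are hypothetical names and data used only for illustration.

```python
from datetime import datetime
from itertools import chain

def build_spool(feeds, limit=20):
    """Merge entries from several feeds and return the newest ones first."""
    merged = chain.from_iterable(feeds)
    newest_first = sorted(
        merged,
        key=lambda e: datetime.fromisoformat(e["date"]),
        reverse=True,
    )
    return newest_first[:limit]

# Hypothetical entries from two weblogs.
feeds = [
    [{"title": "Taxonomy update", "date": "2003-09-15T14:30:00"},
     {"title": "Vendor feed outage", "date": "2003-09-12T08:00:00"}],
    [{"title": "Meeting afterthoughts", "date": "2003-09-16T09:00:00"}],
]
spool = build_spool(feeds)
for entry in spool:
    print(entry["date"], entry["title"])
```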

Next make it possible for people to search and browse the indexed blog entries in the
collection. Our organization does a lot of text parsing, classification and republishing,
and I’ll explain that process in the next slide.

Finally offer other means of use and notification such as email alerts.
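
One way to picture notification is as a match between the subjects assigned to a new entry and the subjects people have asked to follow. This is only a sketch under that assumption; match_alerts, the subscription table and the user names are invented for the example, and actually sending the email is left out.

```python
def match_alerts(entry, subscriptions):
    """Return the users whose subscribed subjects overlap the entry's subjects."""
    subjects = set(entry["subjects"])
    return sorted(
        user for user, wanted in subscriptions.items()
        if subjects & set(wanted)
    )

# Hypothetical subscriptions: user -> subjects they want alerts for.
subscriptions = {
    "alice": ["search", "taxonomy"],
    "bob": ["licensing"],
}
entry = {"title": "Tuning the search thesaurus", "subjects": ["search"]}

recipients = match_alerts(entry, subscriptions)
print(recipients)
```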
Our process
Start: Raw feed from various sources
(vendor data, internal databases, weblogs)

Data from external feeds
are loaded into server

Machine algorithms do some
clustering and auto-classification
using subject taxonomy

Human indexers review auto-classification
and correct or add index terms

Classified data is stored

Finish: Classified data is served through
web server requests

I’ve already had someone on a discussion group tell me to just throw Google at the data to make content
findable.

I don't know that search engines are the answer to every problem. Yes, search is necessary, but are
search engines the front end you want to use for all types of databases?

We do, in fact, have search engines in-house that do cluster analysis and offer categorized and relevancy
ranked web site search results. But we aggregate a lot of data -- most not from websites -- and our process
for indexing this data uses a combination of machine and human indexing. Computer algorithms have not
proven to be capable of discerning some concepts as well as humans.

But anyway, this is a high-level overview of what we do.
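
To make the machine-plus-human idea concrete, here is a toy sketch of the two classification steps. Simple keyword matching against a tiny taxonomy stands in for the clustering and auto-classification software, which it does not attempt to reproduce; TAXONOMY, auto_classify and apply_review are hypothetical names for this example only.

```python
# Toy subject taxonomy: subject term -> phrases that suggest it.
TAXONOMY = {
    "search": ["search engine", "relevancy", "index"],
    "knowledge management": ["k-log", "knowledge sharing"],
}

def auto_classify(text, taxonomy=TAXONOMY):
    """Machine step: suggest subject terms whose phrases appear in the text."""
    lowered = text.lower()
    return sorted(
        term for term, phrases in taxonomy.items()
        if any(phrase in lowered for phrase in phrases)
    )

def apply_review(suggested, add=(), remove=()):
    """Human step: an indexer removes wrong terms and adds missing ones."""
    return sorted((set(suggested) - set(remove)) | set(add))

record = {"text": "Our k-log entries now show up in the search engine."}
record["suggested"] = auto_classify(record["text"])
# The indexer decides the entry is not really about search, and adds a term.
record["subjects"] = apply_review(
    record["suggested"], add=["intranet"], remove=["search"]
)
print(record["suggested"], record["subjects"])
```

Keeping the machine suggestions alongside the reviewed subjects makes it possible to see how often the indexers have to correct the software.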


One way you could do it
The brute force method
Use an aggregator (e.g. Radio or Drupal)
Get humans to blog and classify the blogs
Web-based aggregator (brute-force) example

This screen shows the web-based aggregator built into the Drupal application, which we use on the IA
community blog, iaslash.

The “Latest news” shows the most recent blog entries collected from various blogs that we watch. From this
page, we can select the news that’s relevant to our community and enter it into our database.
Web-based aggregator (brute-force) example

This is a view of the home page, which shows how our database is displayed with classification shown below
each entry.
Other ways you can do it
Just use search software with automated
classification
Consider our hybrid approach

And you can always rely on software to automatically do the classification of data for you. Many search
vendors offer some sort of module that allows for this kind of classification, but often some human intervention
is needed to help guide or tweak the classification module.

Finally you can consider our hybrid, semi-automated approach. Our take is that there is a bigger return of
quality indexing when you insert humans into the process.
Success
It’s important to note that the success of KM
depends on the willingness of individuals to
participate by using tools that will integrate
seamlessly with the organization’s knowledge
network.
Deloitte suggests that while localized KM
efforts may not require knowledge networks in
small organizations, the advantage of
knowledge networks becomes manifest when
communities express the need to re-use that
localized knowledge.
Sustainability
Another important factor is sustainability
If you plan on doing automated classification
of data, human resources will be needed at
some point to set up taxonomies
If you plan on a hybrid machine- and human-
aided indexing process, full-time staff might be
needed
Closing thoughts
Weblogs are really not different as a
technology, although they put control of
publishing closer to users
Classifying weblog data can be difficult and
requires human resources, but some search
applications can help
Value diversity and above all, support users’
needs
Allow users to produce organizational
knowledge using whatever tools they choose

While comments, trackbacks and XML feeds are useful, as a technology, weblogs are really not very different from other applications made for
web publishing.

What makes them different is how they’re used from the end-user perspective. They put control of web publishing in the hands of end users, who
can decide what process they want to use for sharing knowledge and what technology to use.

Some people view this amount of control and power to publish as a danger if bloggers record too much information. However, I think one of the
ways to produce organizational knowledge is to record all sorts of tacit knowledge including ephemeral communications, as well as meeting
afterthoughts and opinions.

Making the output from weblogs usable can be as simple as allowing your search engine to spider and index their content, but if your search
application doesn’t allow for classification, browsing content by attributes such as subject, business unit, and product might not be
possible. Classification allows the system to represent the knowledge contained in data more consistently.

The caveat to doing classification is that neither an automated nor a manual process can give you the best results on its own. So the investment in doing
classification might require additional time and resources if you don’t already have indexing staff available. Some search vendors offer
classification, however, so this may be a good route to pursue if human resources aren’t available.

Finally, I think it’s important to keep in mind the needs of the users who want or need to blog and to encourage it if it results in
knowledge sharing. While there are bound to be a lot of people who want to protect their intellectual capital, there are probably equal numbers of
people who see value in sharing their knowledge, and if using a cheap and easy weblogging tool is how they find their way to doing that, that
can’t be a bad thing.
Further reading on info. ecology and KM
Bonnie Nardi and Vicki O’Day. “Information Ecologies:
Using Technology with Heart.” First Monday.
http://www.firstmonday.dk/issues/issue4_5/nardi_contents.html

Robin Athey. “Collaborative Knowledge Networks:
Driving Workforce Performance Through Web-enabled
Communities.” Deloitte Consulting.
http://www.dc.com/Insights/research/cross_ind/ckn_workforce.asp

Joshua Walker. “The New Knowledge Management
Landscape.” Forrester Research.
http://www.forrester.com/ER/Research/Brief/Excerpt/0,1317,15338,00.html
Further reading about k-logs
John Foley. “Are You Blogging Yet?” InformationWeek. Discusses the
value of using weblogs in the enterprise.
http://www.informationweek.com/story/IWK20020719S0001

David Weinberger. “The 99 cent KM solution.” KM World.
http://www.kmworld.com/publications/magazine/index.cfm?action=readarticle&Article_ID=1337&Publication_ID=76

John Robb. “A simple approach to KM.” K-Logs discussion group.
http://groups.yahoo.com/group/klogs/message/313

K-Logs discussion group. Email discussion group that discusses
klogging for KM. http://groups.yahoo.com/group/klogs

A Klog Apart. Phil Wolff's klog about klogging.
http://www.dijest.com/aka/
Where to get software
Weblog software
“Weblog publishers.” Open Directory Project.
http://dmoz.org/Computers/Internet/On_the_Web/Weblogs/Tools/Publishers/

RSS news reader client software
“News Readers.” Open Directory Project.
http://dmoz.org/Reference/Libraries/Library_and_Information_Science/Technical_Services/Cataloguing/Metadata/RDF/Applications/RSS/News_Readers/

Server-based news aggregator software
AmphetaDesk. http://www.disobey.com/amphetadesk/
blagg. http://www.oreillynet.com/~rael/lang/perl/blagg/
Drupal. http://drupal.org
Radio Userland. http://radio.userland.com/multiAuthorWeblogTool

Thank you :: The end
