they dealt with static data content. For example, let's say Acme Shoes published a Web page containing the phone number and address of each corporate office. This information wasn't likely to change very often, so a manual, one-time conversion of the spreadsheet containing the contact information was probably sufficient to create a Web page.

Although static content had its place in early Web sites, today's e-commerce sites deal with dynamic, real-time data. For example, if you're interested in booking an airline flight, you'll want real-time information showing the airline's latest prices, flight schedules and so on. E-commerce sites must rely on dynamic data extraction methods to retrieve this type of data, because it is constantly changing.

Most Internet start-ups won't have issues with such real-time data extraction, because they'll most likely build their Web sites using modern, Internet-aware tools. Traditional brick-and-mortar companies, on the other hand, have faced significant challenges attempting to interface their Web sites with legacy systems.

Going back to the Acme Shoes example, suppose the company's goal is to let users purchase shoes online and check the availability of shoe styles, colours and sizes before placing an order. Unfortunately, this type of information is housed in a database on Acme's 15-year-old mainframe, which is oblivious to the Internet. Acme's challenge is to retrieve dynamic information without redesigning its existing legacy system. Ideally, the company would like to extract information from its mainframe system and publish it in a format usable by each new system under development.

At its most fundamental level, this process breaks down into two basic tasks: collecting the information and distributing it. The Internet community has adopted two terms to describe these tasks: content aggregation and content syndication.
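As a rough sketch of the kind of extraction Acme might perform, consider a small Java program that reads fixed-width mainframe inventory records and republishes them as XML. The record layout and field names here are invented for illustration; the article does not describe Acme's actual data format.

```java
// Hypothetical sketch: convert a fixed-width legacy inventory record to XML.
// The layout (style 10 chars, colour 8, size 4, quantity 6) is an assumption
// made for this example, not Acme's real mainframe format.
public class LegacyToXml {
    public static String recordToXml(String record) {
        String style  = record.substring(0, 10).trim();
        String colour = record.substring(10, 18).trim();
        String size   = record.substring(18, 22).trim();
        String qty    = record.substring(22, 28).trim();
        return "<shoe style=\"" + style + "\" colour=\"" + colour
             + "\" size=\"" + size + "\" quantity=\"" + qty + "\"/>";
    }

    public static void main(String[] args) {
        // A 28-character fixed-width record as it might arrive from the host.
        String legacyRecord = "OXFORD    BROWN   9   000042";
        System.out.println(recordToXml(legacyRecord));
    }
}
```

Once the data is in XML, any downstream system under development can consume it without knowing anything about the mainframe's record layout.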
When you apply XML's information exchange prowess to both of these tasks, the result is a cross-platform solution that's hard to beat.
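The cross-platform claim rests on the fact that any standards-compliant parser can read the same XML document, whatever system produced it. As a minimal sketch, the following uses the standard Java DOM API to consume a hypothetical availability document; the element and attribute names are invented, since the article defines no actual schema.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import java.io.ByteArrayInputStream;

public class XmlExchange {
    // Parse a (hypothetical) availability document and return the quantity
    // attribute of the first <shoe> element.
    public static String firstShoeQuantity(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));
        Element shoe = (Element) doc.getElementsByTagName("shoe").item(0);
        return shoe.getAttribute("quantity");
    }

    public static void main(String[] args) throws Exception {
        // The same document could be produced by a mainframe extract job
        // and consumed by any XML-aware client on any platform.
        String xml = "<availability>"
                   + "<shoe style=\"OXFORD\" colour=\"BROWN\" size=\"9\" quantity=\"42\"/>"
                   + "</availability>";
        System.out.println(firstShoeQuantity(xml));
    }
}
```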
AGGREGATION AND SYNDICATION
Aggregation is the process of collecting, organising and combining data from disparate sources. This concept is often associated with large Internet portals, such as Excite and Yahoo! In reality, many different types of sites deal with content aggregation. In fact, the demand for tools that simplify aggregation and syndication has increased so much that several products have surfaced that specialise in handling one or both of these tasks.

One such product is Software AG's Tamino (www.softwareagusa.com), an XML-based information server that provides integrated access to existing legacy data, such as relational databases or Office files. Because Tamino is based on XML, unexpected changes in the format of a data stream can be processed based on embedded metadata, or data about data.

Another is Metaphoria, from Information Architects, which primarily supports aggregation and also handles some syndication. Metaphoria, written in server-based Java, organises references to data rather than the actual data itself. To accomplish this, Metaphoria includes tools that collect and build a repository of metadata links. This repository is then used to access and publish data as needed. Metaphoria's aggregation agents gather input from a variety of sources, including databases, XML documents and HTML documents.

Syndication is the process of disseminating data. On the Internet, syndica-