file:///media/T hunder/home/patrick/personal/writing...

SAX2, a Good Example of the Visitor Pattern
patrick@intervideo.com
$Date: 2000-09-18 03:38:58-07 $ $Revision: 1.2 $ $State: Exp $

Summary: Besides DOM, SAX2 is another simple way to process XML documents. The design of SAX2 is based on the Visitor pattern. This article introduces SAX2 briefly and explains its relation to the Visitor pattern.

What Is SAX2?
Have you ever thought that parsing XML documents using DOM (Document Object Model) is a nightmare? Here is the remedy: SAX2, the Simple API for XML, a fast, low-memory alternative for processing XML documents.

SAX2 is a push-model parser. In other words, you provide the handlers and the parser calls them when a particular event occurs, such as the start of a document or the start or end of an element. The SAX2 parser generates several categories of events, including events that occur in the content of the XML document, events that occur in the DTD, and error events. To handle these events, you implement a corresponding handler class that contains methods to process the appropriate events. Note that you only need to implement handlers for those events you wish to process. If a handler is not implemented for a specific type of event, the event is simply ignored.

The following is a very simple command-line application that reads an XML file and prints the file's tags to the console window (there seem to be some errors in the example code). The application implements only the content handler. It consists of the following files:

MyContent.h: header file for the content handler.
MyContent.cpp: implementation of the content handler.
TestSax.cpp: the "command-line" console application.

1 of 6

08/16/2008 10:31 PM


// TestSax.cpp
#include "stdafx.h"     // Again, you need headers.
#include <stdio.h>      // This is needed only to print something.
#include <stdlib.h>     // mbstowcs
#include "MyContent.h"  // References to SAX are hidden here.

int main(int argc, char* argv[])  // Start!
{
    if (argc < 2)  // Verify that you have argv[1].
    {
        printf("Usage: TestSax <url>\n");
        return 1;
    }

    CoInitialize(NULL);  // Some magic to start COM. You may want to
                         // use CoInitializeEx instead.

    ISAXXMLReader* pRdr = NULL;

    // Create parser. (A bit more magic...)
    HRESULT hr = CoCreateInstance(__uuidof(SAXXMLReader), NULL, CLSCTX_ALL,
                                  __uuidof(ISAXXMLReader), (void **)&pRdr);
    if (!FAILED(hr))
    {
        MyContent* pMc = new MyContent();

        // Set your own content handler (and other handlers as well).
        // And in real life, check this hr!
        hr = pRdr->putContentHandler(pMc);

        // parseURL expects a Unicode string; argv[1] is ASCII.
        static wchar_t URL[1000];
        mbstowcs(URL, argv[1], 999);
        hr = pRdr->parseURL(URL);  // and parse!

        pRdr->Release();  // Now just some cleanup work...
        delete pMc;       // MyContent's AddRef/Release do nothing, so we
                          // still own the handler and must free it.
    }
    else
    {
        printf("\nUh-oh... %08X\n\n", hr);  // Hopefully this will not
                                            // happen, but let's be ready.
    }

    CoUninitialize();  // And finally, again, some magic to
                       // uninitialize COM.
    return 0;
}

// MyContent.h
#import <msxml3.dll> raw_interfaces_only  // We use this library.
using namespace MSXML2;                   // and everything for SAX is in this namespace

class MyContent : public ISAXContentHandler
{
public:
    MyContent();           // Define constructor and destructor.
    virtual ~MyContent();

    // Copy all methods from ISAXContentHandler.
    // interface...
    virtual HRESULT STDMETHODCALLTYPE startDocument(void);
    virtual HRESULT STDMETHODCALLTYPE endDocument(void);
    virtual HRESULT STDMETHODCALLTYPE startPrefixMapping(
        const wchar_t __RPC_FAR *pwchPrefix, int cchPrefix,
        const wchar_t __RPC_FAR *pwchUri, int cchUri);
    // Declared here because MyContent.cpp defines it.
    virtual HRESULT STDMETHODCALLTYPE startElement(
        wchar_t __RPC_FAR *pwchNamespaceUri, int cchNamespaceUri,
        wchar_t __RPC_FAR *pwchLocalName, int cchLocalName,
        wchar_t __RPC_FAR *pwchRawName, int cchRawName,
        ISAXAttributes __RPC_FAR *pAttributes);
    // The listing stops here. The remaining ISAXContentHandler methods
    // (putDocumentLocator, endPrefixMapping, endElement, characters,
    // ignorableWhitespace, processingInstruction, skippedEntity) must be
    // declared the same way before the class can be instantiated.

    // ...and the underlying IUnknown interface...
    long __stdcall QueryInterface(const struct _GUID &, void **);
    unsigned long __stdcall AddRef(void);
    unsigned long __stdcall Release(void);

    // ...and add whatever you like to simplify implementation.
private:
    void prt(wchar_t *pwchFmt, const wchar_t __RPC_FAR *pwchVal, int cchVal);
};

// MyContent.cpp
#include "stdafx.h"  // We need the headers...
#include <stdio.h>
#include <stdlib.h>
#include <wchar.h>   // wcsncpy, wprintf
#include "MyContent.h"

// Usually there is nothing to do in constructors and destructors,
// but if so, do it here.
MyContent::MyContent() {}
MyContent::~MyContent() {}

// Now finish the IUnknown stuff.
// (However, keep in mind that if you want to implement handlers as COM
// objects, you can add more functionality than shown with these
// methods.)
long __stdcall MyContent::QueryInterface(const struct _GUID &, void **)
{
    return 0;
}
unsigned long __stdcall MyContent::AddRef() { return 0; }
unsigned long __stdcall MyContent::Release() { return 0; }

// Now get down to business.
// First decide which events you want and which you don't.
// It's simple for methods you don't want:
HRESULT STDMETHODCALLTYPE MyContent::startDocument()
{
    return S_OK;  // Return S_OK to continue.
                  // Any error return code will abort parsing.
}
// (The other events you don't want, such as endDocument and
// startPrefixMapping, are handled the same way.)

// And for events you want, do whatever you want!
HRESULT STDMETHODCALLTYPE MyContent::startElement(
    /* [in] */ wchar_t __RPC_FAR *pwchNamespaceUri,
    /* [in] */ int cchNamespaceUri,
    /* [in] */ wchar_t __RPC_FAR *pwchLocalName,
    /* [in] */ int cchLocalName,
    /* [in] */ wchar_t __RPC_FAR *pwchRawName,
    /* [in] */ int cchRawName,
    /* [in] */ ISAXAttributes __RPC_FAR *pAttributes)
{
    // I want to print the tag name.
    prt(L"\n<%s>", pwchLocalName, cchLocalName);
    return S_OK;
}

// "prt" is a private method.
// SAX does not use it. You don't have to implement it or any other
// private methods.
// This one is just quick print.
void MyContent::prt(wchar_t *pwchFmt,
                    const wchar_t __RPC_FAR *pwchVal,
                    int cchVal)
{
    static wchar_t val[1000];
    cchVal = cchVal > 999 ? 999 : cchVal;  // Clamp to the buffer size.
    wcsncpy(val, pwchVal, cchVal);         // The input is not null-terminated,
    val[cchVal] = 0;                       // so terminate the copy ourselves.
    wprintf(pwchFmt, val);
}

Does it seem familiar to you? What pattern have you found in this example? Yes, the Visitor pattern. The Visitor pattern uses the double-dispatch technique to separate the traversal logic from the processing logic. Here the traversal logic lies in the implementation of ISAXXMLReader, and the processing logic lies in the implementation of ISAXContentHandler. Let us illustrate this example with a UML diagram:


The Visitor pattern is definitely a good fit when only simple traversal and processing are needed. On the other hand, if you want to do some complex editing, Visitor is probably not a good choice, because the visitor plays a passive role.

References
Design Pattern Thinking In Java, Bruce Eckel
SAX2 Jumpstart for XML Developers, Eldar A. Musayev
How to traverse DirectShow Graph using Visitor pattern, Tsai Ying-Hau - I will add this link when I finish it. :-)

Revision History
$Log: SAX2\040a\040good\040example\040of\040Visitor\040pattern.htm,v $
Revision 1.2 2000-09-18 03:38:58-07 patrick
+ Complete summary
+ SAX brief introduction
+ Add pattern illustration diagram
+ More explanations about visitor pattern
+ One more reference link
Revision 1.1 2000-09-18 02:28:24-07 patrick
* Reformat
Revision 1.0 2000-09-17 20:26:34-07 patrick
Initial revision
