Collaboration and Content Strategies

In-Depth Research Overview

What's Up, .DOC? ODF, OOXML, and the Revolutionary Implications of XML in Productivity Applications
Version: 1.0, Jan 11, 2008

AUTHOR(S):
Peter O'Kelly
(pokelly@burtongroup.com)

Guy Creese
(gcreese@burtongroup.com)

TECHNOLOGY THREAD:
Content Management

Conclusion
The OpenDocument Format (ODF)/Office Open XML (OOXML) debate is part of a significant phase in the evolution of productivity applications, with the shift to Extensible Markup Language (XML) file formats displacing traditional binary and proprietary file formats. The stakes are huge, with compelling new opportunities for content management, as well as both opportunities and challenges for software vendors. Organizations will gain important benefits by exploiting opportunities to improve information management and reduce vendor dependencies by shifting to XML file formats.



Publishing Information
Burton Group is a research and consulting firm specializing in network and applications infrastructure technologies. Burton works to catalyze change and progress in the network computing industry through interaction with leading vendors and users. Publication headquarters, marketing, and sales offices are located at: Burton Group 7090 Union Park Center, Suite 200 Midvale, Utah USA 84047-4169 Phone: +1.801.566.2880 Fax: +1.801.566.3611 Toll free in the USA: 800.824.9924 Internet: info@burtongroup.com; www.burtongroup.com Copyright 2007 Burton Group. ISSN 1048-4620. All rights reserved. All product, technology and service names are trademarks or service marks of their respective owners. Terms of Use: Burton customers can freely copy and print this document for their internal use. Customers can also excerpt material from this document provided that they label the document as Proprietary and Confidential and add the following notice in the document: Copyright © 2007 Burton Group. Used with the permission of the copyright holder. Contains previously developed intellectual property and methodologies to which Burton Group retains rights. For internal customer use only. Requests from non-clients of Burton for permission to reprint or distribute should be addressed to the Client Services Department at +1.801.304.8174. Burton Group's Collaboration and Content Strategies service provides objective analysis of networking technology, market trends, vendor strategies, and related products. The information in Burton Group's Collaboration and Content Strategies service is gathered from reliable sources and is prepared by experienced analysts, but it cannot be considered infallible. The opinions expressed are based on judgments made at the time, and are subject to change. Burton offers no warranty, either expressed or implied, on the information in Burton Group's Collaboration and Content Strategies service, and accepts no responsibility for errors resulting from its use.

If you do not have a license to Burton Group's Collaboration and Content Strategies service and are interested in receiving information about becoming a subscriber, please contact Burton Group.

Table Of Contents
Synopsis
Analysis
  What's In a Name?: Innuendo
  Productivity Application Market Dynamics
    Productivity Application Domains
    Productivity Application Model Concerns
      Content
      Presentation
      Structure
      Behavior
      Application Settings
      Packages
      Templates
    Productivity Application Workflow
    The Software as a Service Shift
    Recap: New Realities and Requirements
  ODF, OOXML, and Other Document Models
    OpenDocument Format
    OOXML
    Other Important Document Models
    File Format Translators: Actual Results May Vary . . .
    The Parallel Universe of the World Wide Web
    Reciprocal Benefits: Standards and Standards Procedures
    Recap: File Format Form Follows Function
  Projections
    OOXML Will Be Successful
    Microsoft Will Aggressively Compete but Also Play Well with Others on OOXML
    ODF Will Continue, Albeit in a Relatively Minor Role
    The W3C Model Will Prevail in Many Domains
    PDF Will Continue to Dominate Nonrevisable Document Contexts
    New Vendor Challenges and Opportunities
  Recommendations
    Distinguish Between Application/Service and Format Decisions
    Discount the Political FUD (Fear, Uncertainty, and Doubt)
    Require Vendors to Sincerely Commit and Contribute to Related Standards
    Exploit OOXML
    For Now, Use ODF by Exception Rather than by Default
    Get Out of the (Traditional) Office More Often
The Details
  An OOXML Example
  An ODF Example
  OOXML Resources
  ODF Resources
Conclusion
Notes
Author Bio

BURTON GROUP 7090 Union Park Center Suite 200 Midvale · Utah 84047 · P 801.566.2880 · F 801.566.3611 · www.burtongroup.com

Synopsis
The software industry has rarely seen debates as intense as those surrounding OpenDocument Format (ODF) and Office Open XML (officially “Ecma 376 Office Open XML” [also known as ECMA-376 and OOXML]) during recent years. It's a story that has many elements appropriate for a James Bond movie, with multibillion-dollar business empires at risk, global political intrigue, and even some conspiracy theories at the intersection of capitalism (commercial software products), democracy (industry standards), and communism (e.g., related standards controlled by the People's Republic of China). This is improbably heady stuff for what's ultimately a debate about something as mundane as file formats.

The ODF/OOXML debate is significant in part because it will have a major influence on the future success of Microsoft Office, one of Microsoft's largest and most profitable product families. If ODF, the primary rival to the Microsoft-sponsored OOXML format, prevails, ODF-based productivity application suites, including OpenOffice.org-derived products such as IBM Lotus Symphony, the Novell Edition of OpenOffice.org, and Sun StarOffice, may gain market momentum at the expense of Microsoft Office. The productivity application market shift to Extensible Markup Language (XML) file formats may also expand opportunities for software as a service (SaaS) alternatives such as Google Apps.

Standards created by the World Wide Web Consortium (W3C) play a subtly significant role in the ODF/OOXML debate. While few people think of the W3C in productivity application domains today, both ODF and OOXML leverage W3C standards, and it's possible the productivity application model will hit a tipping point, with a focus on webpages containing document, spreadsheet, and presentation components displacing the traditional stand-alone productivity application file model.

From an enterprise perspective, the shift to XML file formats presents new opportunities for improved information management and reduced vendor lock-in. To exploit the emerging opportunities, organizations must understand how and why productivity applications have radically changed over the last decade. They must also undertake a mindset shift from application file management to XML content component management, with productivity applications serving more as specialized editors than stand-alone quasi-platforms. It's also critically important to provide tool selection guidance to information workers, to reduce the potential for defaulting to e-mail messages with productivity application file attachments.


Analysis
The recent industry debate about OpenDocument Format (ODF) and Office Open XML (OOXML) often comes down to the blunt question, “Which one will lead?” There are three answers.

The first answer is, “It depends on who you are.” On one hand, government agencies and other organizations seeking to use a free, non-Microsoft productivity suite will be happy to use ODF, the file format behind OpenOffice.org (and derivatives such as IBM Lotus Symphony). On the other hand, libraries and large businesses, faced with storing and using years of Microsoft Office legacy documents, will prefer OOXML, as OOXML can more faithfully recreate the look and metadata (such as spreadsheet formulas) stored in Microsoft's binary file formats.

The second answer is, “Within the larger market, OOXML will lead,” for three reasons.

First, many enterprises are not that caught up in the standards debate; they just want to use what works for their needs. Microsoft Office 2007 defaults to storing documents in OOXML format, so, by migrating to Office 2007, many companies will let Microsoft make the decision for them.

Second, OOXML is an extensible standard. It allows vendors and enterprises to extend the standard within an OOXML-defined framework. For example, the .XLSM file format, used to support a Microsoft Office 2007 Excel macro-enabled workbook, is not part of the base OOXML standard, but rather a Microsoft-created extension. This built-in ability to augment the OOXML standard is a safety valve for future innovation, allowing new features to be added without forcing vendors to invent yet another separate file format or wait for standards bodies to give their approval. While such extensions initially decrease interoperability, it's Burton Group's belief that this issue will resolve itself over time, as popular extensions are adopted by other vendors or eventually move into the baseline specification.

Third, OOXML supports “overlay” custom schemas (not in ODF 1.0, promised in ODF 1.2), which can be used as views into the business information stored in documents. This separation of document and views allows enterprises to more easily perform tasks such as programmatically updating a “Stock Price” element or corporate logo within a document, compared to ODF's method of serially inspecting and updating the document itself. In short, because OOXML is more ecosystem- and application-oriented than ODF, most vendors and enterprises will see it as more useful than ODF.

The third answer is, “In the long run, perhaps neither.” Software as a service (SaaS) productivity applications are bringing mashups and dynamic web-based documents into the enterprise, challenging the long-held idea that a document must be monolithic and static. Over the next decade, standards being put forth by the World Wide Web Consortium (W3C) may ultimately dominate the document standards domain (not to the exclusion of OOXML and ODF, however, since both use W3C standards; it's something of a circular reference, but ODF and OOXML may eventually become secondary to W3C standards).

In short, the blunt question has no single answer. In order to explain the potential roles and impacts of ODF, OOXML, and other Extensible Markup Language (XML)-based content models, it's useful to first review productivity application market dynamics. This section begins with an overview of key market dynamics, continues with an assessment of ODF, OOXML, and other document models, and includes projections and recommendations for organizations seeking to optimize their use of XML file formats.

What's In a Name?: Innuendo
All participants in this debate agree on the name of the first specification: OpenDocument Format, or ODF for short. However, the name used for the second specification varies, often based on who's talking and the impression they're trying to pass along. Microsoft likes to call it “Open XML Formats” (emphasizing the “open” nature of the standard), while IBM and Sun reference it via its Ecma name of “Office Open XML” (as a reminder of its origins in the proprietary Microsoft Office file formats). In the spirit of vendor neutrality and a short abbreviation, this overview refers to the Ecma 376 Office Open XML standard as Office Open XML, or OOXML for short. If the official name changes in the future (as it may following the February 2008 ISO ballot on ECMA-376), this overview will be updated to reflect the revised name.


Productivity Application Market Dynamics
Productivity applications have been the “killer applications”—the tools that made personal computer (PC) purchases mandatory in many business domains—since the dawn of the personal computing era approximately thirty years ago (e.g., VisiCalc, the first spreadsheet application, was introduced in May 1979). Because the productivity application model has changed in subtly significant ways over the years, this section starts with a review of productivity application domains, models, and workflow.

Productivity Application Domains
The four primary productivity application domains are:
• Document: This is the domain dominated by word processor applications. Common document examples include status reports and customer meeting summaries.
• Spreadsheet: Spreadsheet applications are widely used for business scenarios, such as expense reports and sales tracking.
• Presentation: These applications are used to present and persuade, and are typically more outline oriented than word processor documents. Presentations are widely used in business domains such as new product launches to educate sales team members and prepare them to present new product information to customers.
• Database: Database applications are primarily used to manage lists of similarly structured data, such as product price lists and inventory data.

Document, spreadsheet, and presentation productivity applications are more widely used than database applications today, in part because popular examples of the latter (a market segment dominated by Microsoft Office Access and Apple FileMaker for the last decade) require more up-front planning and design than work focused on documents, spreadsheets, and presentations.

Many information workers also use productivity applications for:
• Graphics: Specialized image editors are often used, for example, to prepare graphics for document and presentation tasks.
• Domain-specific content: Mathematical equation editors, for example, are widely used within scientific and engineering domains.
• Form-based content collection: Forms-oriented applications for structured content are commonly used for business transactions such as equipment purchases and benefit enrollment option updates.
• Information item collection/creation and organization: This productivity application category is used for more free-form information management, working with collections of content snippets from websites or other applications. Products such as Microsoft Office OneNote and services such as Google Notebook fit in this category.

E-mail is another primary productivity application. For the context of this overview, e-mail tools should be considered a combination of document- and form-oriented editors, along with services for asynchronous messaging. E-mail is also widely used as a productivity application file transfer mechanism (i.e., sending e-mail messages with file attachments), a usage pattern that often creates information management challenges.

Productivity applications have been phenomenally successful, driving the global deployment of PCs. Microsoft estimates that more than 500 million people currently use Office, for example, and OpenOffice.org, the organization behind the open source productivity application suite of the same name, estimated 100 million downloads of OpenOffice.org-based offerings as of September 2007.


Initially restricted to professional information worker roles, such as administrative assistants and accountants, productivity applications are now routinely included in elementary school curricula, and the global collection of information captured in productivity application files is measured in exabytes (i.e., quintillions of bytes). A key problem with productivity applications, of course, is the fact that they have not, historically, been very effective for information management: Because files tend to accumulate on PC hard disks and as e-mail file attachments, it's impossible to precisely calculate the total amount of information currently captured in productivity application files.

Productivity Application Model Concerns
While the simplicity and power of productivity applications have made them attractive to hundreds of millions of people, the underlying conceptual models can be quite elaborate. This section summarizes the primary productivity application model concerns.

Content
Productivity application content includes text, numbers, dates, images, and other basic types of information. A sentence in a document such as this is a type of content.

Presentation
Presentation concerns include attributes such as font, type size, display style, and color. (Note that “presentation” in the model concern context is not the same thing as the presentation category of productivity applications.) In this overview, for example, the subsection heading preceding this paragraph is associated with a presentation style that specifies font, type size, and other attributes.

Structure
Structural concerns are related to content models. A document, for example, is based on structural elements such as paragraph hierarchy and sequence. A spreadsheet is defined in terms of structural elements including workbooks, worksheets, rows, columns, ranges, and cells. Common database structural elements include tables, columns, forms, and reports. Some structural elements are consistent across all productivity application domains. Examples include basic metadata such as the file author, the date/time the file was last updated, and reviewer comments. Embedded links are also structural elements; “Burton Group's website,” for example, combines the text (content) “Burton Group's website” with a Uniform Resource Locator (URL), in this case, http://www.burtongroup.com/. Structural elements can also include custom schemas specific to business domains. An order form, for example, is likely to include collections of related structural content elements for a customer (e.g., customer name, identifier, and shipping address) and a product (e.g., product name, identifier, description, and list price). Productivity applications are powerful in part because they make it possible for information workers to stay focused on domain-specific content without having to be concerned with underlying structural details.
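To make the custom schema idea concrete, the following minimal Python sketch represents the order form example as XML and reads the customer and product elements back out. The element names (`order`, `customer`, `product`, and their children) and the sample values are hypothetical illustrations; they are not part of ODF, OOXML, or any published schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical custom schema for an order form. Element names and values
# are illustrative assumptions, not taken from ODF or OOXML.
ORDER_XML = """
<order>
  <customer>
    <name>Example Corp</name>
    <identifier>C-1001</identifier>
    <shippingAddress>1 Main Street, Springfield</shippingAddress>
  </customer>
  <product>
    <name>Widget</name>
    <identifier>P-42</identifier>
    <description>Standard widget</description>
    <listPrice>19.95</listPrice>
  </product>
</order>
"""


def read_order(xml_text):
    """Return the customer and product fields as plain dictionaries."""
    root = ET.fromstring(xml_text)
    # Iterating over an element yields its direct children.
    customer = {child.tag: child.text for child in root.find("customer")}
    product = {child.tag: child.text for child in root.find("product")}
    return customer, product


customer, product = read_order(ORDER_XML)
print(customer["name"], product["listPrice"])
```

Because the structure is explicit, a program can address the customer or product data directly without interpreting how the form happens to be laid out on screen.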

Behavior
Productivity application behaviors include operations based on user actions and content changes. Some behaviors are included in the applications, such as saving files when users select File/Save menu items (or click related toolbar icons). Most productivity applications also support behaviors such as launching a web browser with a URL when an embedded hyperlink is clicked.


Behaviors may be associated with business processes. An expense report may include a “submit” button, for instance, programmed to save, digitally sign, and send the resulting file to the expense report creator's manager for review and approval. Tools for capturing this type of behavior range from keystroke macro recorders to sophisticated programming environments.

Application Settings
Productivity application settings refer to user-selectable options, such as whether rulers are displayed or edits are captured in Track Changes mode. These settings also include action-based attribute values, such as the cursor position when a file was last saved, or the printer selected when a document was last printed. Application settings may seem mundane, but information workers tend to get annoyed when their settings are not preserved across editing sessions, so settings constitute an important part of the productivity application model.

Packages
Productivity application packages are container mechanisms for all of the elements referenced in this section. A productivity application package is typically a multipart file with a manifest defining the package parts and relationships between the parts. Although a file with the extension .doc is widely understood to be a Microsoft Word document, for example, the file is actually a multipart package based on elaborate schemas.
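As a rough illustration of the package model, the sketch below builds a toy multipart package in memory and then reads its manifest to enumerate the parts. It assumes a ZIP container, which the XML-based ODF and OOXML formats both use, and a simplified manifest loosely modeled on the ODF `META-INF/manifest.xml` convention. The part names, the un-namespaced `manifest` and `file-entry` elements, and the helper functions are hypothetical simplifications rather than either standard's actual schema.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Simplified, hypothetical manifest: real ODF and OOXML manifests use
# namespaced elements and additional attributes.
MANIFEST = """<manifest>
  <file-entry path="content.xml" media-type="text/xml"/>
  <file-entry path="styles.xml" media-type="text/xml"/>
  <file-entry path="settings.xml" media-type="text/xml"/>
</manifest>"""


def build_package():
    """Create a toy multipart package (a ZIP archive) in memory."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("META-INF/manifest.xml", MANIFEST)
        package.writestr("content.xml", "<content>Quarterly status report</content>")
        package.writestr("styles.xml", "<styles/>")
        package.writestr("settings.xml", "<settings/>")
    buffer.seek(0)
    return buffer


def list_parts(package_file):
    """Read the manifest and return (path, media type) for each declared part."""
    with zipfile.ZipFile(package_file) as package:
        manifest = ET.fromstring(package.read("META-INF/manifest.xml"))
        return [
            (entry.get("path"), entry.get("media-type"))
            for entry in manifest.findall("file-entry")
        ]


parts = list_parts(build_package())
for path, media_type in parts:
    print(path, media_type)
```

The separate parts in this toy package mirror the model concerns described in this section: content, presentation (styles), and application settings each live in their own part, and the manifest ties them together.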

Templates
Productivity application templates are used to enforce model elements across template-based files. Rather than having employees create expense report spreadsheets from scratch, for example, most organizations create and manage expense report template files that include presentation, structure, and behavior elements and settings. From a model point of view, templates are essentially productivity application file schemas.

Productivity Application Workflow
While productivity applications have been fundamental to the success of personal computing, many productivity application usage patterns today are actually interpersonal in nature: the applications are used in workflow scenarios that help people communicate and collaborate through content managed in productivity application files. Workflow actions, as depicted in Figure 1, usually involve content creation, review/comment, and publication.

Figure 1: Productivity Application Workflow

Although a group working within a single department of an organization may start and end the workflow using the same applications, it's increasingly common for different applications or tools to be used for different workflow steps. A business process shared between business partners, for example, may start in Microsoft Office Word, be saved to a Portable Document Format (PDF) file for external review and comment, and be published to a Hypertext Markup Language (HTML) webpage when the workflow is complete. In many cases, the productivity application file is ephemeral, used only to present and capture user actions for business transactions that are ultimately captured in enterprise systems rather than stand-alone productivity application files (i.e., the files are destroyed when the transaction is complete and captured in other systems of record).


The Software as a Service Shift
The advent of software as a service (SaaS) offerings focused on productivity application domains is another important market dynamic. Vendors such as Central Desktop, EditGrid, ThinkFree, and AdventNet (creator of Zoho) offer low- or no-cost tools for documents, spreadsheets, presentations, databases, and other domains. Although such SaaS offerings are sometimes relatively simplistic, in terms of productivity application features, they typically augment the traditional models with communication and collaboration capabilities, exploiting their underlying service-oriented and web-centric architecture. For work scenarios in which communication and collaboration are critical, and content is relatively straightforward, SaaS offerings can thus be more productive than traditional, stand-alone productivity applications. SaaS productivity application-related offerings are also available from major vendors such as Google and Adobe Systems. Google Apps is probably the most widely publicized, due to the broader competitive landscape between Google and Microsoft. The Collaboration and Content Strategies report “Google Apps in the Enterprise: A Promotion-Enhancing or Career-Limiting Move for Enterprise Architects?” provides an extensive review of Google Apps. Google also distributes Sun StarOffice (a non-SaaS, traditional productivity application suite) with Google Pack, a free collection of productivity tools and services. Adobe is a recent entrant to the productivity tools SaaS market with its September 2007 acquisition of Virtual Ubiquity, creator of the Buzzword web word processor. Adobe Buzzword, as suggested in Figure 2, is unlike many other web-centric document editors because of its focus on rich user experience, precise document layout, and intra-document collaboration.


Figure 2: Adobe Buzzword (Beta) Example

Microsoft is also exploiting the SaaS wave in its enterprise- and consumer-oriented offerings. The screenshot in Figure 3, for example, shows Excel Services, a feature in Microsoft Office SharePoint Server (MOSS) 2007 that makes interactive spreadsheet features available to people using web browser clients.


Figure 3: Microsoft Excel Services Example (Source: Microsoft)

Excel Services makes Excel capabilities available to information workers without requiring the traditional Excel client. It also facilitates improved information management, because all users interact with a single copy of the underlying spreadsheet. Microsoft offers similar capabilities in Microsoft Office Forms Server 2007 (a capability also bundled with MOSS), which makes InfoPath forms accessible to browser clients. In the consumer-oriented SaaS market, Microsoft Office Live Workspace is a free service offering for Office users who wish to share and collaborate on the Web (Office Live Workspace is also SharePoint-based). In all cases, Microsoft emphasizes a “software plus services” value proposition, offering the services to augment rather than displace its traditional productivity applications.

SaaS offerings represent an important market dynamic in part because they often rely on database management systems (DBMSs), rather than traditional files, for storage. Central Desktop is built on PostgreSQL, for example, and both Adobe Buzzword and Microsoft SharePoint use Microsoft SQL Server for storage. The services also support traditional files, primarily for import/export and print-oriented workflow scenarios, but their primary storage models are not file based.

Recap: New Realities and Requirements
Although some scenarios can still be addressed with basic document editor or spreadsheet tools, most large organizations are now working with more elaborate and workflow-based usage patterns. Ubiquitous e-mail, the Internet, PCs on many desktops, laptops on many laps, and wireless connections have all combined to make collaboration easier and even expected.


Some increasingly common productivity application requirements include:

• Document assembly: Rather than using monolithic files, document assembly means dynamically composing documents, often from disparate sources. For example, a sales report may be generated from a document template and interactive queries into sales tracking systems such as Salesforce.com.
• Content reuse: Improving content reusability entails a shift to managing content components (also known as information items and microformats) rather than monolithic files. Examples include the need to consistently use corporate branding and legal boilerplate text in business proposal documents.
• Content query: To make productivity application content a more productive resource in broader information management (e.g., to easily find all information pertaining to a specific customer or research project, regardless of content type or location), organizations need to go beyond simple content indexing and exploit metadata ranging from basic fields and tagging/categorization to custom schemas.
• Document inspection and sanitization: Requirements in this context include ensuring authors haven't inadvertently left reviewer comments or other remnants from work-in-process versions in productivity application files. Inspection and sanitization are also used to ensure that content complies with organizational policies (e.g., to automatically remove unacceptable or potentially offensive terms from documents before they are distributed).
• Document archival: Productivity application content must be integrated with corporate systems of record to meet information management and record-keeping requirements.

Until recently, it has been exceptionally difficult to address these requirements because of constraints associated with the binary and proprietary file formats used with traditional productivity applications. These historical file formats present both developer complexity and intellectual property challenges that greatly complicate interoperability and workflow. Intellectual property issues also create concerns about vendor dependencies, especially for content that must be maintained for time periods that may exceed the lifetimes of the software product vendors that created the productivity applications (or services) used to create the content.

These new realities and requirements have resulted in the widespread need for new productivity application content models that are:

• XML-based and technology and vendor/product independent in order to reduce the potential for vendor lock-in, expand competition, and facilitate the development of ecosystems of vendors providing complementary products and services.
• Built on open schemas for common productivity application domains including documents, spreadsheets, and presentations.
• Designed to support a clean separation of concerns for content, presentation, structure, and behavior in order to facilitate improved content reuse and developer productivity. Separation of semantics and presentation also enables better accessibility support.
• Compatible with the exabytes of existing productivity application content, since no organization has the luxury of instantly ripping and replacing already-deployed productivity applications, and most organizations also need to exchange content with other organizations that can't be expected to be using the latest productivity applications.

The market-dominating position currently enjoyed by Microsoft Office is a multifaceted consideration in this context. First, the fact that the vast majority of existing productivity application content is captured in binary and proprietary Office file formats is a pivotal concern, as any content model alternative that alters content, presentation, structure, or behavior concerns is problematic. This is especially troubling for organizations that rely on operating systems for which Microsoft doesn't offer Office, including all Linux and UNIX clients.

A second important consideration pertaining to Microsoft file formats is the fact that many Microsoft competitors have attempted to exploit the transition to open, XML-based file formats in order to more effectively compete with Microsoft Office. While some of its competitors have very little hope of establishing successful products that directly compete with Microsoft Office, they may still seek to disrupt Microsoft's business by making Office less profitable, thus depriving Microsoft of opportunities to use Office profits to subsidize product development efforts in other market segments.
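To make the document inspection requirement listed above more concrete, the following minimal Python sketch scans an XML-based document package for leftover reviewer comments. It assumes the document is a ZIP package whose comments live in a part conventionally named word/comments.xml, as is typical for OOXML word-processing files; the helper names and the tiny in-memory sample package are hypothetical and exist only for illustration. It is not a complete inspection or sanitization tool.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by OOXML word-processing parts.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
# Assumption: comments are stored under this conventional part name.
COMMENTS_PART = "word/comments.xml"


def find_reviewer_comments(package_bytes):
    """Return (author, text) pairs for each comment found in the package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        if COMMENTS_PART not in package.namelist():
            return []
        root = ET.fromstring(package.read(COMMENTS_PART))
    found = []
    for comment in root.iter(f"{{{W_NS}}}comment"):
        author = comment.get(f"{{{W_NS}}}author", "")
        # Concatenate every text run inside the comment.
        text = "".join(t.text or "" for t in comment.iter(f"{{{W_NS}}}t"))
        found.append((author, text))
    return found


def build_sample_package():
    """Build a tiny stand-in package in memory (not a complete document)."""
    comments_xml = (
        f'<w:comments xmlns:w="{W_NS}">'
        '<w:comment w:id="0" w:author="Reviewer A">'
        "<w:p><w:r><w:t>Check this figure before release.</w:t></w:r></w:p>"
        "</w:comment></w:comments>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr(COMMENTS_PART, comments_xml)
    return buffer.getvalue()


if __name__ == "__main__":
    for author, text in find_reviewer_comments(build_sample_package()):
        print(f"{author}: {text}")
```

With a binary file format, the same check would generally require vendor-specific parsing code; with an XML-based package, general-purpose ZIP and XML libraries are enough for a first pass.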


With these new realities and requirements in mind, it's time to turn to a review of the productivity application file format alternatives. Despite the somewhat mundane, file-format-focused context, the story offers dimensions worthy of a James Bond movie, including multibillion-dollar business empires, global political intrigue, and even some conspiracy theories at the intersection of capitalism, communism, and democracy.

ODF, OOXML, and Other Document Models
Several XML-based document model alternatives have emerged over the last few years, all designed to address the requirements outlined in the “Productivity Application Market Dynamics” section of this overview. This section reviews the leading alternatives.

OpenDocument Format
OpenDocument Format (ODF) is an open, standards-based (and standardized) XML document model for word processing, spreadsheet, and presentation files. ODF is currently somewhat simple (and simplistic) compared with alternatives such as OOXML, but it was the first XML productivity application file format to be standardized, and as such has been promoted in association with many government and other domains in which standards are mandated.

In terms of productivity application model concerns, ODF is primarily focused on content and presentation domains, and it is far less useful for scenarios requiring advanced structure and behavior capabilities. For example, ODF (currently in a 1.1 revision) supports a single table type for use within document, spreadsheet, and presentation applications and offers a rudimentary spreadsheet formula language.

In another important consideration for enterprise application developers, ODF does not support custom schemas: It concerns itself only with defining the document format. Structured elements can be included in ODF document content, but no mechanism enables associating content elements with external (to the document content) schemas. The inability to do an external schema overlay (somewhat similar to a view in database parlance) hinders enterprises when programmatically manipulating the document, such as updating a stock price element within a document.

ODF started in 1999 with an XML model created by the StarOffice group within Sun Microsystems. StarOffice, a multi-platform productivity suite, was created in 1994 by StarDivision, a productivity application company founded in 1984. Sun acquired StarDivision in 1999 for $73.5 million and released much of the source code of StarOffice in June 2000, fostering the creation of an open source initiative called OpenOffice.org. The first Organization for the Advancement of Structured Information Standards (OASIS) ODF Technical Committee (ODF-TC) meeting was held on December 16, 2002. ODF was approved as an OASIS standard on May 1, 2005, and was subsequently approved as an International Organization for Standardization (ISO) standard on May 3, 2006.

While there is debate about the extent to which ODF is tied to OpenOffice.org, and some ODF-compliant productivity applications are not derived from OpenOffice.org (such as KOffice), ODF started as a refined XML serialization of OpenOffice.org file formats and the two initiatives are still very closely aligned. Indeed, several vendors interviewed for this overview indicated that it's essentially impossible to get ODF proposals approved if they're not also supported in OpenOffice.org, and further noted that Sun closely controls OpenOffice.org (much as it also holds control over Java).

ODF has been strongly supported by IBM, Novell, Sun, and many other smaller and specialized vendors (see the list of ODF Alliance members for examples). ODF has been publicized in part because of its selection by many government organizations, including, controversially, the state of Massachusetts, which generated significant news and blogosphere coverage in 2005 by announcing that it would require ODF rather than Microsoft Office file formats.1 The 2005 Massachusetts policy implied, at that time, that the state would also abandon Microsoft Office, a move many Microsoft competitors and open source advocates heralded as a significant leading indicator.


Additional ODF resources can be found in the “ODF Resources” section of this overview.

In terms of establishing sufficient context to compare and contrast ODF with OOXML and other XML document model alternatives, it's important to understand that many ODF advocates have conflated open standards and open source. They imply ODF is superior to OOXML in part because ODF-based productivity suites, such as OpenOffice.org and KOffice, are open source initiatives, while Microsoft Office 2007, the only productivity suite that is fully based on OOXML, is closed source.

Surprisingly to some Microsoft detractors, Microsoft also supports ODF, announcing in mid-2006 that it sponsored an open source project focused on creating an OOXML/ODF file translation adapter for Microsoft Office. This was a pragmatic move on Microsoft's part, because, if successful, the adapter would make Office viable for organizations such as the Commonwealth of Massachusetts that (at that time) mandated ODF support. Microsoft also highlighted, in its press release announcing the investment, that ODF constraints would entail “certain compromises and customer disclosures” since ODF is not compatible with the full range of expressions possible in Microsoft's traditional file formats (e.g., file metadata and markup may be lost in workflows involving Office and non-Office client applications). For a variety of reasons (the file incompatibilities mentioned above, usability issues with non-Microsoft suites, and OOXML becoming an Ecma standard), Massachusetts eventually endorsed OOXML along with ODF.

Overall, ODF can be a useful option for organizations that have relatively simple document model requirements and that do not always require full-fidelity round-trip workflow support with Microsoft Office applications. ODF, through its support in free productivity application suites such as OpenOffice.org, has also been promoted in conjunction with budget-constrained domains such as developing nation government agencies and educational institutions.

OOXML
OOXML is, as is ODF, an open, standards-based (and standardized) XML document model for word processing, spreadsheet, and presentation files. In terms of productivity application model concerns, OOXML was designed to fully address document, spreadsheet, and presentation concerns for all Microsoft Office file formats, including both the default XML file formats used in Office 2007 and the proprietary and binary file formats used in earlier releases of Office. As such, OOXML is considerably more expressive and complex than ODF, including support, for example, of multiple table types (i.e., different table-based capabilities for documents, spreadsheets, and presentations), a full spreadsheet formula language, and multiple options for incorporating custom XML schemas.

OOXML's support for custom XML schemas is important for integrating productivity application content with external systems. It's also pivotal to Microsoft's ability to offer features in Office 2007 that are outside the scope of the OOXML (Ecma International) standard, such as Microsoft Office PowerPoint 2007 slide libraries, a feature supported in conjunction with MOSS 2007.

OOXML's roots can be traced to the late 1990s, when Microsoft introduced Office 2000, a release that included limited XML support. Microsoft expanded XML support in Excel for Office XP (released in 2001), and significantly expanded XML support in Word and Excel in conjunction with Office 2003. Office 2003 also introduced the use of custom XML schemas inside Word and Excel. Office 2007, a vast development project that started several years before its January 2007 commercial release, adopted what Microsoft called Open XML as the default file format for Word, Excel, and PowerPoint. Note that it wasn't possible for Microsoft to embrace ODF during this period, as Microsoft's investment in XML file formats was long underway before ODF became an OASIS standard in 2005 (and because of Microsoft's need to maintain compatibility with earlier Microsoft file formats).

In terms of standardization, Open XML was submitted to Ecma International in November 2005. The first Open XML-focused Ecma International Technical Committee (TC45) meetings were held during December 2005. TC45 was chaired by Microsoft and included members from Apple, Intel, NextPage, Novell, and other vendors, along with user organizations including the British Library and the U.S. Library of Congress. Ecma International approved Open XML as a standard (ECMA-376) in December 2006 and changed its name to Office Open XML.
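The value of custom XML schemas for integration with external systems can be illustrated with the stock price scenario mentioned in the ODF discussion. The following Python sketch is a simplified illustration under stated assumptions: the namespace urn:example:quotes, the element and attribute names, and the sample data are all hypothetical, and the sketch operates on a standalone XML fragment rather than on a real document package.

```python
import xml.etree.ElementTree as ET

# Hypothetical custom schema namespace, invented for this illustration.
QUOTE_NS = "urn:example:quotes"

# Hypothetical business data carried alongside document content.
SAMPLE = (
    f'<q:quotes xmlns:q="{QUOTE_NS}">'
    '<q:stock symbol="ABC"><q:price currency="USD">10.00</q:price></q:stock>'
    '<q:stock symbol="XYZ"><q:price currency="USD">20.00</q:price></q:stock>'
    "</q:quotes>"
)


def update_stock_price(xml_text, symbol, new_price):
    """Set the price for one symbol; return (updated XML, elements changed)."""
    root = ET.fromstring(xml_text)
    updated = 0
    for stock in root.iter(f"{{{QUOTE_NS}}}stock"):
        if stock.get("symbol") == symbol:
            price = stock.find(f"{{{QUOTE_NS}}}price")
            if price is not None:
                price.text = f"{new_price:.2f}"
                updated += 1
    return ET.tostring(root, encoding="unicode"), updated


if __name__ == "__main__":
    new_xml, count = update_stock_price(SAMPLE, "XYZ", 21.5)
    print(count, new_xml)
```

Because the data is addressed through its own schema rather than through its position in the formatted text, an external system can update it without understanding the document's layout markup.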


Recognizing that Ecma International standardization was insufficient for some organizations, OOXML was also submitted (by Ecma) to ISO for its fast track standardization process. In September 2007, after a controversial ballot phase, ISO announced that OOXML had not received the required number of votes for fast-track approval, and the Ecma OOXML supporters started working to address the complaints raised during the ISO balloting process. A final ISO ballot is currently scheduled for February 2008.

Additional OOXML resources can be found in the “OOXML Resources” section of this overview.

In terms of establishing sufficient context to compare and contrast ODF with OOXML and other XML document model alternatives, it's important to understand that OOXML is already a legitimate (Ecma International) standard as well as an emerging de facto standard based on its use as the default file format for Office 2007. While ISO standardization would accelerate the use of OOXML in many standards-focused organizations, the February 2008 ISO ballot will not determine the overall fate of OOXML.

In the considerable market debate about the degree to which OOXML is a meaningful standard, topics include the fact that OOXML is only fully implemented by a single vendor (Microsoft) and the possibility that full use of OOXML entails the use of proprietary Microsoft technologies in domains such as digital rights management. Also, something of an intellectual property expert full-employment act is in effect as Microsoft's patents, licenses, and other business policies are being scrutinized for any scenarios in which Microsoft could inappropriately leverage OOXML to its advantage. For an example of the critical views of OOXML, see “Microsoft and Open Standards,” a Microsoft-critical webpage that includes a useful overview of several concerns and a collection of links to other resources such as IBM and Google comments on the September 2007 ISO OOXML ballot process.

The debate and scrutiny are not surprising, given Microsoft's historical track record as an extremely aggressive competitor and convicted monopolist, but it's important to understand that Microsoft appears to be sincerely committed to making OOXML a substantive standard, a topic revisited within the “Projections” section of this overview. It's also important to recognize that some of Microsoft's competitors may be hypocritically engaging in precisely the sorts of behind-the-scenes power plays that they've asserted Microsoft, if left to its own devices, would inevitably employ.

Other Important Document Models
ODF and OOXML are not the only alternatives for XML document models. This section provides an overview of PDF and UOF, two other important models.

Portable Document Format (PDF) is a file format for platform-independent document exchanges. Introduced by Adobe in 1993, PDF was based on a vision described in 1991 as the “Camelot Project” by Adobe co-founder John Warnock. While PDF does not address productivity application model concerns at the same level of abstraction as ODF and OOXML, it is very widely used for workflow phases including review, comment, and publication, in part because Adobe Reader, freely available on a variety of operating system platforms, facilitates full-fidelity document viewing, printing, and workflow.

PDF is a de facto standard (e.g., it is the preferred format for electronic submissions to the U.S. Food and Drug Administration [FDA]). It is also an industry standard, with subsets optimized for domains such as document archival (PDF/A, or ISO 19005-1:2005) and prepress printing (PDF/X, described in ISO Standards 15929 and 15930). In early 2007, Adobe announced plans to submit the entire PDF model for standardization; ISO voted to approve PDF 1.7 as a standard in December 2007 (ISO 32000). Adobe's XML architecture integrates the PDF model with XML for metadata management and the use of custom XML schemas. Adobe also has a research project called Mars that is focused on creating an XML-based representation of PDF documents.

UOF, for Uniform Office Format, is associated with the “Specification for the Chinese office file format based on XML,” a national standard of the People's Republic of China. UOF was published in 2005, after three years of standards work by the Chinese government, software vendors, and academic institutions. UOF was influenced by the needs of RedOffice, a Chinese productivity application suite based on OpenOffice.org (and thus ODF). For more background information, Rob Weir, an IBM employee focused on ODF, maintains a page referencing several UOF articles and resources.


In April 2007, several Chinese government agencies and Sun Microsystems Chairman Scott McNealy called for consolidation of UOF and ODF, but the work required to reconcile the standards is expected to be complex and time-consuming. As a further complication, some ODF advocates believe intellectual property and patent restrictions are associated with UOF. When interviewed for this overview, one vendor expressed uncertainty about how much it was going to be able to do with UOF support due to patent restrictions, and also noted that Sun's relationship with Redflag, the Chinese vendor responsible for RedOffice (more precisely: the Beijing Redflag CH2000 Software Co, owned in part by the Chinese Academy of Sciences), is unusual in many respects. For example, Sun allows Redflag to make changes to OpenOffice.org without giving Sun control of the resulting intellectual property, as Sun requires for other OpenOffice.org-focused vendors.

Although it will take years before market patterns become clear in terms of the relative success levels of ODF, OOXML, PDF, and UOF, it's clear that none of the alternatives is entirely free of (direct or indirect) intellectual property and licensing complications.

File Format Translators: Actual Results May Vary . . .
Most organizations will be working with a variety of file formats for the foreseeable future, both because productivity application upgrades can be protracted processes and because of the need to participate in file-based workflows with external organizations. Several file format translation tools have been introduced to simplify interoperability. Some of the major options include:

• The Microsoft Office Compatibility Pack for Office XP and Office 2003, which updates Office applications so that they can open and save OOXML files. Microsoft also provides an option for Office 2000 users to convert OOXML files to binary file formats, which can then be opened in Office 2000.
• An OpenXML/ODF Translator Add-in for Office, the result of an open source initiative sponsored by Microsoft (with key contributor firms including Clever Age, DIaLOGIKa, and Aztecsoft) and hosted by SourceForge. This add-in makes it possible for Office applications to open and save ODF files, although Microsoft doesn't directly support the add-in and has noted that working with ODF can cause some file fidelity loss.
• An ODF-UOF Converter open source initiative, also hosted by SourceForge, and designed to work with popular Chinese productivity application suites including RedOffice and EIOffice.
• The Novell OpenOffice.org OpenXML Translator, which enables OpenOffice.org users to directly open and save OOXML document files.
• The Sun ODF Plugin 1.1 for Microsoft Office, which, like the OpenXML/ODF Translator Add-in for Office, enables Office users to open and save ODF files. The plug-in works with Office 2003 and Office XP documents, spreadsheets, and presentations; support for Office 2007 is planned for a future release. Sun notes that its plug-in is based on StarOffice technology and is the only converter available with enterprise support.

Since ODF is less compatible with the earlier binary Microsoft file formats than OOXML, file format translations involving ODF can result in the loss of file fidelity, a constraint that limits ODF's utility for organizations that need to support file-based workflow involving Microsoft and non-Microsoft applications.
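In a mixed-format environment, a workflow often needs to determine which family a file belongs to before routing it to the appropriate translator. The following Python sketch shows one possible approach, assuming that ODF packages are ZIP files containing a mimetype entry, that OOXML packages are ZIP files containing a [Content_Types].xml entry, and that earlier binary Office files begin with the OLE compound file signature. The function and helper names are hypothetical, and the check is a heuristic, not a validator.

```python
import io
import zipfile

# Signature at the start of OLE compound files (earlier binary Office formats).
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def sniff_format(data):
    """Classify raw file bytes as 'legacy-binary', 'odf', 'ooxml', or 'unknown'."""
    if data.startswith(OLE_MAGIC):
        return "legacy-binary"
    if not zipfile.is_zipfile(io.BytesIO(data)):
        return "unknown"
    with zipfile.ZipFile(io.BytesIO(data)) as package:
        names = package.namelist()
        if "mimetype" in names:
            mimetype = package.read("mimetype").decode("ascii", "replace").strip()
            if mimetype.startswith("application/vnd.oasis.opendocument."):
                return "odf"
        if "[Content_Types].xml" in names:
            return "ooxml"
    return "unknown"


def make_package(entries):
    """Build a small in-memory ZIP package from a {name: content} mapping."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        for name, content in entries.items():
            package.writestr(name, content)
    return buffer.getvalue()


if __name__ == "__main__":
    odf = make_package({"mimetype": "application/vnd.oasis.opendocument.text"})
    ooxml = make_package({"[Content_Types].xml": "<Types/>"})
    print(sniff_format(odf), sniff_format(ooxml), sniff_format(OLE_MAGIC + b"\x00"))
```

Identifying the format is only the first step; as noted above, the translation itself can still lose fidelity, so a production workflow would also need to verify the converted output.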

The Parallel Universe of the World Wide Web
Most people don't immediately think of the World Wide Web or related standards created by the W3C in the context of productivity applications, but the W3C has created a family of standards that, collectively, is very relevant to productivity application market dynamics. As the Adobe Buzzword and Microsoft Excel Services examples in “The Software as a Service Shift” section of this overview suggest, web-centric approaches can be both complementary to and competitive with traditional productivity applications. Indeed, in many respects the web-centric model is already ahead of evolving non-web alternatives in important areas such as content reuse (see the Collaboration and Content Strategies overview “Content Reuse: DITA, XML, and Other Ways to Keep from Reinventing the Content” for more details).

Examples of key model concerns addressed by the W3C include:

• Content: XML
• Presentation: XHTML and Cascading Style Sheets (CSS)
• Structure: XML Schema, XML Pointer Language (XPointer), and XML Linking Language (XLink)
• Behavior: XForms (which has overlap for structure as well)
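The separation of content from presentation in the list above can be sketched in a few lines of Python. In this illustration, the same XML content is rendered two ways: once as an XHTML-style fragment whose appearance would be controlled by a CSS class, and once as plain text. The report vocabulary (report, title, item, region), the class name, and the sample figures are hypothetical and are not drawn from any W3C schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical content vocabulary: the data carries no formatting information.
CONTENT = (
    "<report>"
    "<title>Quarterly Sales</title>"
    "<item region='East'>120</item>"
    "<item region='West'>95</item>"
    "</report>"
)


def render_xhtml(content_xml, css_class):
    """Render the content as an XHTML-style fragment styled via a CSS class."""
    report = ET.fromstring(content_xml)
    div = ET.Element("div", {"class": css_class})
    ET.SubElement(div, "h1").text = report.findtext("title")
    ul = ET.SubElement(div, "ul")
    for item in report.iter("item"):
        ET.SubElement(ul, "li").text = f"{item.get('region')}: {item.text}"
    return ET.tostring(div, encoding="unicode")


def render_text(content_xml):
    """Render the same content as plain text, e.g., for an e-mail summary."""
    report = ET.fromstring(content_xml)
    lines = [report.findtext("title")]
    lines += [f"- {item.get('region')}: {item.text}" for item in report.iter("item")]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_xhtml(CONTENT, "summary"))
    print(render_text(CONTENT))
```

Neither renderer changes the content, which is the property that makes reuse across documents, web pages, and other outputs practical.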

Although the W3C has not directly addressed application domains for documents, spreadsheets, and presentations, several vendors and open source initiatives have built on W3C technologies to create productivity application products and services including the example referenced in “The Software as a Service Shift” section of this overview. The W3C has also created standards that are directly relevant for related areas such as:

• Graphics: Scalable Vector Graphics (SVG)
• Equations: MathML
• Multimedia: W3C Synchronized Multimedia Integration Language (SMIL)

In some respects, the web-centric approach represents something of a Copernican revolution for productivity applications. While the traditional model has productivity applications and related files at its core, the web-centric approach is centered on collections of webpages, and the pages may contain components for document, spreadsheet, presentation, and other needs.

The W3C has also chartered working groups to simplify the integration of its myriad related standards, including Web Integration Compound Document (WICD) and Compound Document Formats (CDF) groups. The latter garnered some press attention in October 2007 when it was positioned as an ODF alternative by a then-disbanding group called the OpenDocument Foundation (see this Gary Edwards blog post for more details). The OpenDocument Foundation, as its name suggested, was initially created by several participants of the OASIS ODF-TC who sought to expand the ecosystem of ODF supporters. It was unable to successfully address requirements for capabilities related to workflows with Microsoft Office applications, however, and parted ways with the OASIS ODF-TC after it failed to gain approval for several proposals designed to make ODF more capable for addressing full-fidelity file format operations involving Office clients.

Reciprocal Benefits: Standards and Standards Procedures
Despite all of the debate and controversy surrounding ODF and OOXML, it's important to recognize that the standards organizations are working as designed, and that both the standards and the organizations are constructively evolving as a result. The ISO review process for OOXML resulted in several thousand suggestions for improvements, all of which Ecma International OOXML supporters set out to address before the February 2008 final ISO ballot. Similarly, the OASIS and ISO groups focused on ODF have provided a useful venue for industry debate about the relative importance of Microsoft Office file compatibility in the ODF standard.

The standards procedures have been challenged by, for example, Microsoft and the broader OOXML community recruiting new participants for the ISO OOXML balloting process during mid-2007. It's likely ISO will revise its procedures to prevent similar disruptions in the future, so in some respects the OOXML episode will produce some useful stimulus/response improvements within ISO.

Debate will continue about whether it's meaningful or useful for a standards organization such as ISO to support more than one standard in a given domain, but it's clear that ODF and OOXML started with different priorities and requirements, and that they're not interchangeable in all contexts. It's also clear that different standards organizations have different modus operandi, with the W3C often moving more swiftly, successfully, and apolitically than standards bodies such as ISO.
Broad recognition of OOXML as a legitimate (real and de facto) standard is probably not what some of Microsoft's competitors had in mind when they started promoting ODF standardization as a reason to displace Microsoft Office, but history offers several examples of attempts to use (real or planned de facto) standards as a wedge in competing with Microsoft, including:

• Vendor Independent Messaging (VIM), which was an attempt to displace Microsoft's Messaging Application Programming Interface (MAPI)
• Integrated Database Application Program Interface (IDAPI), a failed attempt to out-standardize the Microsoft-sponsored Open Database Connectivity (ODBC)
• OpenDoc, a compound document model in the mid-1990s that was originally code-named “Bento”; OpenDoc was developed by Apple, IBM, and other vendors to compete with Microsoft's Object Linking and Embedding (OLE) model.

In these and other earlier encounters, when Microsoft's competitors sought to collectively compete with Microsoft by leveraging standards, the everybody-but-Microsoft standards have generally failed to achieve significant market momentum. Of course, Microsoft has also attempted and failed to direct some industry standards, such as Java.

The relative success of ODF and OOXML, in any case, will be determined more by their utility and by which community effectively exploits W3C standards than by one or the other more effectively navigating through ISO standards procedures.

Recap: File Format Form Follows Function
The utility of a model is assessed relative to stated objectives in a given domain; a modeling endeavor never has a simple right or wrong answer. For enterprise-class productivity application scenarios, especially those involving interoperability with earlier (binary and proprietary) file formats, the model requirements are both intricate and expansive. Productivity applications have become a multibillion dollar business and are used by hundreds of millions of people worldwide, in part to isolate information workers from the underlying complexity. Having surveyed the productivity application market dynamics and alternative XML file format models, it's time to explore projections about the likely relative success of ODF and OOXML at addressing current and emerging productivity application requirements.

Projections
If productivity application evolution had peaked around the feature set of Office 97, and if there weren't exabytes of files captured in earlier Microsoft Office file formats, it's possible that ODF could have succeeded as a global standard for productivity application file formats. Instead, vendors of these applications continue to innovate in a wide range of approaches ranging from Office 2007 to SaaS solutions, and the installed base remains a critical concern. The rest of this section explains several projections about what's likely to happen next in terms of productivity application XML model market dynamics.

OOXML Will Be Successful
OOXML will be more pervasive than ODF for several reasons. It's a better form-follows-function fit for most productivity application usage patterns and, much to the chagrin of Microsoft competitors, Microsoft appears to be sincere in its efforts to make OOXML a meaningful and global industry standard.

OOXML is considerably more complex than ODF, but it's not unnecessarily complex for the contexts it was designed to address. It's also important to recognize that a very small percentage of application developers and content designers will actually need to master the totality of OOXML. Just as relatively few people have mastered all facets of languages and models such as PostScript and PDF, the vast majority of people who interact with OOXML will do so through developer frameworks and tools that work at higher levels of abstraction.

OOXML will gain market momentum as vendors such as Altova and Mindjet introduce products that support OOXML. Altova, the leading vendor of tools for XML-focused developers and designers, added OOXML to its XMLSpy product line in 2007. Ominously for ODF advocates, Altova Chief Executive Officer (CEO) Alexander Falk commented, in an interview for this overview, that Altova has not received a single request for ODF support, while the OOXML features in XMLSpy were created in response to significant Altova customer demand (Alexander Falk noted that he is often asked about ODF, but only by journalists and industry analysts).

The OOXML ecosystem is also supported by community resources such as OpenXMLDeveloper.org and OpenXMLCommunity.org. Unsurprisingly, both sites were created by Microsoft, but they also facilitate open collaboration in discussion forums (including extensive critical commentary). The scope of OOXML is likely to expand to include application domains such as databases, but the expansion probably won't gain momentum until after the February 2008 ISO OOXML vote, as OOXML-focused people at both Ecma International and Microsoft will remain focused on addressing issues raised during the previous vote cycle until February 2008.

Microsoft Will Aggressively Compete but Also Play Well with Others on OOXML
Without a doubt, Microsoft created OOXML in order to expand market opportunities for Microsoft Office, a nontrivial challenge for a product with market share greater than 90% in many areas. Critics have asserted Microsoft will go to extremes to protect its phenomenally profitable Office business, and that's certainly the case in terms of the billions of dollars Microsoft invested in Office 2007 and additional investments it has made in complementary (and complimentary [i.e., free]) services such as Office Live Workspace. While there have been episodes of overzealousness in the drive for ratification (for example, Microsoft Sweden's offering marketing contributions to its partners as a way to encourage them to vote for OOXML in the ISO approval process), these appear to be the exception. Considering the global scrutiny applied to ISO and other standards processes, plus the fact that Microsoft is even more closely scrutinized because of its position as a convicted monopolist, it would be self-defeating for Microsoft to attempt to subvert the standards processes or somehow establish an intellectual property-based advantage for itself in order to thwart other vendors seeking to exploit OOXML. Realistically, it's also extremely unlikely another vendor will attempt to exploit OOXML in order to produce a comprehensively competitive alternative to Microsoft Office, given the considerable resources that would be required to do so and the economies of scale that would need to be achieved in order to make the endeavor sustainably profitable. In other words, although many Microsoft competitors face challenges based on the dominant market position enjoyed by Microsoft Office, no clear evidence suggests that Microsoft is illegally abusing (or planning to abuse) OOXML in a concerted fashion to protect its Office franchise.
For example, while some have argued that the OOXML documentation is incomplete or inconsistent, it's highly likely that Microsoft's own developers faced similar frustrations when working with Microsoft-internal Office documentation, and both Ecma and Microsoft have established an impressive track record for addressing shortcomings in the OOXML design and documentation.

ODF Will Continue, Albeit in a Relatively Minor Role
ODF represents laudable design and standards work. It's a clean and useful design, but it's appropriate mostly for relatively unusual scenarios in which full Microsoft Office file format fidelity isn't a requirement. Overall, ODF addresses only a subset of what most organizations do with productivity applications today. It will continue to evolve (e.g., adding support for multiple table types, a spreadsheet formula language, and custom XML schema support), but ODF's evolution will likely be slow and complex, in part because OpenOffice.org, the primary implementation of ODF, is arguably still, in some respects, controlled by Sun Microsystems. ODF advocates often suggest that it, along with OpenOffice.org, will be successful in extremely budget-constrained environments where Microsoft Office doesn't have deep penetration, such as government agencies in developing nations. Success in even this demographic is not a foregone conclusion for ODF, however, partly because Microsoft is aggressively competing in this market segment as well with, for example, Microsoft Student Innovation Suite (including Windows XP Starter Edition, Microsoft Office Home and Student 2007, and other tools) for $3 per user for governments and students.


It's also not clear that ODF will prevail on platforms Microsoft doesn't support for Office, such as Linux and UNIX, since vendors such as Novell have provided translators that make it possible for OpenOffice.org users to save to OOXML on non-Microsoft platforms. The most significant value of ODF-based alternatives to Microsoft Office, for many organizations, may be in establishing a viable alternative that provides opportunities to negotiate more favorable pricing/licensing agreements with Microsoft. This doesn't mean OpenOffice.org-derived productivity suites are actually more cost-effective than Office, since most enterprises would opt to purchase support and maintenance contracts for the Office alternatives (e.g., IBM Lotus Symphony, the Novell Edition of OpenOffice.org, or Sun's StarOffice) and since Microsoft has been able to exploit very large economies of scale to keep Office relatively inexpensive. While competitors compare their prices with the official retail prices of Microsoft's offerings, the reality is that Microsoft heavily discounts some of its Office products. In 2003, Microsoft began selling Microsoft Office to the Chinese government for $10 a seat. As of December 2007, Microsoft Office Home and Student 2007, for example, is selling for around $125 (suggested retail price is $149) with a license to install on up to three PCs, and an “Ultimate Steal” option for many students to purchase Office Ultimate 2007 for Student/Teacher Edition selling for approximately $60 (less than 10% of the full Office Ultimate list price of $679). Although Sun could considerably simplify ODF-related processes by yielding full control of both ODF and OpenOffice.org to standards bodies and open source initiatives, it's probably already too late for such a move to make a significant difference to ODF's trajectory.
Sun appears to have placed strategic bets that (1) it could continue to control ODF (through OpenOffice.org), under the auspices of international standards, without thwarting industry support for the standards, and that (2) Microsoft would seek to subvert standards procedures in order to inappropriately gain competitive advantage with OOXML. If Sun implicitly or explicitly placed these bets, it appears to have lost both.

The W3C Model Will Prevail in Many Domains
The W3C model, building on XML, CSS, and other W3C standards, is likely to ultimately be more influential and pervasive than ODF and OOXML. It's also something of a circular reference, of course, since both ODF and OOXML are defined in terms of several W3C standards, and web-centric productivity application view mechanisms such as the Excel Services example in “The Software as a Service Shift” section of this overview are rendered exclusively in web standards. In terms of the primary productivity application domains (documents, spreadsheets, and presentations), the W3C model is more complementary than competitive with ODF and OOXML. That could change, however, if W3C initiatives such as XForms gain market momentum. Vendors including EMC (through Documentum) and IBM (with IBM Lotus Forms) have placed strategic bets on XForms, and XForms is also supported in ODF (but not OOXML). W3C-led standards may also prevail in domains such as digital rights management, again potentially constraining the scope of future ODF and OOXML expansions. XML Query (XQuery) is another pivotal W3C standard that will influence the development of ODF and OOXML. The Collaboration and Content Strategies report “XQuery and Its Implications for Content and Data Management” provides more XQuery information. For now, it's likely that web-centric, SaaS-focused productivity application vendors such as Adobe, Central Desktop, EditGrid, Google, and AdventNet (Zoho creator) will eventually need to add OOXML support for import/export operations, regardless of the storage models they use within their own offerings, if only for interoperability with Office 2007 files. Apple, for example, already supports OOXML in its iPhone (Apple also beat Microsoft in terms of productivity application support for OOXML on Mac OS X, as Apple's iWork product added support for OOXML before Microsoft added OOXML support to Mac Office).


The expanding role for the W3C model is part of a broader market shift from file-based to hypertext models, a trend explored in the Collaboration and Content Strategies overview “Hypertext and Compound/Interactive Document Models: Collaboration and Content Management Implications.” As blogs and wikis continue to gain momentum (often displacing traditional productivity applications), the hypertext model will become more pervasive and influential. It's still not an either/or proposition, however, as many information workers will spend more time using web-centric editor/viewer tools and continue to launch content into traditional productivity applications for beyond-the-basic requirements.

PDF Will Continue to Dominate Nonrevisable Document Contexts
PDF, especially the PDF/A standard focused on document archival, will continue to play a key role for nonrevisable document domains, such as reviewing and printing final-format documents (e.g., business contracts and tax forms). PDF is very effective for needs in these contexts, and it's also already broadly deployed. If ODF and OOXML didn't exist, it's possible the role for PDF could have expanded, addressing a wider range of hypertext document scenarios for emerging productivity application usage. Instead, market momentum for ODF and OOXML will likely obviate the Adobe Mars project and continue to relegate PDF to a role primarily focused on cross-platform/product document viewing and printing. Even Adobe's own Buzzword service will likely add support for OOXML—again, if only for Office 2007 file compatibility.

New Vendor Challenges and Opportunities
The rapid expansion of OOXML and the more limited, but still influential, market uptake of ODF will present some significant challenges and opportunities for software vendors. Most significantly, the use of OOXML in Office 2007 will dramatically increase the amount of XML content used within large organizations, and this will create opportunities for vendors focused on XML content management. Altova, for example, now has a new category of content to be used in conjunction with its XMLSpy product, and XML content management system vendors such as Mark Logic will have new opportunities to add value in XML content management scenarios. In terms of challenges, the growth of OOXML is likely to create new problems for vendors and open source initiatives that compete with Microsoft Office. Novell may prove to be an exception, as its multi-platform version of the OpenOffice.org suite includes support for OOXML (along with other customer-driven extensions such as support for Visual Basic for Applications [VBA] in its OpenOffice.org spreadsheet program). As the second most successful commercial Linux platform vendor, Novell is likely to benefit from its support for both OpenOffice.org (the productivity application suite) and OOXML (the file format). Traditional content management vendors that aren't moving to embrace XML and XQuery will also be challenged by the market shift to OOXML and ODF, as the range of operations now possible with familiar productivity applications is expanding to subsume domains once the province of specialized XML authoring tools. Overall, Microsoft will be, by far, the biggest beneficiary of OOXML's market success. As the premier OOXML content creation suite, Office 2007's support for OOXML, and the new opportunities OOXML presents to Microsoft's customers, may ultimately prove to be more important to the Office business than any other new feature introduced with Office 2007.

Recommendations
This section includes several recommendations for organizations seeking to exploit emerging XML file format opportunities. Note that the recommendations are for Burton Group customers—large commercial, higher education, and government organizations—and are not meant to apply to organizations of all sizes and industries.

Distinguish Between Application/Service and Format Decisions

The often passionate debate about ODF and OOXML has, in some cases, conflated open source and open standards and has also confused the distinctions between file format and application/service alternatives. As explained in the “File Format Translators: Actual Results May Vary . . .” section of this overview, currently a many-to-many relationship exists between applications/services and file formats, and it's possible to use Microsoft Office with ODF—or OpenOffice.org with OOXML—at least for relatively simple scenarios. However, for scenarios requiring complete interoperability with Microsoft Office files—historical (binary and proprietary) formats or OOXML—organizations must use Microsoft Office. OOXML is the only XML file format that has high compatibility with earlier Office file formats, and Office 2007 is the only productivity application suite based on OOXML (although both Office 2000 and Office 2003 can read and write OOXML documents with the addition of the Microsoft Office Compatibility Pack). This may be an inconvenient truth for Microsoft competitors, but it will remain so unless Sun and other ODF supporters revise ODF to include full Office file format compatibility.

Discount the Political FUD (Fear, Uncertainty, and Doubt)
Much of the considerable amount of political posturing in the ODF/OOXML debate over the last few years has been influenced by broader objectives such as the desire, shared by many ODF supporters, to perform a “cashectomy” on Microsoft's Office-based revenue and profit streams (i.e., to significantly reduce the value of Microsoft's Office-based business). “Guilty until proven innocent” has also been a part of conventional wisdom among many Microsoft detractors, who have implied Microsoft will invariably seek to exploit opportunities to thwart open standards. This is in some cases paradoxical, since many of the same vendors work closely with Microsoft to establish industry standards in other domains. Fortunately, as noted in the “Reciprocal Benefits: Standards and Standards Procedures” section of this overview, the procedures used by organizations such as Ecma International, ISO, and (especially) the W3C offer a high degree of visibility and accountability in standards-related initiatives. If Microsoft abuses standards initiatives, the market response will be swift and severe. Similarly, if Sun continues to force its competitive agenda on OpenOffice.org and ODF activities (while stating that Microsoft will invariably engage in the type of tactics Sun is itself employing), Sun will only be hindering market potential for OpenOffice.org and ODF. Nobody should expect the February 2008 ISO ballot on OOXML to have market-altering ramifications. While ISO support for OOXML would make it simpler for Microsoft to do business with assorted world government agencies and other organizations that have mandated the use of global standards for productivity application files, OOXML will continue to gain momentum as an Ecma standard regardless of the ISO vote. Of course, ISO support for OOXML would, as a corollary, increase momentum for OOXML, and, as such, would also create new challenges for ODF supporters.

Require Vendors to Sincerely Commit and Contribute to Related Standards
Many of the players in the ODF/OOXML debate have conflicts of interest (e.g., their desire to reduce Microsoft Office profitability relative to their desire to help advance unfettered industry standards). This is inevitable, at the intersection of capitalism (e.g., selling software products) and democracy (e.g., egalitarian, community-guided standards initiatives), so it's critically important for enterprises to make their standards-related requirements and expectations clear to their software suppliers. This applies to W3C standards (such as XForms and XQuery) as well as ODF and OOXML. Eventually, if all vendors address real-world customer requirements, it should be much simpler to support ODF/OOXML interoperability. However, that's only going to happen if customers make their voices heard. In some cases, the political dimensions go beyond standards organizations. For example, if it turns out the People's Republic of China is working with Sun to leverage standards and patents as a means of effectively mandating OpenOffice.org-based productivity applications for organizations doing business in China, those organizations should contact their local government representatives to register complaints.


Exploit OOXML
Any organization directly or indirectly (e.g., exchanging files with business partners) using Microsoft Office applications should plan to exploit OOXML. In addition to being the default file format in Office 2007, OOXML offers significant compression and security advantages relative to earlier, binary Microsoft Office file formats. Although moving to OOXML file formats involves some administrative challenges, the opportunities for improved content management and productivity outweigh the short-term inconvenience of migrating from binary file formats. Microsoft has invested considerable resources in easing migration challenges. Organizations should also explore options for content and application salvage operations, migrating existing Office-based content and applications to OOXML. Developers should exploit frameworks and tools that make it possible to avoid low-level OOXML details. Low-level OOXML development is not an optimal use of resources when mainstream tools such as Altova XMLSpy can be used at higher levels of abstraction. As noted in the “Projections” section of this overview, this recommendation is predicated on the assumption that Microsoft will continue to sincerely commit to OOXML-related standards initiatives. If Microsoft were to change its strategy, and, for example, effectively require the use of proprietary Microsoft digital rights management technologies in conjunction with OOXML file-based workflow processes, OOXML would be deeply stigmatized.

For Now, Use ODF by Exception Rather than by Default
ODF is insufficient for complex real-world enterprise requirements, and it is indirectly controlled by Sun Microsystems, despite also being an ISO standard. It's possible that IBM, Novell, and other vendors may be able to put ODF on a more customer-oriented trajectory in the future and more completely integrate it with the W3C content model, but for now ODF should be seen as more of an anti-Microsoft political statement than an objective technology selection.

Get Out of the (Traditional) Office More Often
Today, most organizations default to using traditional productivity applications in many domains that may be more effectively addressed with other types of productivity tools and services. For routine content collection and workflow processing scenarios, such as expense reporting and customer meeting note sharing, for example, it's likely that web-centric, XML forms-based solutions (including blogs and wikis) are more efficient and effective than productivity application file-based workflow alternatives. This does not exclude a Microsoft-based approach (e.g., using MOSS Excel Services or [InfoPath] Forms Services), but it also doesn't require the use of traditional Microsoft Office applications. Microsoft and its competitors—both traditional competitors, such as Adobe, IBM, Novell, and Oracle, as well as nontraditional competitors such as Google—all seek to compete by innovating with products and services designed to exploit the trends described in the “Productivity Application Market Dynamics” section of this overview. Organizations need to look beyond the ODF/OOXML debate to understand and exploit the emerging opportunities, and also need to provide guidance to information workers and developers (see, for example, the Reference Architecture technical position “Asynchronous Collaboration Alternatives”) to ensure that the right form-follows-function tools and formats are picked.


The Details
This section provides more detailed information about Extensible Markup Language (XML) productivity application file format-related concepts. Due to the complexity of OpenDocument Format (ODF) and Office Open XML (OOXML), and to the abundance of information available on both formats, this section is meant to be more illustrative than exhaustive.

An OOXML Example
Figure 4 is a screen shot of a very simple Word document, containing a Burton Group graphic and the text “Hello world!”

Figure 4: A Simple Document Example

When saved in the .doc format associated with Office 2003 and earlier releases of Microsoft Office, the document is serialized into a binary stream. Figure 5 is a screen shot of the .doc file opened in Microsoft Notepad. It's unintelligible without tools that incorporate the Microsoft file format.

Figure 5: A Simple Document Example in Notepad

When saved to the .docx file format used with Word 2007, the file becomes a multipart ZIP-compressed file (i.e., if you change the file extension from .docx to .zip, it can be opened in a ZIP file utility). Figure 6 includes a view of the file package parts. Note that the specific folder and part names are associated with Word 2007 rather than the Ecma 376 standard.


Figure 6: A View of .docx File Package Parts

Reviewing the parts in more detail:
• The “_rels” part is an XML file describing the relationships between all of the parts in the package.
• The “docProps” part captures the document properties (again in XML).
• The “[Content_Types].xml” part stores a list of content types for each part inside the package.
• The “word” part is a directory with the contents in Figure 7.
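The package structure described above can be explored with any ZIP library. The following is a minimal sketch in Python, assuming only the standard library. It builds a small in-memory stand-in package rather than reading a real .docx file; the part names (for example "docProps/core.xml" and "_rels/.rels") follow the typical Word 2007 layout and are assumptions for illustration, and the XML bodies are placeholders, so the result is not a valid Word document.

```python
import io
import zipfile


def build_sample_package() -> bytes:
    """Build an in-memory ZIP whose part names mimic a Word 2007 .docx layout.

    The part names are illustrative assumptions and the bodies are
    placeholders, so this is NOT a document that Word could open.
    """
    parts = {
        "[Content_Types].xml": "<Types/>",
        "_rels/.rels": "<Relationships/>",
        "docProps/core.xml": "<coreProperties/>",
        "word/document.xml": "<document/>",
        "word/media/image1.png": "",
    }
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        for name, body in parts.items():
            package.writestr(name, body)
    return buffer.getvalue()


def list_parts(package_bytes: bytes) -> list:
    """Return the sorted names of all parts stored in the ZIP package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        return sorted(package.namelist())


if __name__ == "__main__":
    for part_name in list_parts(build_sample_package()):
        print(part_name)
```

To inspect a real file, the same `list_parts` function could be given the bytes of a .docx file read from disk; no renaming to .zip is needed, since the library looks at the content rather than the extension.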


Figure 7: A View of the Word Part Contents

As suggested in Figure 7, the “word” part is also multi-part, with multiple subdirectories. The “media” subdirectory, in this example, contains a single image file (see Figure 8).

Figure 8: A View of the Word/Media Part Contents

Figure 9 shows the contents of the document.xml part within the “word” part. The “Hello world!” text from Figure 4 is highlighted (surrounded by a red rectangle) in Figure 9.


Figure 9: A Document.xml Example

Although the example in this section suggests that even simple Word documents result in very complex OOXML files, OOXML for the most part simply provides more accessible views of content that was also captured in the earlier binary file formats (i.e., the complexity was always there, because of the application domain and model concerns modern productivity applications must accommodate). Indeed, OOXML files are in most cases much smaller than their binary counterparts. Note that most application developers, when manipulating Office 2007 content, will work with tools such as Altova XMLSpy rather than Windows Explorer and Notepad. Figure 10 is a screen shot of XMLSpy working with Office 2007 content.
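Because document.xml is ordinary XML, the text content can be reached with a generic XML parser. The sketch below, in Python with only the standard library, assumes a heavily simplified document.xml: real files produced by Word 2007 carry many more namespaces, properties, and attributes than shown here. The sample markup is hand-written for illustration and is not the content of Figure 9.

```python
import xml.etree.ElementTree as ET

# Namespace used by WordprocessingML elements such as w:p, w:r, and w:t.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hand-written, simplified stand-in for word/document.xml (an assumption
# for illustration; Word's real output is considerably more verbose).
SAMPLE_DOCUMENT_XML = f"""<w:document xmlns:w="{W_NS}">
  <w:body>
    <w:p>
      <w:r><w:t>Hello world!</w:t></w:r>
    </w:p>
  </w:body>
</w:document>"""


def paragraph_texts(document_xml: str) -> list:
    """Return the plain text of each paragraph (w:p) in the markup."""
    root = ET.fromstring(document_xml)
    paragraphs = []
    for paragraph in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is spread across one or more w:t elements.
        pieces = [t.text or "" for t in paragraph.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(pieces))
    return paragraphs


if __name__ == "__main__":
    print(paragraph_texts(SAMPLE_DOCUMENT_XML))
```

This is the kind of low-level access that higher-level frameworks and tools wrap, which is why most developers will not need to work at this layer directly.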


Figure 10: XMLSpy Example (Source: Altova)

An ODF Example
Figure 11 is a view of an ODF document, created with OpenOffice.org (2.3) Writer. As with the OOXML example, the .odt file depicted in Figure 11 is a multipart ZIP file.


Figure 11: An OpenOffice.org Writer Example

Reviewing the file parts in more detail:
• The “Configurations2” part describes user interface elements such as the menu bar, tool bar, and status bar.
• The “META-INF” part contains a single file, “manifest.xml,” which is an index of all of the other parts in the file.
• The “Pictures” directory contains one file per image used in the document.
• The “Thumbnails” directory contains a thumbnail image of the first page of the document.
• The “meta” part includes document metadata such as creation date and editing duration.
• The “mimetype” file contains a single line of text identifying the Multipurpose Internet Mail Extensions (MIME) type for the document (“application/vnd.oasis.opendocument.text,” in this example).
• The “settings” part contains metadata such as window size and position and whether the ruler is visible.
• The “styles” file contains details about the styles used in the document (separating content from presentation, as does the OOXML “styles.xml” part in Figure 7).
• The “content” part is an XML file with the document content, which is excerpted in Figure 12.
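The ODF parts listed above can be read the same way as OOXML parts. The following Python sketch, using only the standard library, builds a tiny in-memory stand-in for an .odt package containing just the “mimetype” and “content” parts, then reads them back. The paragraph text and the reduced set of parts are assumptions for illustration, so the result is not a complete ODF document; storing the “mimetype” entry first and uncompressed follows the packaging convention recommended for ODF files.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
ODT_MIMETYPE = "application/vnd.oasis.opendocument.text"


def build_sample_odt() -> bytes:
    """Build a minimal in-memory package with 'mimetype' and 'content.xml'.

    Illustrative only: a real .odt file also carries a manifest, styles,
    settings, and metadata parts.
    """
    content = (
        f'<office:document-content xmlns:office="{OFFICE_NS}" '
        f'xmlns:text="{TEXT_NS}">'
        "<office:body><office:text>"
        "<text:p>Hello world!</text:p>"
        "</office:text></office:body>"
        "</office:document-content>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        # The mimetype entry goes first and is stored without compression.
        package.writestr("mimetype", ODT_MIMETYPE,
                         compress_type=zipfile.ZIP_STORED)
        package.writestr("content.xml", content,
                         compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


def inspect_odt(package_bytes: bytes):
    """Return (MIME type, list of paragraph texts) for the package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        mimetype = package.read("mimetype").decode("ascii").strip()
        root = ET.fromstring(package.read("content.xml"))
    paragraphs = ["".join(p.itertext()) for p in root.iter(f"{{{TEXT_NS}}}p")]
    return mimetype, paragraphs


if __name__ == "__main__":
    print(inspect_odt(build_sample_odt()))
```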


Figure 12: An OpenOffice.org Writer Example

Comparing the XML examples in Figures 9 (OOXML) and 12 (ODF), it's clear that the OOXML markup is more cryptic and harder to understand, a subject of extensive criticism from the ODF community. Microsoft has stated it was a deliberate choice to help reduce the size of OOXML files, but as a general rule, an ODF version of a document is slightly smaller than its OOXML counterpart (see note 2).
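The size comparison can be made concrete with a little arithmetic. The sketch below uses the three file sizes reported for this overview in note 2 of the Notes section; it describes a single document, so the percentages are an illustration rather than a general benchmark. The helper name `percent_smaller` is introduced here for the example.

```python
# File sizes in MB for this overview, as reported in note 2.
SIZES_MB = {".doc": 1.86, ".docx": 1.64, ".odt": 1.57}


def percent_smaller(baseline_mb: float, other_mb: float) -> float:
    """Return how much smaller other_mb is than baseline_mb, in percent."""
    return (baseline_mb - other_mb) / baseline_mb * 100.0


if __name__ == "__main__":
    comparisons = [
        (".doc", ".docx"),   # binary format versus OOXML
        (".doc", ".odt"),    # binary format versus ODF
        (".docx", ".odt"),   # OOXML versus ODF
    ]
    for baseline, other in comparisons:
        saving = percent_smaller(SIZES_MB[baseline], SIZES_MB[other])
        print(f"{other} is {saving:.1f}% smaller than {baseline}")
```

For this one document, both XML formats come in roughly 12% to 16% below the binary file, while the gap between the two XML formats is only about 4%.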

OOXML Resources
Figure 13 depicts relationships among the major components of OOXML.


Figure 13: Ecma International OOXML Components (Source: Open XML Explained)

The four primary OOXML components shown in Figure 13 are:
• Core technologies, including the ZIP compression scheme, XML, and Unicode (for global content).
• The Open Packaging Conventions (OPC), a file format defined in terms of packages and parts.
• Shared vocabularies for elements used in multiple productivity application models (i.e., in documents, spreadsheets, and presentations), including drawings, metadata elements, equations, and bibliographies. Vector Markup Language (VML) is accommodated as a legacy format because of its use within Microsoft Office, but VML is obsolete (it has been supplanted by SVG).
• Markup languages that are application model specific for documents (WordprocessingML), spreadsheets (SpreadsheetML), and presentations (PresentationML).

Some resources for additional OOXML information include:
• The Ecma International Standard ECMA-376 site, which includes the five-part OOXML specification
• “Office Open XML Overview,” an Ecma overview written by Tom Ngo (of NextPage)
• OpenXMLCommunity.org, a Microsoft-managed resource including Microsoft's assessments of several “Open XML myths”
• OpenXMLDeveloper.org, a Microsoft-created community site for OOXML developers
• Brian Jones' blog: Brian Jones is a Microsoft program manager focused on OOXML
• “Custom schemas revisited”: A blog post by Microsoft employee Doug Mahugh explaining the OOXML custom schema options
• The Microsoft Office Compatibility Pack, which describes Microsoft's adapters for making earlier releases of Office compatible with OOXML files
• Open XML Explained, a free book by Wouter Van Vugt
• Altova's Open XML resource page
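To illustrate how the packaging layer and the application-specific markup languages fit together, the sketch below reads a “[Content_Types].xml” part and reports which of the three markup languages the package's main part uses. It is a simplified Python example using only the standard library; the sample XML is hand-written for illustration, and the function name `detect_markup_language` is hypothetical. A production tool would more likely follow the package relationships or rely on an existing OOXML library.

```python
import xml.etree.ElementTree as ET

# Namespace of the content-types part defined by the packaging layer.
CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"

# Content types of the main part for each application-specific language.
MAIN_PART_TYPES = {
    "application/vnd.openxmlformats-officedocument."
    "wordprocessingml.document.main+xml": "WordprocessingML",
    "application/vnd.openxmlformats-officedocument."
    "spreadsheetml.sheet.main+xml": "SpreadsheetML",
    "application/vnd.openxmlformats-officedocument."
    "presentationml.presentation.main+xml": "PresentationML",
}

# Hand-written, simplified sample (an assumption for illustration).
SAMPLE_CONTENT_TYPES = f"""<Types xmlns="{CT_NS}">
  <Default Extension="rels"
           ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Default Extension="xml" ContentType="application/xml"/>
  <Override PartName="/word/document.xml"
            ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>
</Types>"""


def detect_markup_language(content_types_xml: str):
    """Return (markup language, main part name), or None if not recognized."""
    root = ET.fromstring(content_types_xml)
    for override in root.iter(f"{{{CT_NS}}}Override"):
        language = MAIN_PART_TYPES.get(override.get("ContentType"))
        if language:
            return language, override.get("PartName")
    return None


if __name__ == "__main__":
    print(detect_markup_language(SAMPLE_CONTENT_TYPES))
```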


ODF Resources
ODF has an architecture similar to the OOXML architecture depicted in Figure 13, again starting with ZIP, XML, and Unicode. ODF also specifies markup languages for documents, spreadsheets, presentations, and common elements such as drawings and equations, but its markup languages are simpler and less expressive than their OOXML counterparts. Figure 14 depicts the primary ODF elements.


Figure 14: ODF Elements (Source: Sun Microsystems)

Some resources for additional ODF information include:
• The OASIS Open Document Format for Office Applications (OpenDocument) TC site
• The OASIS ODF v1.1 Specification
• The ODF Alliance, a community resource site for ODF developers
• OASIS OpenDocument Essentials: Using OASIS OpenDocument XML, a free ODF book by J. David Eisenberg
• http://www.robweir.com/blog/
• http://consortiuminfo.org/standardsblog/
• http://opendocument.xml.org
• http://wiki.services.openoffice.org/wiki/Market_Share_Analysis
• http://wiki.services.openoffice.org/wiki/OpenOffice.org_Solutions


Conclusion
The OpenDocument Format (ODF)/Office Open XML (OOXML) debate is part of a significant phase in the evolution of productivity applications, with the shift to Extensible Markup Language (XML) file formats displacing traditional binary and proprietary file formats. The stakes are huge, with compelling new opportunities for content management, as well as both opportunities and challenges for software vendors. Organizations will gain important benefits by exploiting opportunities to improve information management and reduce vendor dependencies by shifting to XML file formats.


Notes
1 Countries that have mandated the use of ODF by government agencies include Norway and the Netherlands.
2 For example, this document's different file sizes are:
• 1.86 MB: .DOC, Microsoft binary file format created by Microsoft Word 2003
• 1.64 MB: .DOCX, OOXML file format created by Microsoft Word 2003
• 1.57 MB: .ODT, ODF file format created by IBM Lotus Symphony Documents


Author Bio
Peter O'Kelly
Research Director

Emphasis: Communication/collaboration workspaces, applications, and tools; XML and document/database management system convergence; and extensive experience with IBM and Microsoft communication/collaboration products.

Background: 25 years of software career focused on application architectures, XML and database topics, and collaborative applications. Enterprise application developer and data modeler. Software product management, business development, and strategy roles with companies including IBM/Lotus, Groove, and Macromedia. Analyst/author of dozens of reports and articles on Microsoft platform and product strategy, XML standards, DBMS market dynamics, and other topics. Consultant to enterprises and software vendors on topics including application architecture, data/conceptual modeling, and product/company strategy.

Primary Distinctions: Led IBM Lotus Notes product management during the pivotal Notes R4 release. Established product management and competitive strategy for collaborative application pioneer Groove Networks. Macromedia VP Strategy during the launch of its MX product line. Frequent speaker at many major industry events such as Lotusphere and Enterprise 2.0. He is continually quoted and looked to for insight regarding major movements in collaboration and communication fields. Avid blogger since early 2000 (http://pbokelly.blogspot.com/).
