Squaring Off

Despite heated rhetoric over open document formats, there's not much to choose between ODF and Office Open XML

In 2004, the State Department decreed that its official font henceforth would be Times New Roman, a more modern-looking design than its previous official font, the typewriter-esque Courier New. But in updating the look of its correspondence, the agency was also unintentionally locking itself into one vendor for desktop office productivity software, said Red Hat legal counsel Mark Webbink.

Times New Roman, a variant of Times Roman tweaked for computer viewing, is a proprietary font owned by Microsoft. So rendering the true look and feel of State documents would require software from Microsoft or other parties that license the font from Microsoft.

In short, those who don't want to purchase one of these Microsoft-blessed products can't view a document, at least not as it looked to the official who created it. Could State be charged with favoring one vendor? Microsoft competitor Red Hat certainly thinks so.

Such are the perils government agencies face with electronic recordkeeping. Something seemingly as simple as banging out a public memo in Word can raise a number of thorny questions, especially when that memo might be of interest to people generations from now, or even to people today who may not have the same software you do.

Governments 'generate documents that are relevant to citizens and historians perpetually,' said Boston standards lawyer Andrew Updegrove. 'And the history to date leads one to conclude that, for the last 20 years, existing [electronic recordkeeping] systems haven't worked, because there are lots of documents around that are no longer accessible.'

Unlike using pen and paper, using computer software to craft a government document makes an assumption about the software needed to view that document. And many feel government should not be making this assumption.

When he served as co-chair for the CIO Council's XML Community of Practice, Owen Ambur said he had seen requests for information issued by agencies as Word documents. 'To me, that is inexcusable. They are doing that for their own convenience,' he said at one of the group's meetings.

'This is a question of sovereignty of information,' said Bob Sutor, manager of standards at IBM. 'To what degree should a government give control of their information to a vendor? And my answer would be, none.'

Thus far, there are two standards for formats that purportedly tackle this problem, at least for office documents. When you write documents in either of these formats, they are saved as text marked up in the human-readable Extensible Markup Language (XML), which describes both the content and the look and feel of the documents. Storing documents this way ensures, in theory anyway, that they can be read even when the software that created them is not available.
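To make the point concrete, here is a minimal sketch of reading such a document with nothing but general-purpose tools. In practice, both ODF and OOXML files are ZIP packages holding XML parts. The sample below is a heavily simplified stand-in for an ODF text document, not a complete valid file, and the function names are hypothetical, chosen only for illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by ODF for document structure and text content.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

# Simplified sample content (assumption: a real ODF package has more parts
# and more markup than this).
SAMPLE_CONTENT = f"""<?xml version="1.0" encoding="UTF-8"?>
<office:document-content xmlns:office="{OFFICE_NS}" xmlns:text="{TEXT_NS}">
  <office:body>
    <office:text>
      <text:p>Public memo, first paragraph.</text:p>
      <text:p>Second paragraph.</text:p>
    </office:text>
  </office:body>
</office:document-content>
"""


def build_sample_package() -> bytes:
    """Create an in-memory ZIP package with a single content.xml part."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("content.xml", SAMPLE_CONTENT)
    return buffer.getvalue()


def extract_paragraphs(package_bytes: bytes) -> list:
    """Return the text of every paragraph element found in content.xml."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        root = ET.fromstring(package.read("content.xml"))
    return ["".join(p.itertext()) for p in root.iter(f"{{{TEXT_NS}}}p")]


if __name__ == "__main__":
    for paragraph in extract_paragraphs(build_sample_package()):
        print(paragraph)
```

No office suite is involved at any step: a ZIP reader and an XML parser are enough to recover the text, which is the preservation argument in miniature. Recovering the exact look of the page is a harder problem and is not attempted here.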

One of these formats is the Open Document Format for Office Applications (ODF). It was created by vendors and volunteers under the auspices of the Organization for the Advancement of Structured Information Standards and is supported by a variety of comparatively little-used office productivity suites, most notably the open-source Open Office.

The other format is Microsoft's more newly defined Office Open XML (OOXML), which, in all likelihood, will be used only within Microsoft's Office productivity suite as the default format for saving documents.

Even in a field known for hype and rhetoric, the debate between proponents of ODF and OOXML is heated, despite (or maybe because of) the minimal differences between them. 'When it comes to issues of long preservation, there is not much of a difference,' said Laurent Lachal, senior analyst in charge of open-source research at Ovum.

Office space

Just as the Boston Tea Party sparked the American Revolution, so too did Massachusetts' proposed adoption of ODF set off a fierce debate about open-standard office document formats.

In September 2005, the state issued an Enterprise Technical Reference Model that called for using open standards for data retention and exchange. For office documents, it called for use of ODF.

At the time, ODF was about to undergo ratification by the International Organization for Standardization (ISO), a global federation for validating standards. It was subsequently adopted as the default file format for Sun Microsystems' Star Office and its open-source equivalent, Open Office. The K Desktop Environment Project's Koffice open-source office suite, IBM's Workplace Managed Client collaboration suite and Corel's WordPerfect all support ODF. Microsoft Office 2003 and 2007 do now, too, via plug-ins.

Massachusetts' Chief Information Officer Peter Quinn indicated two reasons behind the move to ODF: It would allow the state to better keep permanent records, and it would allow documents to be shared across more applications more easily.

The action, however, raised considerable controversy. For office productivity tasks, most of Massachusetts' agencies used a version of Microsoft Office, which at the time did not offer the ability to save documents in ODF. Consequently, Microsoft executives saw the decision to use ODF as a move to lock Microsoft out of the market. The company lobbied the state legislature to remove the mandate for ODF, and the ensuing back-and-forth proved too much for Quinn, who left the office in 2006.

Since then, various other governments have proposed adoption of ODF, with varying degrees of success. In May, New York state introduced a bill to study the feasibility of using open-document formats. Florida, California, Connecticut, Oregon and Texas introduced similar bills, all of which were killed or stalled in committee. Only Minnesota successfully passed into law a requirement for the state's information technology department to look into using ODF, downgraded from a mandate to actually use it.

Microsoft document formats haven't remained stagnant, either. With the release of Office 2007, the company introduced Office Open XML, its own set of XML-based file formats, as the default formats for documents produced by the application suite.

And like ODF, OOXML has been through a standards body. Ecma International has approved it as a standard, and it is in the process of getting ISO approval. Tom Robertson, general manager of interoperability and standards at Microsoft, said it was user requests that pushed Microsoft to send the Office formats through the standardization process.

The release and standardization of Office Open XML is a major challenge for ODF proponents. ODF's biggest handicap was that it was not supported by the dominant office productivity suite, Microsoft Office, so adopting ODF necessitated a complex switch in software. Still, ODF at least had a lock on the vendor neutrality issue. Now that OOXML serves the same need and already comes with Microsoft Office (the new version, anyway), is there a need for ODF at all?

Apples and...apples?

There are differences between ODF and OOXML, but advocates on both sides of the debate usually admit the differences are not essential, at least not when it comes to interoperability and data preservation.

People in the ODF camp certainly have plenty to say about the shortcomings of OOXML. In a white paper posted last month, Sam Hiser, founder of the OpenDocument Foundation, noted several areas where OOXML fell short.

For instance, Hiser said that even though Ecma oversees OOXML, Microsoft retains control over the standard. This means that no changes can be made without Microsoft's buy-in.

OOXML 'provides a facade of openness,' Sutor said. 'Whatever happens within Ecma must be 100 percent compatible with Microsoft Office. It seems exceedingly odd to say that here is this open effort that you people must work on, but you must contain total compatibility with my product.'

Robertson said that the Ecma process is a collective one. 'The keys to the standard are now in the Ecma community,' he responded.

The GrokLaw Web site, a legal discussion forum overseen by Pamela Jones, is also compiling a list of OOXML shortcomings. Among the perceived failings is that OOXML contradicts established standards for representing basic elements such as dates, graphics and mathematical formulas. Another complaint: Sections of the standard rely on non-XML formatting codes, making pieces of a document unreadable by XML parsers.

Sutor echoed this concern. 'I'm concerned that there will be things [in the standard] that will link OOXML specifically to applications within Windows. If I were a CIO, I'd say you must not do anything in your documents that ties it to one platform.'

Microsoft denies that there are areas of OOXML that are not based on XML. But there are OOXML elements that might not be supported by other software or other formats, such as ODF. The company helped fund a project to translate documents from OOXML to ODF, but found ODF could not render all the aspects of Office documents. 'It is possible for us to export all the components of XML, but there is the question of whether they could be transferred into ODF. There has to be a two-way understanding,' Robertson said.

'All the features in Office get expressed in OOXML. The problem is once you have this in Open XML, how much can you translate into ODF, and that is where some features can't be translated, or translated in full fidelity,' added Jean Paoli, Microsoft's general manager of interoperability and XML architecture.

Likewise, ODF has been accused of its own shortcomings. For one, it lacks standard ways of rendering some widely used, advanced features in documents, such as macros, tables and symbols for mathematical expressions.

For spreadsheets, there is no uniform way to enter formulas. As developer Morten Welinder said in his blog, ODF 'completely ignores semantics of spreadsheets, thus virtually ensuring that if two implementations of the standard existed, then they could not use each other's spreadsheets. That, in turn, renders the alleged standard pointless for spreadsheets.'

In fact, some have noted that the ODF standard itself is way too short to be of much use for vendors trying to build their own ODF software. The OOXML specification is more than 6,000 pages long; ODF's contains only 722 pages.

ODF is not well-enough defined to support applications that can be fully interoperable, said Miguel de Icaza, developer of the Gnome Linux desktop and the Novell Mono project. As an example, he said OOXML devotes 324 pages to documentation of formulas and functions, and ODF has around 10 pages devoted to the subject.

'There is no way you could build spreadsheet software based on this specification,' de Icaza wrote on his blog. 'To build a spreadsheet program based on ODF, you would have to resort to an existing implementation source code...or you would have to resort to Microsoft's public documentation, or, ironically, to the OOXML specification.'
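The interoperability worry raised by Welinder and de Icaza can be sketched in a few lines. The example assumes that a format stores a cell formula as an opaque string and leaves its syntax undefined. The two formula spellings and the evaluator below are hypothetical illustrations, not excerpts from either specification or from any real product.

```python
import re
from typing import Optional

# Cell values shared by both hypothetical applications.
CELLS = {"A1": 2.0, "A2": 3.0, "A3": 5.0}

# Two hypothetical spellings of the same sum, as two different applications
# might write them if the format does not pin down formula syntax.
FORMULA_FROM_APP_ONE = "=SUM(A1:A3)"
FORMULA_FROM_APP_TWO = "=SUM([.A1:.A3])"

# The only syntax this toy evaluator understands: =SUM(A<first>:A<last>)
_PLAIN_SUM = re.compile(r"^=SUM\(A(\d+):A(\d+)\)$")


def evaluate_plain_sum(formula: str) -> Optional[float]:
    """Evaluate a column-A SUM in the first dialect; return None otherwise."""
    match = _PLAIN_SUM.match(formula)
    if match is None:
        return None  # Unknown dialect: the stored formula cannot be used.
    first, last = int(match.group(1)), int(match.group(2))
    return sum(CELLS[f"A{row}"] for row in range(first, last + 1))


if __name__ == "__main__":
    print(evaluate_plain_sum(FORMULA_FROM_APP_ONE))
    print(evaluate_plain_sum(FORMULA_FROM_APP_TWO))
```

Both strings are well-formed text inside a well-formed file, yet only one of them means anything to this reader. A shared container format does not by itself guarantee that two spreadsheet programs can exchange working formulas.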

Overall, though, many feel these are only minor shortcomings on each side. 'No specification will be 100 percent accurate when it comes to implementation,' Lachal said. In other words, both ODF and OOXML have shortcomings in dealing with issues such as formatting dates. But both do what they set out to do with fairly equal success.

So is there much difference between the two? 'There really isn't,' Updegrove said, adding that the true difference is the effect that each standard could have in the marketplace. ODF has less detail, but in turn it 'offers a lot of flexibility to create a lot of products, which enables head-to-head competition.' OOXML, on the other hand, is targeted more at reproducing Office documents. 'Microsoft does not intend nor want nor permit anyone to clone Office,' he said. 'It allows a greater degree of utility to end users to work with Office. It makes it easier for other products to work with Office.'

Microsoft has agreed that the two formats are different. 'ODF and OOXML were designed with different purposes in mind,' Robertson said. 'OOXML was designed to map to the feature set in Office 2007 and to be backward-compatible with documents created in the binary format.' In contrast, ODF was designed to be more like a general office document markup, the way HTML is general markup for Web pages. 'This is not to say one necessarily is better than the other. They were designed for different purposes, and they will be used in different ways.'

But Updegrove said encouraging competition in the marketplace may not be one of the government's chief goals, though he feels it is one worth pursuing.

When it comes to document readability, the ODF and OOXML approaches both are valid, Lachal said. The key is that they are both open standards. 'In the grand scheme of things, the technical differences will not make that much of a difference.'
