File Format: Hidden traps in OpenDocument (or any other open standard) and how to avoid them

(Note: this post is an excerpt from, and a follow-up to, an article published in December 2006 in the monographic issue on the OpenDocument Format by Upgrade, the online version of the Spanish magazine Novatica. The whole monograph can be read online.)

It is almost certain that, eventually, all major producers of both proprietary and Free software in the office files space will support OpenDocument. In and of itself, however, that standard leaves room for several ways to keep monopolies alive, or to nullify its usefulness for long-term archiving.

Technically speaking, OpenDocument is very powerful and useful because it can be extended. The standard doesn’t mandate, however, nor should it, that all extensions be licensed in the same way as the standard itself. Even ignoring future extensions, the standard as it is today has plenty of backdoors for proprietary traps. Some examples are (see the full Novatica article for details):

  • digital signatures
  • macros
  • images, audio or any other multimedia object embedded in texts, spreadsheets and presentations
  • in-file databases

Objects of this kind can be placed inside an OpenDocument file even if their format is accessible only through patent-covered or otherwise proprietary software. Nothing in the OpenDocument specification prevents this (and, again, it shouldn’t!).

The practical consequence is that it is possible to have a perfectly Free as in Freedom XML container which is full of patent-ridden components. A container, in other words, which is culturally, economically and politically useless for guaranteeing long-term preservation of information, public ownership of public documents or a really free market in the software industry.
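To make this concrete: an OpenDocument file is a ZIP package whose META-INF/manifest.xml declares a media type for every component it carries. The Python sketch below reads that manifest and lists the components whose declared type is neither an OpenDocument type nor on a short allowlist. The allowlist, the function names and the sample package are all assumptions made up for illustration, not an official registry or tool. A declared media type also says nothing about patents or licenses, and it can be mislabeled, so at best this is a first filter for a human reviewer.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by META-INF/manifest.xml in OpenDocument packages.
MANIFEST_NS = "urn:oasis:names:tc:opendocument:xmlns:manifest:1.0"
ODF_PREFIX = "application/vnd.oasis.opendocument."

# Assumption: an illustrative allowlist, NOT an official registry of open formats.
ASSUMED_OPEN_TYPES = {"", "text/xml", "application/rdf+xml", "image/png", "image/svg+xml"}


def list_manifest_entries(odf_file):
    """Return (path, media_type) pairs declared in the package manifest."""
    with zipfile.ZipFile(odf_file) as archive:
        root = ET.fromstring(archive.read("META-INF/manifest.xml"))
    entries = []
    for entry in root.iter(f"{{{MANIFEST_NS}}}file-entry"):
        path = entry.get(f"{{{MANIFEST_NS}}}full-path")
        media_type = entry.get(f"{{{MANIFEST_NS}}}media-type", "")
        entries.append((path, media_type))
    return entries


def flag_unlisted_entries(entries, allowed=ASSUMED_OPEN_TYPES):
    """Return entries whose declared media type is neither ODF nor allowlisted."""
    return [
        (path, media_type)
        for path, media_type in entries
        if not media_type.startswith(ODF_PREFIX) and media_type not in allowed
    ]


def build_sample_package():
    """Build a tiny ODF-like ZIP in memory (hypothetical content, demo only)."""
    manifest = f"""<?xml version="1.0" encoding="UTF-8"?>
<manifest:manifest xmlns:manifest="{MANIFEST_NS}">
 <manifest:file-entry manifest:full-path="/" manifest:media-type="{ODF_PREFIX}text"/>
 <manifest:file-entry manifest:full-path="content.xml" manifest:media-type="text/xml"/>
 <manifest:file-entry manifest:full-path="Pictures/logo.png" manifest:media-type="image/png"/>
 <manifest:file-entry manifest:full-path="Media/clip.wmv" manifest:media-type="video/x-ms-wmv"/>
</manifest:manifest>"""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # Only the manifest is written: the check below never opens the payloads.
        archive.writestr("META-INF/manifest.xml", manifest)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    entries = list_manifest_entries(build_sample_package())
    for path, media_type in flag_unlisted_entries(entries):
        print(f"needs review: {path} ({media_type})")
```

On a real document, list_manifest_entries can be given the path of an .odt, .ods or .odp file instead of the in-memory sample. The package itself remains perfectly valid OpenDocument either way, which is exactly the point made above.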

If anything, the fact that an office file standard is not owned and controlled by one vendor may make it even easier, not harder, for proprietary extensions to appear and keep end users locked in, at least in some scenarios.

Does this mean that OpenDocument is useless?

Not at all. Personally, I am still convinced that OpenDocument is by far the best possible solution for a very serious problem. To the best of my knowledge, OpenXML is still much worse than OpenDocument, both in terms of feasible support in third-party applications and in terms of the room it leaves for reinventing the wheel and for unnecessary proprietary extensions. For these reasons, I remain convinced that it is necessary, at least for the creation of new public documents, to just say no to OpenXML (available to unregistered users by the end of April).

At the same time, I am convinced that it is necessary, at least in the public sector, to stop just “switching to OpenDocument” and feeling happy about it without looking around the corner. I believe that further steps need to be taken, steps which, by the way, are not specific to OpenDocument.

What is the right solution?

OK, so “100% OpenDocument (or OpenXML) Compliance” isn’t enough to guarantee that an OpenDocument report or law proposal stored today will be completely readable and usable 20 or more years from now. The real solution, however, is not a technical one. Technical ways to apply it once it exists are available, and they are mentioned in my Novatica article.

That said, this is not a format specification issue. When present, technical extensibility of a standard is (and must remain) neutral with respect to intentions. It would be very inefficient, if not plain wrong, to place specifications of a legal nature inside what must remain purely technical documents.

What I believe to be necessary is to establish and enforce:

  • in the first place, some official “OpenFile” trademark or equivalent label which can be legally applied only to files in which no component has a restrictive license or incomplete documentation
  • immediately after that, laws requiring that OpenDocument files can be stored by, or exchanged with, public administrations, libraries and so on only if they carry this “OpenFile” seal. Exceptions to this rule should be granted temporarily, and only in truly exceptional cases where there is no alternative

What do you think? Are these conditions enough? Who should define the “label”? Governments or standards bodies? Who should enforce its usage? Which exceptions could or should be tolerated? Please let me know: I am very interested to hear your opinion and to participate in any future discussions on these issues!

(Thanks to Roberto for suggesting that I write this post and for hosting it!)