OpenOffice.org Italian Association: Final comments to the proposed Microsoft Office Open XML Standard

Trieste, 17 July 2007 – The PLIO Association has thoroughly analyzed the Microsoft Office Open XML standard. Having read its more than 6000 pages, PLIO’s experts believe that the format should be substantially revised before being approved as a standard.

The PLIO Association appreciates the effort and commitment Microsoft has shown in declaring its willingness to create a task force for the development of a reference implementation for OOXML. However, this implicitly admits that a reference implementation is missing, which creates a problem for any would-be OOXML implementor other than Microsoft itself.

If the proposed OOXML file format follows the ISO standards track in order to address the problems that are still open, the PLIO Association is interested in becoming a member of the OOXML reference implementation task force.

Associazione PLIO: final comments to the proposed
Microsoft Office Open XML (OOXML) standard file format

The document is arranged in sections:

  • Availability of a reference implementation

  • IPR

  • Issues relating to operating system and application dependencies

  • Option-specific comments on OOXML that are also relevant to IPR

  • Non-conformance with numerous international standards

Reference implementation

Associazione PLIO appreciates the effort and commitment Microsoft has shown in declaring its willingness to create a task force for the development of a reference implementation for OOXML. However, this implicitly admits that a reference implementation is missing, which creates a problem for any would-be OOXML implementor other than Microsoft itself.

If the proposed OOXML file format follows the ISO standards track in order to address the problems that are still open, Associazione PLIO is interested in becoming a member of the OOXML reference implementation task force.

IPR

The ISO/IEC Directives, Part 1, clause 2.14, “Reference to patented items”, state: “2.14.2 If technical reasons justify the preparation of a document in terms which include the use of items covered by patent rights, the following procedures shall be complied with:

a) The originator of a proposal for a document shall draw the attention of the committee to any patent rights of which the originator is aware and considers to cover any item of the proposal. Any party involved in the preparation of a document shall draw the attention of the committee to any patent rights of which it becomes aware during any stage in the development of the document”.

Apart from the fact that patented items should be accepted only in exceptional situations (see 2.14.1, “If, in exceptional situations, technical reasons justify such a step”), several mandatory parts of the standard are not described in the document text but only referenced. Such parts are not explicitly covered by the Microsoft patent agreement, nor by any published notes or guidance on Microsoft patents referenced in the document.

The fact that Microsoft claims potential patent infringement at public events is sufficient to trigger clause 2.14.2(a) and to require an explicit reference, in the text of the proposed standard, to the patents that an implementation may infringe. Microsoft should therefore prepare a statement showing that the submitter is willing to enter into a RAND agreement covering the intellectual property that potentially applies to the proposed standard.

The fact that the Microsoft OCP says “No other rights except those expressly stated in this covenant shall be deemed granted, waived or received by implication or estoppel or otherwise” is not compatible with the ISO directives, and the text should be amended or integrated within the proposed standard. Thus, to be eligible for the ISO evaluation process, the submitter should include an explicit list of potentially infringed patents and a letter showing willingness to license the covered IPR on RAND terms.

The clarification issued by Microsoft on this subject unfortunately adds to the confusion instead of reducing it.

As pointed out by many sources, the most problematic part of the OSP is:

“Microsoft Necessary Claims” are those claims of Microsoft-owned or Microsoft-controlled patents (…). Which are those patents? Is there a tentative good-faith list of them? Since the standard was conceived by Microsoft, one would expect Microsoft to know which of its patents apply. Either it claims those patents, while promising not to sue anybody, as required by the standard, or, by not claiming them, it cannot later sue implementers for infringement. By not identifying the patents, Microsoft leaves room for uncertainty, especially in the light of what follows.

(…) that are necessary to implement only the required portions of the Covered specifications (…): this excludes the permitted values, which are valid values for an implementation’s format. A standard may either mandate or permit values, but it must make sure that the permitted values are, as far as possible, available to all implementations on equal, reasonable and non-discriminatory terms or, better, it should deprecate those that do not meet such requirements. Here Microsoft, in its reply, seems to confuse the need to include the representation of external graphic files with the possibility of including binary data types, including those derived from binary Microsoft Office formats.

(…) and not merely referenced in such Specification: therefore only the mandated data types are covered. If a given implementation of the standard produces an OOXML-compliant document containing a merely referenced data type, such as a data type replicating an old Microsoft Word file (.doc), that data type would not be “mandated”, i.e., another OOXML implementation would not be required to use that data structure. That data type is therefore not covered by the promise.

This was roughly the concern raised by critics of Microsoft’s covenant not to sue. Microsoft has now issued this clarification.

We welcome the clarification that a partial implementation would also benefit from the covenant not to sue. However, the wording of the promise itself contains no clear reference to this. Microsoft should perhaps amend the filed document accordingly, rather than simply issuing a clarification whose binding effect remains untested.

However, the clarifications do not address the concerns above. In the second-last and third-last paragraphs, the ambiguity is considerable. Microsoft asserts that pre-Office Open XML binary file formats are a possible value that a component could have, and not a mandatory value. This remains an obstacle to the implementation of ECMA 376 as a standard, because it creates asymmetry: the standard permits a conforming implementation to write a file which another conforming implementation would not be legally allowed to read and write correctly, because the data type is outside the mandatory part of the Specification. This requires an additional permission, which is ostensibly available to all who request the further specification for the binary file. In other words, the permission is not contained in the standard documents but must be requested separately, which contradicts the rules for standardization.

In practice, we have tested the request process that Microsoft provides, and we did not receive what we requested, i.e., “the binary file formats specifications both for Office 2000/XP and Office 2003 applications”, but only the Office 2007 specifications.

It is still our opinion that, by permitting large parts of binary formats that are not covered by the promise not to sue and not fully documented, and which will invariably be used by the dominant Office application, the standard will fall well short of its main goal of being a truly multi-vendor standard.

Operating system and application dependencies

OOXML defines an ST_CF type, which records the allowed clipboard formats that may be used with a graphical object. The allowed values of this type, EMF, WMF, etc., are all built-in proprietary Windows formats. No allowance has been made for use by other operating systems. For example, in Linux images are typically copied to the clipboard in an open standard format like PNG. But if a vendor encodes “PNG” into a document record of this type, the document will be invalid, and neither the document nor the application will conform to the OOXML specification.

The “optimizeForBrowser” element of WordprocessingML has been defined in a way that ignores the existence of current browsers other than Internet Explorer. It is not possible to set other browsers as target browsers. This section in OOXML requires that “all settings which are not compatible with the target web browser shall be disabled”.
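
The effect of such a closed enumeration can be sketched as follows. This is only an illustration, not the schema's actual validation logic: the set holds just the two values named above (the real ST_CF list is longer and is not reproduced here), and `is_valid_clipboard_format` is a hypothetical helper.

```python
# Only the two formats named in the text above; the real ST_CF enumeration
# contains further values that are deliberately not reproduced here.
KNOWN_ST_CF_VALUES = frozenset({"EMF", "WMF"})


def is_valid_clipboard_format(value: str) -> bool:
    """Return True if value is one of the enumerated clipboard formats."""
    return value in KNOWN_ST_CF_VALUES


if __name__ == "__main__":
    for candidate in ("EMF", "WMF", "PNG"):
        status = "accepted" if is_valid_clipboard_format(candidate) else "rejected"
        print(f"{candidate}: {status}")
```

Because the list is fixed in the schema, a producer on another platform has no conforming way to record a format that is not on the list.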

What if an implementor wants an application to produce standards-compliant output, that is, yes to PNG, no to VML, yes to MathML and SVG? The way OOXML has been designed, a would-be implementor cannot specify this.

OOXML recommends that print settings (number of pages to print, which pages to print, orientation, print quality, etc.) be stored in a platform-specific binary format. For example, on Windows the guidance is to store them in what is called the “DEVMODE” structure. Doing so would render the print settings platform-dependent and prevent interoperability.

“For legacy reasons, an implementation using the 1900 date base system shall treat 1900 as though it was a leap year… A consequence of this is that for dates between January 1 and February 28, WEEKDAY shall return a value for the day immediately prior to the correct day, so that the (non-existent) date February 29 has a day-of-the-week that immediately follows that of February 28, and immediately precedes that of March 1”. This bug, according to the OOXML text itself, reproduces the same bug found in Excel. The result is that all would-be implementors of OOXML are required to have their applications give users incorrect answers to questions like “What day of the week is February 1st, 1900?”, creating interoperability problems with relational DBMSs, which use the Gregorian calendar.

The binary part referred to in Part 1, section 12.3.5, is said to be used for the storage of “arbitrary user-defined data”. No further detail is given as to what user action would trigger the use of this “user-defined” data. Without further definition, no interoperability of this feature is possible.

OOXML describes how to attach a QuickTime video to a presentation object. No description of the QuickTime format is provided. Without a specified version and list of supported codecs, there will be no interoperability.
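
The discrepancy can be reproduced with a short sketch. It rests on assumptions that are not taken from the OOXML text: serial number 1 is taken to be 1 January 1900, weekdays are numbered 1 = Sunday to 7 = Saturday, and the function names are hypothetical. It only handles dates from 1 January 1900 onward.

```python
import calendar
from datetime import date

EPOCH = date(1899, 12, 31)       # assumed: the day before serial number 1
PHANTOM_LEAP_DAY_SERIAL = 60     # slot taken by the non-existent 29 Feb 1900
DAY_NAMES = ["Sunday", "Monday", "Tuesday", "Wednesday",
             "Thursday", "Friday", "Saturday"]


def legacy_serial(d: date) -> int:
    """Serial number of d in a 1900 date system that counts 29 Feb 1900."""
    days = (d - EPOCH).days
    # Every real date from 1 March 1900 onward is pushed one slot later.
    return days + 1 if days >= PHANTOM_LEAP_DAY_SERIAL else days


def legacy_weekday(d: date) -> int:
    """Weekday (1 = Sunday .. 7 = Saturday) derived from the legacy serial."""
    return (legacy_serial(d) - 1) % 7 + 1


def gregorian_weekday(d: date) -> int:
    """Weekday (1 = Sunday .. 7 = Saturday) from the Gregorian calendar."""
    return d.isoweekday() % 7 + 1


if __name__ == "__main__":
    print("Is 1900 a Gregorian leap year?", calendar.isleap(1900))
    for d in (date(1900, 2, 1), date(1900, 3, 1)):
        legacy = DAY_NAMES[legacy_weekday(d) - 1]
        actual = DAY_NAMES[gregorian_weekday(d) - 1]
        print(f"{d.isoformat()}: legacy={legacy}, Gregorian={actual}")
```

Under these assumptions the two weekday functions disagree for January and February 1900 and agree from 1 March 1900 onward, which is the behavior the quoted passage mandates.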

Options specific comments

OOXML allows implementations to insert content in alternate file formats such as RTF. RTF is a proprietary Microsoft format. Microsoft can support old binary documents simply by embedding the RTF content, but other implementors cannot reliably support those documents because the specification for RTF is not standardized and not included in OOXML.

Is this recommending that a non-public, internal only, work-for-hire application author create “publicly available documentation” on what subset of the standard it supports? The business relationship between the software author and his customer should not be a concern of this standard.

The same applies to MHTML and the other MS Office formats.

This requires that a conforming OOXML consumer also be able to understand a specified list of other document formats, including proprietary ones such as MHTML and RTF, and for conforming producers to understand how to convert these formats to OOXML.

Several elements are not required by the standard, but if omitted lead to “application-defined” default behaviors – a completely unnecessary barrier to interchange between applications (causing the same document with “default” styles to appear completely different in two conforming programs), as opposed to simply defining the defaults in the standard.

For example, Part 4 – section 2.7.4 defines elements to specify default paragraph and run properties (docDefaults, pPr, pPrDefault, rPr, and rPrDefault). If these are omitted “the defaults are therefore application-defined”.

A series of definitions are intrinsically based upon material (like autoSpaceLikeWord95, lineWrapLikeWord6, shapeLayoutLikeWW8, and many other features) that are not part of the OOXML submission, and that are not part of any known standard either.

Non-conformance with several international standards

Ecma 376 contradicts several international standards:

  • The Gregorian Calendar: section 3.17.4.1 page 3305, “Date Representation”, conflicts with the Gregorian calendar in the calculation of dates. Specifically, it requires spreadsheet implementations to incorrectly handle the year 1900 as a leap year. This contradicts the Gregorian calendar, ISO 8601 and the civil calendar adopted by most nations of the world.

  • ISO 8601 (Representation of dates and times): section 3.17.4.1 page 3305, “Date Representation” stipulates that dates must be represented as numeric codes counting from 1900 or 1904. This is in conflict with ISO 8601. This section also forbids applications from supporting years before 1900, also in conflict with ISO 8601.

  • ISO 639 (Codes for the Representation of Names of Languages): section 2.18.52 page 2530, ST_LangCode requires the use of a fixed list of numeric language codes rather than the already existing set provided by ISO 639. This is a conflict with ISO 639. The codes standardized by ISO 639 include the use of a Registration Authority to process requests for new language codes. This is preferable to a fixed list attached to a document standard.

  • ISO/IEC 8632 (Computer Graphics Metafile): section 6.2.3.17 page 5679, “Embedded Object Alternate Image Requests Types” and section 6.4.3.1 page 5738, “Clipboard Format Types” refer to Windows Metafiles or Enhanced Metafiles instead of using ISO/IEC 8632 or W3C SVG.

  • ISO/IEC 26300:2006 (OpenDocument Format for Office Applications): OOXML duplicates the functionality of the existing OpenDocument standard, since its core purpose is likewise to support text documents, spreadsheets, drawings and presentations for office applications.

  • W3C SVG (Scalable Vector Graphics): section 14 page 132, “DrawingML” defines a vector drawing XML format in conflict with the industry standard W3C SVG. Section 8.6.2 page 24, “VML”, requires support for another drawing XML format in conflict with W3C SVG. Note that VML was proposed by Microsoft as a W3C standard in 1998, but was rejected in favour of SVG.

  • W3C MathML (Mathematical Markup Language): section 7.1 “Math” (page 747) covers mathematical expressions, and defines a format in conflict and incompatible with the W3C Recommendation MathML.

  • ISO/IEC 10118-3, W3C XML-ENC, and other cryptographic hash standards: OOXML ignores accepted standards for cryptographic hashes and goes against expert practice in cryptography by proposing its own hash algorithms. International standards and related algorithms that have been ignored include ISO 10118-3, SHA1, SHA256, SHA384, SHA512, RIPEMD-160 and MD5, which are proposed by organizations like W3C, NESSIE, NIST and CRYPTREC.

  • W3C SMIL (Synchronized Multimedia Integration Language): section 4.4 “Animation” (page 565) covers presentation animations (slide transitions), in conflict with the W3C Recommendation SMIL.
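
The ISO 8601 point in the list above can be illustrated with a minimal sketch. It assumes that serial 1 in the 1900 system is 1 January 1900 (with the phantom 29 February 1900 occupying serial 60) and that serial 0 in the 1904 system is 1 January 1904. Neither assumption is quoted from Ecma 376, and the helper names are hypothetical.

```python
from datetime import date, timedelta

EPOCH_1900 = date(1899, 12, 31)  # assumed: the day before serial 1 (1900 system)
BASE_1904 = date(1904, 1, 1)     # assumed: serial 0 (1904 system)


def from_1900_serial(serial: int) -> date:
    """Decode a 1900-system serial; only serials after the phantom leap day."""
    if serial < 61:
        raise ValueError("this sketch only handles serials from 1 March 1900 on")
    # Subtract one day to skip the non-existent 29 February 1900.
    return EPOCH_1900 + timedelta(days=serial - 1)


def from_1904_serial(serial: int) -> date:
    """Decode a 1904-system serial."""
    if serial < 0:
        raise ValueError("negative serials are not handled in this sketch")
    return BASE_1904 + timedelta(days=serial)


if __name__ == "__main__":
    serial = 35981
    # The same number names two different days, depending on the date base.
    print("1900 base:", from_1900_serial(serial).isoformat())
    print("1904 base:", from_1904_serial(serial).isoformat())
    # An ISO 8601 string needs no out-of-band date base to be interpreted.
    print("ISO 8601 :", date(1998, 7, 5).isoformat())
```

A bare serial number is meaningful only together with the date base in use, whereas an ISO 8601 string carries the full calendar date by itself.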

PLIO, the OpenOffice.org Italian Native-Lang Project, is the Italian community of volunteers who develop, support and promote the open-source office productivity suite OpenOffice.org. OpenOffice.org supports the Open Document Format for Office Applications (standard ISO/IEC 26300) and is available on major computing platforms in over 90 languages, making it available to 90% of the world’s population in their own mother tongue.
OpenOffice.org is provided under the GNU Lesser General Public Licence (LGPL) and can be legally used in any context.

PLIO, Progetto Linguistico Italiano OpenOffice.org:
http://it.openoffice.org
“Fly, and let others fly, with the OpenOffice.org seagulls: use it, copy it and give it away, it’s legal!”
For further information: Italo Vignoli (+39.348.5653829), stampa@openoffice.org