[LINK] Office Open XML Ecma Standard

Tom Worthington Tom.Worthington at tomw.net.au
Mon Dec 11 08:53:16 AEDT 2006


On 7 December 2006 
http://www.ecma-international.org/news/PressReleases/PR_TC45_Dec2006.htm 
, Office Open XML (OpenXML) was adopted as Ecma standard 376. Ecma 
has also submitted it for fast-track adoption by ISO/IEC JTC 1.

OpenXML is based on Microsoft's Office Open XML and is an adaptation of 
Microsoft Office's word-processing, presentation, and spreadsheet 
formats to XML. It is similar to Sun Microsystems' OpenOffice format, 
which has already been adopted as an ISO standard. There is a useful 
overview of Office Open XML and a comparison with OpenOffice in 
Wikipedia 
<http://en.wikipedia.org/wiki/Microsoft_Office_Open_XML>.

Unfortunately, while Ecma's announcement says the documents can be 
downloaded from its web site, I was unable to find the approved 
standard 376 in the list 
<http://www.ecma-international.org/publications/standards/Standard.htm>. 
But presumably the standard is close to the final draft of 9 October 
2006 
<http://www.ecma-international.org/news/TC45_current_work/TC45-2006-50_final_draft.htm>. 
In addition there is an overview by the Ecma committee 
<http://www.ecma-international.org/news/TC45_current_work/OpenXML%20White%20Paper.pdf>.

The standard is divided into five parts:

     Part 1 - Fundamentals
     Part 2 - Open Packaging Conventions
     Part 3 - Primer
     Part 4 - Markup Language Reference
     Part 5 - Markup Compatibility and Extensibility

The standard is provided in PDF, Tagged PDF and "WordprocessingML" 
formats (WordprocessingML is the OpenXML word processing format). The 
document is not provided in HTML format as ordinary web pages, which 
will severely limit access to it.

Like OpenOffice, OpenXML uses the zip format to bundle up the text of 
a document in XML format with any images and other binary files into 
a compressed file. As an example, the "Fundamentals" part of the 
standard in OpenXML format is one 240 kbyte zipped file 
<http://www.ecma-international.org/news/TC45_current_work/Office%20Open%20XML%20Part%201%20-%20Fundamentals_final.docx>. 
When unzipped it contains 29 files, totalling 2.4 Mbytes: three 
PNG images and the rest XML. The main text of the document is in one 
1.6 Mbyte file ("document.xml"), with various formatting and 
references in other small files.
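
For anyone who wants to repeat this check on a downloaded copy, here is 
a minimal sketch using Python's standard zipfile module. The function 
name summarize_package is my own invention for illustration and is not 
part of any OpenXML tool. It only relies on the package being an 
ordinary zip archive.

```python
import zipfile


def summarize_package(source):
    """Return (file_count, total_uncompressed_bytes, names) for a zip package.

    `source` may be a path or a file-like object. The name of this helper is
    hypothetical; it treats the package as a plain zip archive and nothing more.
    """
    with zipfile.ZipFile(source) as package:
        # Skip directory entries so only real files are counted.
        entries = [info for info in package.infolist() if not info.is_dir()]
        total = sum(info.file_size for info in entries)  # uncompressed sizes
        names = [info.filename for info in entries]
    return len(entries), total, names
```

Calling summarize_package on a local copy of the .docx file returns the 
number of files, their combined uncompressed size in bytes, and their 
names, which is enough to compare against the figures above.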

Assuming the IT community accepts Microsoft's assurances that it 
will continue to make use of the format freely available, it should 
prove popular. However, neither OpenXML nor OpenOffice is compatible 
with a web browser, and both face their biggest challenge from web 
standards. After an author prepares a document using OpenXML or 
OpenOffice, they will most likely then have to render it in other 
formats for distribution, such as PDF and HTML.

Newer XHTML standards are providing more of the formatting expected 
for word processing documents, while providing backward compatibility 
with web browsers. A word processor which uses an XHTML format as its 
native format would provide the capability of simply saving the 
document to the web for distribution. There would be no need to 
convert to PDF or HTML. There would also be scope for better 
integration with web tools, such as blogs, wikis and feeds.

The creation, promotion and distribution of a new word processing 
package was previously a major undertaking. However, AJAX (Web 2.0) 
based office packages could quickly render irrelevant the debate as 
to whether OpenXML or OpenOffice is better, by superseding them both.

Ecma's overview of OpenXML illustrates the strengths and 
weaknesses of both its approach and that of OpenOffice:

"OpenXML was designed from the start to be capable of faithfully 
representing the pre-existing corpus of word-processing documents, 
presentations, and spreadsheets that are encoded in binary formats 
defined by Microsoft Corporation. The standardization process 
consisted of mirroring in XML the capabilities required to represent 
the existing corpus, extending them, providing detailed 
documentation, and enabling interoperability. At the time of writing, 
more than 400 million users generate documents in the binary formats, 
with estimates exceeding 40 billion documents and billions more being 
created each year."

This is a wrong-headed approach to the creation of an electronic 
document standard. The priority for word processing documents has 
been to reliably produce printed documents which look identical. 
However, the production of printed documents is now a very small part 
of what a word processor is used for and should not be the priority. 
Most documents are used for on-screen electronic viewing. Exact 
reproduction of a printed format is exactly what is NOT needed. As a 
result word processing documents have to be converted into other 
formats for use. As an example, the OpenXML standard is provided in 
three formats: PDF for printing, Tagged PDF for on-screen viewing and 
WordprocessingML. None of these formats is particularly suitable for 
on-screen viewing.

A new approach is needed where the document format is designed for 
on-screen viewing with a web browser, and then the additional 
features needed for printing are added. This can be done with XHTML.
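
As a rough illustration of what I mean, the sketch below holds a small 
XHTML document in which the screen layout is the default and the 
print-specific rules are added in a CSS "@media print" block. The 
sample document, the class name screen-only and the helper function 
headings are all made up for this example. The Python code only checks 
that the text is well-formed XML and pulls out the headings. It does 
not validate against any XHTML DTD.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical sample: screen-first markup, with print-only rules layered on
# through a CSS media block. The DOCTYPE is omitted to keep the sketch short.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Sample report</title>
    <style type="text/css">
      body { max-width: 40em; font-family: sans-serif; }
      @media print {
        body { max-width: none; font-family: serif; }
        h1 { page-break-before: always; }
        .screen-only { display: none; }
      }
    </style>
  </head>
  <body>
    <h1>Introduction</h1>
    <p>The same file is read in a browser and sent to a printer.</p>
    <p class="screen-only"><a href="#top">Back to top</a></p>
  </body>
</html>"""


def headings(xhtml_text):
    """Parse the text as XML and return the text of each h1 element."""
    root = ET.fromstring(xhtml_text)  # raises ParseError if not well-formed
    return [h.text for h in root.iter("{%s}h1" % XHTML_NS)]
```

Because the document is plain XML, ordinary XML tools can process it, 
while a web browser can display it directly without any conversion.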

Blog version at: <>.



Tom Worthington FACS HLM tom.worthington at tomw.net.au Ph: 0419 496150
Director, Tomw Communications Pty Ltd            ABN: 17 088 714 309
PO Box 13, Belconnen ACT 2617                http://www.tomw.net.au/
Visiting Fellow, ANU      Blog: http://www.tomw.net.au/blog/atom.xml  



