[LINK] Office Open XML Ecma Standard
Tom Worthington
Tom.Worthington at tomw.net.au
Mon Dec 11 08:53:16 AEDT 2006
On 7 December 2006, Office Open XML (OpenXML) was adopted as Ecma standard 376
<http://www.ecma-international.org/news/PressReleases/PR_TC45_Dec2006.htm>.
Ecma has also submitted it for fast-track adoption by ISO/IEC JTC 1.
OpenXML is based on Microsoft's Office Open XML and is an adaptation of
Microsoft Office's word-processing, presentation, and spreadsheet
formats to XML. It is similar to Sun Microsystems' OpenOffice format,
which has already been adopted as an ISO standard. There is a useful
overview of Office Open XML
and comparison with OpenOffice in the Wikipedia
<http://en.wikipedia.org/wiki/Microsoft_Office_Open_XML>.
Unfortunately, while Ecma's announcement says the documents can be
downloaded from its web site, I was unable to find the approved
standard 376 in the list
<http://www.ecma-international.org/publications/standards/Standard.htm>.
But presumably the standard is close to the final draft of 9 October
2006
<http://www.ecma-international.org/news/TC45_current_work/TC45-2006-50_final_draft.htm>.
In addition, there is an overview by the Ecma committee
<http://www.ecma-international.org/news/TC45_current_work/OpenXML%20White%20Paper.pdf>.
The standard is divided into five parts:
Part 1 - Fundamentals
Part 2 - Open Packaging Conventions
Part 3 - Primer
Part 4 - Markup Language Reference
Part 5 - Markup Compatibility and Extensibility
The standard is provided in PDF, Tagged PDF and "WordprocessingML"
formats (WordprocessingML is the OpenXML word processing format). The
document is not provided in HTML format as ordinary web pages, which
will severely limit access to it.
Like OpenOffice, OpenXML uses the zip format to bundle up the text of
a document in XML format with any images and other binary files into
a compressed file. As an example the "Fundamentals" section of the
standard in OpenXML format is one 240 kbyte zipped file
<http://www.ecma-international.org/news/TC45_current_work/Office%20Open%20XML%20Part%201%20-%20Fundamentals_final.docx>.
When unzipped it contains 29 files totalling 2.4 Mbytes: three
PNG images and the rest XML. The main text of the document is in one
1.6 Mbyte file ("document.xml"), with various formatting and
references in other small files.
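The file counts above can be checked with a few lines of Python. This is only a minimal sketch using the standard zipfile module; the function name summarize_package and the file name in the usage note are my own, not anything defined by the standard.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def summarize_package(path):
    """Return (file count, total uncompressed bytes, file counts by extension).

    `path` may be a file name or a file-like object holding a zip archive,
    such as a .docx package.
    """
    with zipfile.ZipFile(path) as package:
        # Skip directory entries so that only real files are counted.
        members = [info for info in package.infolist() if not info.is_dir()]
    total_bytes = sum(info.file_size for info in members)
    by_extension = Counter(
        PurePosixPath(info.filename).suffix.lower() for info in members
    )
    return len(members), total_bytes, by_extension
```

Calling summarize_package("part1.docx") on a locally saved copy of the Fundamentals document (the file name is hypothetical) should report the number of files, their total unzipped size, and how many are .xml versus .png.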
Assuming the IT community accepts Microsoft's assurances that it
will continue to make use of the format freely available, it should
prove popular. However, neither OpenXML nor OpenOffice is compatible
with a web browser, and both face their biggest challenge from web
standards. After an author prepares a document using OpenXML or
OpenOffice, they will most likely have to render it in other formats
for distribution, such as PDF and HTML.
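To give a rough idea of what that conversion step involves, here is a sketch that pulls the paragraph text out of an OpenXML word processing file and wraps it in bare XHTML. It rests on several assumptions of mine: that the main text lives at "word/document.xml" inside the package, that paragraphs are "p" elements and text runs are "t" elements, and that matching on local element names (ignoring the namespace) is good enough. All formatting, images, tables and nested text boxes are ignored, so this is nowhere near a real converter, and docx_to_xhtml is a made-up name.

```python
import zipfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on element names."""
    return tag.rsplit("}", 1)[-1]


def docx_to_xhtml(path, title="Untitled"):
    """Return a minimal XHTML string holding the paragraph text of a .docx.

    Assumes the main document part is at word/document.xml (an assumption,
    not something this sketch looks up from the package relationships).
    """
    with zipfile.ZipFile(path) as package:
        root = ET.fromstring(package.read("word/document.xml"))

    paragraphs = []
    for element in root.iter():
        if local_name(element.tag) != "p":
            continue
        # Join the text of every text run found inside this paragraph.
        text = "".join(
            node.text or ""
            for node in element.iter()
            if local_name(node.tag) == "t"
        )
        if text.strip():
            paragraphs.append("<p>" + escape(text) + "</p>")

    return (
        '<html xmlns="http://www.w3.org/1999/xhtml">'
        "<head><title>" + escape(title) + "</title></head>"
        "<body>" + "".join(paragraphs) + "</body></html>"
    )
```

Even this toy version shows the point: the author's file has to be unpacked, parsed and re-expressed before a browser can display it.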
Newer XHTML standards are providing more of the formatting expected
for word processing documents, while providing backward compatibility
with web browsers. A word processor that uses an XHTML format as its
native format would provide the capability of simply saving the
document to the web for distribution. There would be no need to
convert to PDF or HTML. There would also be scope for better
integration with web tools, such as blogs, wikis and feeds.
The creation, promotion and distribution of a new word processing
package was previously a major undertaking. However, AJAX (Web 2.0)
based office packages could quickly render irrelevant the debate over
whether OpenXML or OpenOffice is better, by superseding them both.
Ecma's overview of OpenXML illustrates both the strengths and
weaknesses of both its approach and that of OpenOffice:
"OpenXML was designed from the start to be capable of faithfully
representing the pre-existing corpus of word-processing documents,
presentations, and spreadsheets that are encoded in binary formats
defined by Microsoft Corporation. The standardization process
consisted of mirroring in XML the capabilities required to represent
the existing corpus, extending them, providing detailed
documentation, and enabling interoperability. At the time of writing,
more than 400 million users generate documents in the binary formats,
with estimates exceeding 40 billion documents and billions more being
created each year."
This is a wrong-headed approach to the creation of an electronic
document standard. The priority for word processing documents has
been to reliably produce printed documents which look identical.
However, the production of printed documents is now a very small part
of what a word processor is used for and should not be the priority.
Most documents are used for on-screen electronic viewing. Exact
reproduction of a printed format is exactly what is NOT needed. As a
result word processing documents have to be converted into other
formats for use. As an example, the OpenXML standard is provided in
three formats: PDF for printing, Tagged PDF for on-screen viewing and
WordprocessingML. None of these formats is particularly suitable for
on-screen viewing.
A new approach is needed where the document format is designed for
on-screen viewing with a web browser, and then the additional
features needed for printing are added. This can be done with XHTML.
Tom Worthington FACS HLM tom.worthington at tomw.net.au Ph: 0419 496150
Director, Tomw Communications Pty Ltd ABN: 17 088 714 309
PO Box 13, Belconnen ACT 2617 http://www.tomw.net.au/
Visiting Fellow, ANU Blog: http://www.tomw.net.au/blog/atom.xml