Microsoft and ECMA are now two-thirds of the way through dealing with the 3,500 comments they received from the national standards bodies in response to their submission of the Office 2007 file formats to ISO/IEC for standardisation. It appears they’re finally going to do something concrete about the perceived flaws in the formats rather than just argue the toss and tinker with the wording, which is all that seems to have happened until now.
Spreadsheet applications will be allowed to store dates in ISO 8601 format (for example, “2008-01-01”) rather than only as the number of days since 01/01/1900, a change that may slow down the loading and saving of files somewhat. The change will also pave the way toward dealing with dates before 01/01/1900 and with the troublesome leap-year bug, whereby 1900 is wrongly treated as a leap year. Exactly how the leap-year bug will be eliminated hasn’t yet been made public – Microsoft has previously said that this couldn’t be done without causing some people’s workbooks to break, but, having accepted the need for the change, it will now need to work to minimise the inconvenience to users.
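To make the date change concrete, here is a minimal Python sketch of converting a serial day count into an ISO 8601 string. It assumes the numbering of Excel's 1900 date system, in which serial 1 is 1 January 1900 and serial 60 is the non-existent 29 February 1900. The function name `serial_to_iso` is hypothetical and not part of OOXML or any library, and the sketch illustrates the bug rather than whatever fix Microsoft eventually adopts.

```python
from datetime import date, timedelta

# Assumption: serial 1 corresponds to 1 January 1900 (1900 date system).
EPOCH = date(1900, 1, 1)
# Assumption: serial 60 is the phantom 29 February 1900 created by the bug.
PHANTOM_SERIAL = 60


def serial_to_iso(serial: int) -> str:
    """Convert a whole-day serial number to an ISO 8601 date string."""
    if serial < 1:
        # Dates before 01/01/1900 cannot be expressed as a positive serial.
        raise ValueError("serial dates start at 1 (1900-01-01)")
    if serial == PHANTOM_SERIAL:
        # Not a real calendar date, so it is returned as literal text only.
        return "1900-02-29"
    # Serials after the phantom day are one higher than the true day count,
    # so an extra day is subtracted to compensate.
    offset = serial - 1 if serial < PHANTOM_SERIAL else serial - 2
    return (EPOCH + timedelta(days=offset)).isoformat()


if __name__ == "__main__":
    for n in (1, 59, 60, 61, 39448):
        print(n, serial_to_iso(n))
```

Under these assumptions serial 39448 comes out as 2008-01-01, the example date quoted above. The special case for serial 60 shows why the bug is awkward to remove: every later serial number in existing workbooks already includes the extra day.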
Languages in OOXML files will change from being specified using a restricted set of integer values to using ISO 639 language codes such as en, de and fr, and ISO 15924 codes for the accompanying scripts, such as Arab, Cher and Latn. This will enable applications to distinguish between, for instance, a Serbian-language file written in Cyrillic script (sr-Cyrl) and one written in Latin script (sr-Latn). Currently, OOXML files support a fixed range of possible page borders, but ECMA proposes to open this up to allow custom page borders. Finally, weeks will be able to begin on any day, not just on Sunday or Monday.
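The language and script tagging described above can be sketched in a few lines of Python. This is only an illustration: the `LanguageTag` type and `parse_language_tag` function are hypothetical names, the parser handles just a language code with an optional four-letter script code, and it does not check the codes against the actual ISO 639 or ISO 15924 registries.

```python
from typing import NamedTuple, Optional


class LanguageTag(NamedTuple):
    language: str           # ISO 639 style code, e.g. "sr"
    script: Optional[str]   # ISO 15924 style code, e.g. "Cyrl", or None


def parse_language_tag(tag: str) -> LanguageTag:
    """Split a tag such as 'sr-cyrl' into its language and script parts."""
    parts = tag.split("-")
    language = parts[0].lower()
    # Shape check only: two or three ASCII letters for the language.
    if not (language.isascii() and language.isalpha() and 2 <= len(language) <= 3):
        raise ValueError(f"unexpected language code: {parts[0]!r}")
    script = None
    if len(parts) > 1:
        candidate = parts[1]
        # Script codes are four letters, conventionally written in title case.
        if len(candidate) == 4 and candidate.isascii() and candidate.isalpha():
            script = candidate.title()
    return LanguageTag(language, script)


if __name__ == "__main__":
    cyrillic = parse_language_tag("sr-cyrl")
    latin = parse_language_tag("sr-latn")
    # Same language, different script: something one integer ID per
    # language cannot express.
    print(cyrillic, latin, cyrillic.language == latin.language)
```

The sketch normalises case, so "sr-cyrl" and "sr-Cyrl" parse to the same value, while the two Serbian variants stay distinguishable by their script field.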
ECMA and Microsoft will translate the definitions of “advanced functionality” such as spreadsheet formulas and word processing fields into the ISO/IEC 14977 “syntactic metalanguage” (Extended BNF) to aid understanding and improve the ability of implementers to test conformance. They’ll also split the specification into different sections to make it clearer which parts are mandatory, which are optional and which are deprecated. Deprecated features, such as VML (Vector Markup Language), will be moved into an annex, and any new documents produced in the OOXML format must not contain such features. From now on, deprecated features and various compatibility settings such as AutoSpaceLikeWord95 should be used only in documents converted from previous formats. Microsoft will also fully document the compatibility settings, stating clearly what behaviour is expected in each case. Previously, implementers had been left to examine the old software themselves and try to work out what each setting did.
All these changes, and possibly many more by the time all the comments are dealt with, will leave Microsoft with a considerable headache if the OOXML file formats are accepted as an ISO/IEC standard at the Ballot Resolution Meeting in February. Office 2007 itself will not comply with the standard and hence will have to be patched: how long it will take to release such a patch (and to patch the compatibility pack for Office 2003, XP and 2000 – assuming this is actually possible) is anyone’s guess. Microsoft may choose to wait until Office “14” is released, probably some time in 2009.
Meanwhile, Sun, IBM, OpenOffice and OASIS have the opposite problem, as ISO accepted their ODF file formats as a standard (ISO/IEC 26300) in May 2006. Since then, OASIS has published version 1.1 and is working on 1.2, which will finally define the formulas in spreadsheets, but there’s no word yet about when it will submit a new version to ISO. So while OpenOffice, IBM Lotus Symphony and other suites now write ODF 1.1 files, those files no longer conform to the ISO standard, which is stuck at version 1.0.