Is There an (X)Doc in the House?
XDocs will likely breathe new life into Microsoft Office.
by Kurt Cagle

Posted December 11, 2002

The initial feeding frenzy brought on by Microsoft's announcement of a new XML editor for the next generation of Microsoft Office, not unexpectedly code-named XDocs, has passed. Now that the dust has settled, let's see what has actually been wrought.

An XML version of the Word, Excel, and PowerPoint formats is an idea that people in the XML community have long kicked around: "Wouldn't it be great if the Word format could actually be manipulated in XML? I wish I could get my Excel data in XML; isn't there some easier way?"

Where there is a problem, there is a market for solving that problem. But it's interesting that in most cases the solutions emanated from the open source Linux community, desperate to one-up Microsoft's Office suite with OSS tools. OpenOffice.org 1.0 (which shares its code base with Sun's StarOffice 6.0) makes a reasonably good attempt at this by saving documents in a native SXW format, which not coincidentally happens to be a ZIPped collection of XML documents. Such a format has been a real boon to developers, Java users especially, who build integrated document management systems around XML.
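To make the "ZIPped collection of XML documents" idea concrete, here is a minimal Python sketch that opens such a package and reports the root element of each XML member. The package is built in memory as a stand-in; it is not a real SXW file, and the member contents are invented purely for illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def list_xml_members(package_bytes):
    """Return {member name: root tag} for every XML file in a ZIP package."""
    roots = {}
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        for name in package.namelist():
            if name.endswith(".xml"):
                roots[name] = ET.fromstring(package.read(name)).tag
    return roots


def build_sample_package():
    """Build a tiny stand-in package in memory (assumed layout, not real SXW)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("content.xml", "<document><p>Hello</p></document>")
        package.writestr("styles.xml", "<styles/>")
        # A binary graphic travels as a separate, non-XML member.
        package.writestr("Pictures/logo.png", b"\x89PNG placeholder bytes")
    return buffer.getvalue()


sample = build_sample_package()
print(list_xml_members(sample))
```

Because the package is an ordinary ZIP archive, any language with ZIP and XML libraries can take the same approach, which is much of the format's appeal for document management work.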

However, this approach has more than a few limitations. The documents so created conform to a fairly complex descriptive schema that doesn't necessarily carry much semantic data about the contents, but rather provides more of a page-layout architecture. Moreover, binary graphics are not part of the XML itself; they are stored as separate files within the same SXW package.

So the bar that Microsoft needs to clear is a format in which the user has little control over the schema itself. That the company has not already cleared it is surprising, given its other pushes with XML.

From Application to Document
XDocs is supposed to change all that. Technically speaking, XDocs refers not to the format itself, but rather to the application used to create and manipulate this XML format. In essence, the XDocs editor will likely end up superseding Microsoft Word, and will offer much the same functionality that Word does. You will create documents just as you did before, except that you will (presumably) be able to specify the underlying schema to which specific word-processing elements get mapped. This means you could create a document based on its semantics (the meaning of each style or tag) and still be able to handle much of the presentation-layer work that has proved to be problematic when working with an XML editor.
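The idea of mapping word-processing elements onto a schema can be sketched in a few lines of Python. Nothing here reflects how XDocs actually works; the style names, element names, and the STYLE_TO_ELEMENT table are all hypothetical, chosen only to show styled paragraphs becoming semantically named XML elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from word-processor style names to schema elements.
STYLE_TO_ELEMENT = {
    "Heading 1": "title",
    "Author": "byline",
    "Body Text": "para",
}


def paragraphs_to_xml(paragraphs, root_name="article"):
    """Turn (style, text) pairs into XML using the style-to-element mapping."""
    root = ET.Element(root_name)
    for style, text in paragraphs:
        # Styles without a mapping fall back to a generic paragraph element.
        element = ET.SubElement(root, STYLE_TO_ELEMENT.get(style, "para"))
        element.text = text
    return ET.tostring(root, encoding="unicode")


styled = [
    ("Heading 1", "Is There an (X)Doc in the House?"),
    ("Author", "Kurt Cagle"),
    ("Body Text", "XDocs is supposed to change all that."),
]
print(paragraphs_to_xml(styled))
```

The writer works only with familiar styles, while the mapping table decides which schema element each style produces; swapping the table changes the output vocabulary without touching the prose.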

Creating semi-structured XML has never been easy, precisely because XML by itself contains no semantics; rather, it provides a layer of markup, to which semantics can be attached, on top of what is largely an ASCII (or Unicode) file. By requiring that writers mark up their prose according to an arcane and often difficult-to-read XML format (the infamous angle brackets and quotation marks that make up the base XML syntax), any IT administrator is asking for a headache, especially when the resulting files are legible but not necessarily semantically useful. Most people, even XML experts, find applying such markup to a document by hand both tedious and error-prone, and a semantic editor could significantly boost the appeal of such semi-structured data formats without necessarily requiring a large infrastructure investment.

According to the news releases, the XDocs application is supposed to be able to save a Word(-like) set of documents, including graphics, forms data, tables, and so forth, as a single XML document. This is not actually all that far-fetched, if you assume that binary graphics formats are encapsulated in some type of encoding scheme (MIME, DIME, or some derivative thereof). The XSL-Formatting Objects (XSL-FO) specification takes a similar approach, with vector graphics encoded as SVG, and with metadata potentially encoded as RDF. Indeed, one characteristic of XML is that any set of XML documents can be represented as a single super-document, with the appropriate namespaces designating the various types of tags.
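A short Python sketch shows why a single-document approach is plausible. It assumes Base64 (the text encoding commonly used with MIME) as the encoding scheme and uses two made-up namespace URIs to keep the text and image vocabularies apart; the element and attribute names are likewise hypothetical and say nothing about the actual XDocs format.

```python
import base64
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, chosen only for this illustration.
DOC_NS = "urn:example:doc"
IMG_NS = "urn:example:image"

ET.register_namespace("doc", DOC_NS)
ET.register_namespace("img", IMG_NS)


def build_document(text, image_bytes, media_type):
    """Combine prose and a binary graphic into one namespaced XML document."""
    root = ET.Element(f"{{{DOC_NS}}}document")
    para = ET.SubElement(root, f"{{{DOC_NS}}}para")
    para.text = text
    image = ET.SubElement(
        root,
        f"{{{IMG_NS}}}image",
        {"mediaType": media_type, "encoding": "base64"},
    )
    # Base64 turns arbitrary bytes into characters that are safe inside XML.
    image.text = base64.b64encode(image_bytes).decode("ascii")
    return ET.tostring(root, encoding="unicode")


def extract_image(xml_text):
    """Recover the original bytes of the embedded graphic."""
    root = ET.fromstring(xml_text)
    image = root.find(f"{{{IMG_NS}}}image")
    return base64.b64decode(image.text)


raw = bytes(range(16))  # stand-in for real image data
combined = build_document("A picture follows.", raw, "image/png")
print(combined)
print(extract_image(combined) == raw)
```

The trade-off is size: Base64 inflates binary data by roughly a third, which is one reason package formats such as SXW keep graphics as separate files instead.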
