Tech Update 
OPINION

The horror of XML
By Larry Seltzer
October 29, 2002

I don't know about you, but I'm scared. XML is becoming complex, and it was an inherently fat way to represent data to begin with.

In fact, according to the W3C (the keepers of the Web), "XML is verbose, but that is not a problem." This was a conscious decision by XML's designers!

But of course, programmers have always been willing to cheat to get improved performance, and apparently the cheating has begun in the XML world. According to Web analyst group ZapThink, programmers are breaking all the rules in order to overcome that inefficiency.

The promise of interoperability through XML has always been central to the general enthusiasm for it. For the XML Web services vision to work there must be agreement on and adherence to standards. In fact, as ZDNet News pointed out, it's early in the game and there are already a lot of standards to deal with.

Some of the techniques are relatively harmless, such as stuffing lots of attributes into elements (for example, <shirt color="red"/> instead of <shirt><color>red</color></shirt>). This is legal and saves the parser from having to handle several tags, although the data also loses some of its hierarchical structure. Another option is using very short tag names, like <g1> instead of <InventoryUpdateHandlerName>.
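
To make the trade-off concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree module. It parses the two shirt encodings from the paragraph above and compares what each costs in bytes and in structure. The variable names are illustrative only, and byte count is just a rough stand-in for parsing effort.

```python
import xml.etree.ElementTree as ET

# Two encodings of the same fact, taken from the examples in the text.
as_attribute = '<shirt color="red"/>'
as_element = '<shirt><color>red</color></shirt>'

attr_root = ET.fromstring(as_attribute)
elem_root = ET.fromstring(as_element)

# Both forms are well-formed XML and carry the same value.
color_from_attribute = attr_root.get("color")
color_from_element = elem_root.findtext("color")

# The attribute form is smaller, but it has no child elements,
# so the hierarchy has been flattened into a name/value pair.
size_attribute = len(as_attribute.encode("utf-8"))
size_element = len(as_element.encode("utf-8"))
child_count_attribute = len(attr_root)
child_count_element = len(elem_root)

print(color_from_attribute, color_from_element)
print(size_attribute, size_element)
print(child_count_attribute, child_count_element)
```

Any standard parser accepts both documents, which is why this trick stays on the harmless side of the line.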

According to ZapThink senior analyst Ronald Schmelzer, the bigger question is, when is it acceptable to compromise interoperability for efficiency reasons? This happens when programmers rewrite the XML parser (as indeed they can, since there are many open source parsers) to ignore certain standard aspects of the language, thereby speeding up the parsing process.

Even scarier is the emerging practice of rewriting parsers not to require well-formed XML, for example by not requiring end tags. Scary!
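
For contrast, a standards-conforming parser refuses a document with a missing end tag. The sketch below uses Python's standard xml.etree.ElementTree; the helper name is_well_formed is hypothetical, chosen for this illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if a conforming parser accepts the document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # The parser reports the error instead of guessing at intent.
        return False

complete = "<shirt><color>red</color></shirt>"
missing_end_tag = "<shirt><color>red</shirt>"  # </color> was dropped

print(is_well_formed(complete))
print(is_well_formed(missing_end_tag))
```

A document that only a modified parser can read will fail this check everywhere else, which is the interoperability cost in a nutshell.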

This sort of nonsense caused no end of problems for HTML. There's plenty of blame to go around for that debacle, mostly to Netscape, but the situation with XML is different. If XML parsers are not based on standards--hopefully open standards--we'll be stuck with the same miserable situation, where you don't know which browser can properly render which pages.

If I were XML Dictator, of course I would put an immediate stop to this. Sadly, until I'm elected, it will continue, and it's not entirely that simple anyway. Say you're writing software where you control all ends of XML-based communication, and you don't need CDATA sections (blocks of character data that a standard XML parser passes through as literal text rather than interpreting as markup). You might get a significant performance boost by rewriting the parser not to handle them, and in a transactional environment it could make the difference between a fast version of an application and a version that times out. Schmelzer says some programmers are doing this, and it's hard to tell them they are wrong. If it works, it works.
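
For readers who haven't run into CDATA, here is a small sketch of what a standard parser does with it, again using Python's xml.etree.ElementTree. The sample content is made up for illustration. A CDATA section and ordinary escaped text come out as the same string, so a stripped-down parser that drops CDATA support would choke on the first document but not the second.

```python
import xml.etree.ElementTree as ET

# The same text written two ways: inside a CDATA section, where
# "<" and "&" need no escaping, and as ordinary escaped content.
with_cdata = "<note><![CDATA[if a < b && b > c]]></note>"
with_escapes = "<note>if a &lt; b &amp;&amp; b &gt; c</note>"

text_from_cdata = ET.fromstring(with_cdata).text
text_from_escapes = ET.fromstring(with_escapes).text

print(text_from_cdata)
print(text_from_cdata == text_from_escapes)
```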

On the other hand, if you're compromising the interoperability of XML, you're effectively using a proprietary format, so you've got one less very big reason to use XML in the first place. Why not use CORBA or COM or some other binary interface that's far more efficient to begin with?

So the real question should be how the inefficiency problem can be addressed in a way that doesn't compromise the goals of interoperability and transparency in XML. XML is inefficient in three major ways: It uses lots of bandwidth, lots of storage, and lots of processing power.

My first guess was that bandwidth was the most significant problem, and that the obvious solution was a standard for XML compression (XML should compress very efficiently). A standard for compression could address both the bandwidth and storage inefficiencies, although storage is about as expensive as air these days, so that's got to be the problem with lowest priority. The standardization could be tricky though, since it's not clear how a standard for compressed XML would be implemented. Remember, XML can be transported over any number of protocols, from HTTP to SMTP to sneakernet. I can imagine ways it could be implemented in network protocols, perhaps through HTTP headers, addressing only the bandwidth part of the equation.
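
As a rough illustration of how well repetitive XML compresses, the sketch below builds a hypothetical inventory document (the tag names are invented for this example) and runs it through gzip from Python's standard library. It is not a proposal for what an XML compression standard would look like. It only shows that general-purpose compression, of the kind HTTP can already signal with its Content-Encoding header, shrinks this sort of markup and returns exactly the same bytes to a standard parser.

```python
import gzip
import xml.etree.ElementTree as ET

# Build a hypothetical inventory document with many repeated tag names.
root = ET.Element("InventoryUpdate")
for i in range(200):
    item = ET.SubElement(root, "InventoryItem")
    ET.SubElement(item, "ItemIdentifier").text = str(i)
    ET.SubElement(item, "ItemColor").text = "red"

raw = ET.tostring(root, encoding="utf-8")

# Compress for transport or storage, then restore on the other side.
compressed = gzip.compress(raw)
restored = gzip.decompress(compressed)

# The receiver still gets standard XML that any parser can read.
reparsed = ET.fromstring(restored)

print(len(raw), len(compressed))
print(len(reparsed))
```

The exact ratio depends on the data, but because the markup itself is untouched, nothing here compromises interoperability.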

But as Schmelzer says, the processing issues can be significant. He has come across cases where an increase in efficiency could eliminate the need for a $50,000 application server. That's decent money. Some vendors have even begun offering hardware acceleration for XML processing, such as DataPower's XA35 XML Accelerator. Remind me to take a closer look at this sort of thing.

How would the compression parts of the solution be standardized? This may be too urgent a matter to leave in the hands of the W3C. Schmelzer feels that the WS-I may be a more expedient place for an agreement, if not a standard, to emerge. An actual standard is not needed immediately, merely something that everyone agrees on. It could be turned into an actual standard later.

So before the horror of XML anarchy and balkanization strikes, save yourselves! XML cheating is a trap from which you will never escape! Bwahahahahahhah!

Can XML's inefficiency problem be addressed in a way that doesn't compromise interoperability and transparency? TalkBack below or e-mail us with your thoughts.