I've been reading a lot of REST vs SOAP falderal lately and it's
getting tiresome. Well, some of it is interesting, like looking at
whether Bloglines is REST. Anyway, I thought I'd point out that the
cowman and the farmer can be friends, at least when we're both
discussing the smell of the fertilizer. So, four dumb things about XML
as a wire format for distributed systems:
- XML is text. Numbers have to be encoded as base-10 strings, so every
number gets formatted on the way out and parsed on the way back in.
This is terribly slow and inefficient.
- XML can't handle binary data. There is no reasonable way to embed
a 400k image into your XML. Your choices are to base64-encode it
(whee!) or use some wrapper around the XML like MIME.
- XML is awfully wordy. I don't begrudge the meaningful beginning
tag names and the pointy brackets, but the meaningful closing tag
names are superfluous if all your XML is machine generated.
- XML is complicated. We love XML because it's S-expressions, but
it's a lot more too. Entities! PIs (whatever those are). Attributes
vs. elements! Three different ways to describe the data model! Awful
programming models! It's awfully complicated when what you're trying
to do is pass a couple of numbers and a string.
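
To put rough numbers on the first two complaints, here's a minimal Python sketch. It's illustrative only: the helper names and the `price` tag are made up for this example, I'm taking "400k" to mean 400,000 bytes, and I'm assuming the binary alternative for a number is a plain 8-byte IEEE double.

```python
import base64
import struct


def base64_size(n_bytes: int) -> int:
    """Length of the base64 text needed to carry n_bytes of binary data."""
    # The content of the bytes doesn't affect the encoded length,
    # so zero-filled filler stands in for the image.
    return len(base64.b64encode(bytes(n_bytes)))


def xml_number_size(tag: str, value: float) -> int:
    """Bytes on the wire for one number wrapped in an XML element."""
    return len(f"<{tag}>{value!r}</{tag}>".encode("utf-8"))


def binary_number_size() -> int:
    """Bytes for the same number as a little-endian 8-byte double."""
    return struct.calcsize("<d")


if __name__ == "__main__":
    image_bytes = 400_000  # assumption: "400k" read as 400,000 bytes
    encoded = base64_size(image_bytes)
    print(f"image: {image_bytes} bytes raw, {encoded} bytes as base64")

    # "price" is a hypothetical tag name, chosen only for illustration.
    print(f"number as XML: {xml_number_size('price', 3.14159)} bytes")
    print(f"number as binary double: {binary_number_size()} bytes")
```

Base64 turns every 3 bytes into 4 characters, so the image grows by about a third before you've even added the surrounding tags. The number comparison measures size only; the formatting and parsing time on each end is extra.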
The roots of XML are SGML, a hand-edited markup language for writing
documents. I think it's clever that it's been repurposed for
distributed systems, particularly since a human can easily read the
packets without translation. And I really like the idea of
document-oriented web services (whether SOAP or REST). But it'd sure
be nice if the document were more friendly to computer data.