He taught an entire course on XML, which he calls a "great meta-example on how to deal with semi-structured data"? And his only defense of XML over JSON is... it's worked ok for some file formats?
The only point in this whole article is that XML is not well-suited for RPCs, though he fails to argue that it's well-suited for anything else.
One argument is that XML is better than JSON for use cases like XHTML, where you heavily mix tags and content. I get the feeling XML wasn't really made for this case, though, it was made for the JSON-like case. Processing XHTML with E4X (the ECMAScript for XML standard) is painful, and XML libraries in general assume your document basically consists of a tree of tags, maybe with text nodes at the leaves.
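To make the "tree of tags, maybe with text at the leaves" point concrete, here's a minimal sketch using Python's standard `xml.etree` (my example, not from the thread): mixed content doesn't appear as ordinary child nodes, but gets split across the parent's `.text` and each child's `.tail` attribute.

```python
# Mixed content (tags and prose interleaved) is awkward in tree-oriented
# XML libraries: the text is scattered across the parent's .text and each
# child's .tail rather than living in the tree as regular nodes.
import xml.etree.ElementTree as ET

p = ET.fromstring('<p>Then help me translate this into <b>XML</b> please</p>')

print(p.text)     # text before the first child: 'Then help me translate this into '
print(p[0].text)  # text inside <b>: 'XML'
print(p[0].tail)  # text after </b>: ' please'
```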
I was expecting some argument invoking the power of DTDs and XSLT or whatever else, or the original point of XML that people overlook, and all I got was an extremely weak defense of XML from someone who taught a whole course on it.
Back in 1997, XML was "SGML for the Web." It was a way to pass around structured, plain-text, human-readable documents that did not require expensive, buggy, incomplete parsers.
It then got misapplied as an RPC transport encoding, and tools vendors were more than happy to start pushing specs, such as W3C Schemas, that demanded the use of tools.
It started out simple but, as these things go, got hijacked. The fault, though, lies with the misapplication, not with XML itself.
Sadly, some sensible early formats were left behind. XML-RPC's serialization is a bit verbose but otherwise quite similar to JSON. Somehow that got turned into SOAP and then eventually the WS-* tar pit of complexity.
Likewise, XML as a configuration file language can be quite elegant, almost like a literate-programming version of common .ini or .conf files. But instead of a simple flat document of variables, XML config files in the wild end up with deeply nested structure that adds dubious value and makes the files far less human-friendly.
XML itself, with the possible exception of namespaces and a few other features, is quite simple. I totally agree it's the applications that have gotten out of hand, particularly in areas where XML is used as structured data exchange rather than document markup.
Now that we have JSON, there is no longer any excuse.
XML is good for exactly what it stands for: an extensible markup language. It's good for dealing with semi-structured data, especially when you have to deal with data from multiple domains.
Have you ever used SGML (other than HTML)? If so, then you'd likely agree that XML is a superior standard. But I'm guessing that you have not, because for some reason you believe that XML was created for data serialization.
DTDs and XSLT _are_ useful aspects of XML, and I doubt the author is unaware of them. Rather, the author assumed too much of the readers' understanding of the history of XML and the nature of semi-structured data.
How about this:
{"tag": "p", "class": "content", "text": ["Then help me translate this into ", {"tag": "span", "class": "highlight", "text": ["JSON"]}, " please"]}
And that's just a small example where you can see all the start and end tags on one screen. Now change the example to insert a hyperlink, say, around the word "help". How easy is it to change?
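To see just how easy (or not), here's a Python sketch of that edit, using plain dicts and lists to stand in for the JSON above (the `"#"` href is a placeholder, not from the thread): you have to split the text string and splice a new node into the list, where in XML source you would simply type `<a href="...">help</a>` inline.

```python
# Wrapping one word of a text node in a link means splitting the string
# and splicing a new object into the list. (Layout mirrors the JSON above.)
node = {"tag": "p", "class": "content",
        "text": ["Then help me translate this into ",
                 {"tag": "span", "class": "highlight", "text": ["JSON"]},
                 " please"]}

first = node["text"][0]
before, _, after = first.partition("help")
link = {"tag": "a", "href": "#", "text": ["help"]}  # "#" is a placeholder href
node["text"][0:1] = [before, link, after]           # replace one item with three

print(node["text"][:3])
# ['Then ', {'tag': 'a', 'href': '#', 'text': ['help']}, ' me translate this into ']
```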
Yes, this is true. The point of using XML is when you have data where you know the structure of some parts, but not others. This is true of most things that begin life as prose, and then have some structure added to them later. It is a point between "bag of words" information retrieval, and SQL queries, that requires a different approach.
"I get the feeling XML wasn't really made for this case, though, it was made for the JSON-like case."
No, this is false. XML is awful for the JSON like case. What would make you think that XML was created for it?
You could argue that XML documents are complex and cannot be described as simple comma-separated values. Maybe, but so many XML documents are just there to store simple key/value data.
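A sketch of what that common case looks like (my own illustrative document, not from the thread): a flat XML file that is really just a key/value table, which one loop over the children flattens into a dict.

```python
# Many real-world XML files are flat key/value stores dressed up in
# angle brackets; a single comprehension recovers the table.
import xml.etree.ElementTree as ET

doc = """<config>
  <host>example.com</host>
  <port>8080</port>
  <debug>false</debug>
</config>"""

settings = {child.tag: child.text for child in ET.fromstring(doc)}
print(settings)  # {'host': 'example.com', 'port': '8080', 'debug': 'false'}
```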
And now, we have "jsawk" (https://github.com/micha/jsawk) for parsing JSON under your terminal...
That's not a bad thing. The UNIX philosophy encourages you to avoid those things if you don't need them. It's very powerful. But when you actually, factually need them, you're not going to get very far with UNIX tools. That's OK; it is neither an indictment of UNIX nor of the data. Different tools are called for.
the only real complaint I have is that xsl, being itself xml, is pretty verbose and can be tedious to write.
also the whole "using xml to define a transformation on some other xml" thing is so overly meta as to induce a massive brain hemorrhage out of my nose and all over my desk.
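For instance, the classic XSLT "identity transform" — a stylesheet that copies its input through unchanged — already takes this much XML:

```xml
<!-- The XSLT 1.0 identity transform: copy input to output unchanged.
     Even a no-op needs a stylesheet element, a namespace declaration,
     a template, and an explicit copy/apply-templates pair. -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```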
That's laws of natural selection at work.
I know a guy who deployed a Java application on servers with 64MB of memory, and he did it back before the JIT compiler was any good. It was performant and got the job done. He's not unique: lots of performant Java applications were built on hardware that was tiny compared to today's hardware. But for some reasonable meaning of "everybody," everybody writes horrible bloated Java code that requires costly hardware to run.
I've used simple, practical XML web services -- in fact, we have several running at work, and when adding or changing functionality, dealing with the XML aspect is a rounding error compared to implementing the application logic. But for some reasonable meaning of "every," everybody writing enterprise XML web services creates overengineered, overcomplex, finicky interfaces that require ongoing error-prone tweaking of DOM or SAX code.
Sometimes when everybody's getting it wrong, that just means "it" has proved irresistible to stupid people and PHBs. It doesn't mean a sensible, tasteful engineer won't be able to use it correctly. Ditching a technology because stupid people love to misuse it may be a good fashion choice, and it may be a good way to influence hiring if you don't have more direct influence, but there's no engineering justification for it.
And don't forget that for some reasonable meaning of "everybody," everybody who has tried Lisp programming has become horribly lost and failed to accomplish anything with it. (This may be less true since Lisp is rarely taught in colleges nowadays, but it was true at some point in time.)
http://searchyc.com/submissions/xml?sort=by_date
First result.
I think one of the people who best understood XML was Erik Naggum; at least, few have explained it as eloquently: