Using XML framing in XMPP was the opposite of simplicity. (Sure, it was simple in the sense that it required zero experience with actually implementing protocols to arrive at that conclusion, but the result was something that is harder to implement properly.)
It is "simple" for moronic, bad implementations of the protocol, but it only complicates the situation when you need performance, quality and efficiency. It complicates it greatly. In essence, you end up having to write two parsers: a shallow framing parser that only figures out where each stanza starts and ends in the stream, and a deep parser that actually interprets the stanza.
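To make the two-parser point concrete, here is a rough sketch of what the shallow half might look like in Python. This is not taken from any real XMPP server; `StanzaFramer` and its method names are made up for illustration. It only tracks element depth and hands back each complete child of the stream root as raw bytes. It assumes the peer never sends comments, CDATA sections or DTDs, and it puts no limit on buffer size, both of which a real implementation facing dodgy clients would have to deal with.

```python
from typing import List

LT, GT = 0x3C, 0x3E    # '<' and '>'
QUOTES = (0x22, 0x27)  # '"' and "'"


class StanzaFramer:
    """Hypothetical shallow framer: splits a byte stream into top-level stanzas."""

    def __init__(self) -> None:
        self.depth = 0          # nesting depth; inside the stream root it is 1
        self.in_tag = False     # True between '<' and the matching '>'
        self.quote = 0          # active attribute quote byte, 0 when outside quotes
        self.tag = bytearray()  # bytes of the tag currently being read
        self.buf = bytearray()  # bytes of the stanza currently being collected

    def feed(self, data: bytes) -> List[bytes]:
        """Consume a chunk of any size; return the stanzas it completed."""
        done: List[bytes] = []
        for b in data:
            if not self.in_tag:
                if b == LT:
                    self.in_tag = True
                    self.tag = bytearray(b"<")
                elif self.depth >= 2:
                    self.buf.append(b)  # text inside a stanza
                continue                # text between stanzas is dropped
            self.tag.append(b)
            if self.quote:
                if b == self.quote:     # closing quote of an attribute value
                    self.quote = 0
            elif b in QUOTES:
                self.quote = b          # a '>' inside quotes does not end the tag
            elif b == GT:
                self.in_tag = False
                self._end_of_tag(done)
        return done

    def _end_of_tag(self, done: List[bytes]) -> None:
        tag = self.tag
        if tag.startswith(b"<?"):       # XML declaration: skip it
            return
        if tag.startswith(b"</"):       # closing tag
            if self.depth >= 2:
                self.buf += tag
                self.depth -= 1
                if self.depth == 1:     # the stanza just closed
                    done.append(bytes(self.buf))
                    self.buf.clear()
            elif self.depth == 1:       # the stream root itself closed
                self.depth = 0
        elif tag.endswith(b"/>"):       # self-closing tag, depth unchanged
            if self.depth == 1:
                done.append(bytes(tag))
            elif self.depth >= 2:
                self.buf += tag
        else:                           # opening tag
            if self.depth >= 1:
                self.buf += tag
            self.depth += 1
```

Each returned chunk still has to go through the deep parser afterwards, and that step has its own traps: a stanza cut out of the stream no longer carries the namespace prefixes declared on the stream root, so handing it to an ordinary document parser as-is will not always work.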
And if you think you are going to get any help from the fact that there are lots of XML parsers: I've got bad news for you. There are lots of XML parsers, but they are meant to parse documents, not millions of simultaneous, "endless" streams of data from dodgy clients.
XMPP is not good protocol design. It is brutally stupid protocol design.
(An irony is that right now there are several areas where you would want to use a messaging protocol for small devices. This ought to be the moment where a messaging standard has its chance to shine. And XMPP ends up being one of the least desirable protocols because so little care was taken in designing it.)