Do you have any further reading or links for this? SIP is definitely "interesting" (not in a good way) and I'd love to have some background on why it is the way it is.
Anyways, because they thought UDP could not exceed the Ethernet MTU, they came up with this crazy protocol-hopping behavior. So on a per-message basis, a SIP server or client can just flip to TCP or UDP, just for fun. This idiocy is what made MS just drop UDP support in their SIP products, because "most of our messages will be over the MTU and compliant implementations will hop to TCP". But popular stacks such as Asterisk had baked UDP assumptions in pretty deep.
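For anyone who hasn't run into the hopping rule, here is a rough sketch of the size check as I read RFC 3261 section 18.1.1: use a congestion-controlled transport such as TCP if the request is within 200 bytes of the path MTU, or over 1300 bytes when the path MTU is unknown. The function name `pick_transport` and the exact boundary comparison are my own assumptions, not anything from a real stack.

```python
from typing import Optional

MTU_MARGIN = 200          # headroom required below a known path MTU
UNKNOWN_MTU_LIMIT = 1300  # size cap when the path MTU is unknown


def pick_transport(message: bytes, path_mtu: Optional[int] = None) -> str:
    """Hypothetical helper: choose a transport for one outgoing request."""
    size = len(message)
    if path_mtu is None:
        # Unknown path MTU: anything over 1300 bytes hops to TCP.
        return "TCP" if size > UNKNOWN_MTU_LIMIT else "UDP"
    # Known path MTU: hop to TCP when within 200 bytes of it
    # (treating "within" as a strict comparison is an assumption).
    return "TCP" if size > path_mtu - MTU_MARGIN else "UDP"


small_register = b"x" * 400
big_invite = b"x" * 1400
print(pick_transport(small_register), pick_transport(big_invite))
```

The point is that the decision is made per message, so two requests in the same dialog can legitimately go out over different transports.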
To make it even more fun, SIP over TCP and SIP over UDP differ. In UDP, they decided Content-Length isn't needed since you can just use the datagram boundary. So you get this situation where the validity of their text-based message depends on which transport it was delivered on.
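A minimal sketch of what that means for a parser, assuming the RFC 3261 rule that Content-Length is mandatory on stream transports and optional on UDP. `body_from_message` is a made-up name, and header folding and other corner cases are ignored.

```python
def body_from_message(raw: bytes, transport: str) -> bytes:
    """Hypothetical helper: extract the body of one SIP message."""
    head, sep, rest = raw.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("no blank line ending the headers")

    length = None
    for line in head.split(b"\r\n")[1:]:  # skip the start line
        name, _, value = line.partition(b":")
        # "l" is the compact form of Content-Length.
        if name.strip().lower() in (b"content-length", b"l"):
            length = int(value.strip())

    if length is not None:
        return rest[:length]
    if transport.upper() == "UDP":
        # No Content-Length: the body runs to the end of the datagram.
        return rest
    raise ValueError("Content-Length is required on stream transports")


msg = (b"MESSAGE sip:bob@example.com SIP/2.0\r\n"
       b"To: <sip:bob@example.com>\r\n"
       b"\r\n"
       b"hello")
print(body_from_message(msg, "UDP"))
```

The same bytes that parse fine as a datagram raise an error when they arrive on a TCP connection.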
The sick thing is that the authors of these RFCs take delight in the pointless complexity they've added. Look up SIP torture tests RFC just to see how demented they are.
In their defense, one of them wrote me and said they had started off thinking SIP would be totally HTTP compatible. After they dropped that idea, they left stuff in because it was "too late". While I understand life's tough, this explanation doesn't do much for my confidence in their decisions. Another one wrote and said "hey, C and other programming languages have flexible syntax, why not SIP?". He wasn't joking :/. But hey, one of them is now the CTO of the FCC, so writing terrible specs must be a good career move.
Worse, many people see "text-based" and think it's easier than a proper binary protocol. Many shitty coders have pumped out bad SIP implementations that sorta work. But in fact, you cannot unambiguously parse SIP on the Internet today. Popular implementations disagree on basic things such as line endings. This opens up security holes, as you can make one network element read a message one way, knowing the next element will interpret it differently.
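To make the line-ending problem concrete, here is a toy sketch with two hypothetical parsers: one only accepts CRLF as a line terminator, the other also accepts a bare LF. Neither is modeled on a specific real stack, and the message and addresses are invented for illustration.

```python
import re


def _to_dict(lines):
    """Turn 'Name: value' lines into a dict keyed by lowercased name."""
    out = {}
    for line in lines:
        name, _, value = line.partition(b":")
        out[name.strip().lower().decode()] = value.strip().decode()
    return out


def headers_strict(raw: bytes) -> dict:
    """Hypothetical element that treats only CRLF as a line ending."""
    head = raw.split(b"\r\n\r\n", 1)[0]
    return _to_dict(head.split(b"\r\n")[1:])


def headers_lenient(raw: bytes) -> dict:
    """Hypothetical element that also accepts a bare LF."""
    head = re.split(rb"\r?\n\r?\n", raw, maxsplit=1)[0]
    return _to_dict(re.split(rb"\r?\n", head)[1:])


# Note the bare "\n" hidden inside the Subject header.
raw = (b"INVITE sip:bob@example.com SIP/2.0\r\n"
       b"Subject: hi\nContact: <sip:mallory@example.net>\r\n"
       b"To: <sip:bob@example.com>\r\n"
       b"\r\n")

print(sorted(headers_strict(raw)))   # no "contact" key here
print(sorted(headers_lenient(raw)))  # "contact" shows up here
```

If the strict parser is the filtering proxy and the lenient one is the endpoint behind it, the Contact header goes through without ever being inspected.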
And as you point out, the limit is quite low in such cases. So it'd be better to just drop UDP (for many reasons), versus adding protocol hopping and whatnot.
...
Wait, NIST wrote the reference implementation? Also, your description sounds suspiciously like someone was going for maximum confusion and a "design" that practically guarantees bugs in every implementation.
This stinks of BULLRUN-style sabotage.
And a lot of this stuff is the result of carrying over design decisions, literally, from the 60s. Someone wrote an RFC codifying what they were doing by hand. And that nonsense gets passed forward for zero reason.
Ever wonder why HTTP uses the idiotic "Thu, 28 Jan, 1998, GMT" format for datetime? Because that format makes sense if you're hand-reading email message headers. And HTTP just copied it forward, zero thinking involved. Probably the same reason you can have comments in headers. SIP gives an example:
Retry-After: 300 (I'm in a meeting)
Real-world engineers look at that and say, wow, that's idiotic. Comments will never be used, so they'll never be properly implemented and only serve to make things problematic. To an IETF RFC author, that kind of behaviour is "well RFC xxx does it that way", and ultimately comes to "well, like, if you were sending HTTP messages by fax, extra comments might help".
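To show what "properly implemented" would involve, here is a small sketch of pulling comments out of a header value. As far as I recall, the RFC 3261 grammar lets comments nest and contain backslash-escaped characters, so a simple regex is not enough. `split_comments` is a hypothetical helper, and it ignores header parameters and quoted strings outside comments.

```python
def split_comments(value: str):
    """Return (text outside comments, list of top-level comment bodies)."""
    plain, comments, current = [], [], []
    depth = 0
    i = 0
    while i < len(value):
        ch = value[i]
        if depth and ch == "\\" and i + 1 < len(value):
            # Quoted-pair inside a comment: take the next char literally.
            current.append(value[i + 1])
            i += 2
            continue
        if ch == "(":
            if depth:
                current.append(ch)  # nested comment stays in the body
            depth += 1
        elif ch == ")" and depth:
            depth -= 1
            if depth:
                current.append(ch)
            else:
                comments.append("".join(current))
                current = []
        elif depth:
            current.append(ch)
        else:
            plain.append(ch)
        i += 1
    if depth:
        raise ValueError("unterminated comment")
    return "".join(plain).strip(), comments


delay, notes = split_comments("300 (I'm in a meeting)")
print(int(delay), notes)
```

That is a hand-written state machine for a feature almost nobody sends, which is exactly the kind of code that ends up untested.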
If you stack up enough hacks like that, then it starts to become the clusterfuck it is. I don't think there's any maliciousness involved. Just lack of critical thinking, or lack of experience actually implementing software. Plus the fact that when writing a spec, you've got a blank paper and can just go all-out with crazy ideas you won't have to support.
The process for re-INVITEs with proxies in the middle, loose routing, NATs, and codec renegotiation is a VoIP engineer's hell.
But all those extensions/rules were created with good intentions.