Edit: downvoters: please explain what's to like about HTTP2. I have a very hard time finding anything to like.
For example: no more easy debugging on the wire, another TCP-like implementation inside the HTTP protocol, tons of binary data rather than text, and a whole slew of features that we don't really need but that please some corporate sponsor because their feature made it in. Counterexamples appreciated.
Compare: http://tools.ietf.org/html/rfc1945
Literally so: this protocol document does not specify how you determine which server to connect to. HTTP2 is, in definition, only very loosely coupled to IP despite making significant optimisations for TCP. Thus in implementation we simply get the same old mistakes and undefined behaviours. Issues with floating apex records, hacks based on IPv4/6 race conditions, unnecessary address wastage and so forth will continue; all derived from the colossal architectural wart of overloading the DNS host (A/AAAA) record as a service endpoint discovery mechanism.
Once again, I say unto the peanut gallery: shoulda used SRV. The benefits are many and the downsides greatly overstated. I bemoan the missed opportunity.
Edit: It makes me a bit giddy (which makes sense if you factor in my being a sysadmin) to think about what SRV records would've done for load-balancing, running servers on non-standard ports, IP address exhaustion, and server migrations. Anybody who doesn't appreciate proper service-location hasn't ever done serious sysadmin work and, IMO, has no business designing protocols.
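To make the SRV idea concrete, here is a rough sketch of how a client could choose an endpoint from a set of SRV records. It is a simplification of the selection rules in RFC 2782 (it picks one target instead of producing a full ordering, and gives zero-weight records no chance when others have weight). The record values and the example.com names are made up for illustration, and no real DNS lookup is performed.

```python
import random
from typing import NamedTuple


class SrvRecord(NamedTuple):
    priority: int  # lower value is tried first
    weight: int    # relative share among records of equal priority
    port: int
    target: str


def pick_srv_target(records, rng=random):
    """Pick one record: lowest priority wins, then weighted random choice."""
    if not records:
        raise ValueError("no SRV records to choose from")
    best = min(r.priority for r in records)
    candidates = [r for r in records if r.priority == best]
    weights = [r.weight for r in candidates]
    if sum(weights) == 0:
        # All weights zero: treat the candidates as equally likely.
        return rng.choice(candidates)
    return rng.choices(candidates, weights=weights, k=1)[0]


# Hypothetical records, as a resolver might return for _http._tcp.example.com
records = [
    SrvRecord(10, 60, 8080, "web1.example.com."),
    SrvRecord(10, 40, 8081, "web2.example.com."),
    SrvRecord(20, 0, 80, "backup.example.com."),
]
chosen = pick_srv_target(records)
print(f"connect to {chosen.target}:{chosen.port}")
```

Load balancing falls out of the weights, non-standard ports come from the record itself, and the priority 20 record only comes into play if a client extends this to fall back when the priority 10 hosts are unreachable.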
You say you agree, because "the protocol does not specify how you determine what server to connect to."
In other words, HTTP2 sucks because it simultaneously includes too many features, and not enough features.
At least everyone can agree that they don't like it for some reason, even if the reasons themselves contradict each other.
Text implementations are expensive to parse. You end up doing a lot of nutty hacks (like bitwise operations a word at a time, on strings) just to have good speed. Doing so safely requires a lot of branches. A binary protocol will be easier, more compatibly implemented, safer and far faster to decode. Look at nginx code to see examples (and that's a good codebase).
Debugging, yeah, it's sometimes handy to run tcpdump and see problems. But s/tcpdump/tshark and it's essentially a solved problem. The vast, vast majority of HTTP messages are machine written and read; optimizing for humans is misguided.
And yes, I'm bitter about this, as I'm on my third project implementing a SIP stack. SIP shares HTTP's general format, with tons of added insanity for fun. Like UDP/TCP hopping based on the spec author's misunderstanding of IP reassembly. I've lost weeks of my life due to idiotic spec writers just going off and making stuff up. Cause hey, in a text document, there's no actual technical debt to be paid, no actual real world problems.
Google's approach with SPDY is far better than most of what the IETF has been able to put out.
You have any further reading or links for this? SIP is definitely "interesting" (not in a good way) and I'd love to have some background on why it is the way it is.
I'll miss easy wire debugging too, but it's been obvious for a long time that HTTPS everywhere is the future... and when you go to the effort to use a MitM proxy or whatever for debugging TLS, you may as well throw in a tool that understands the protocol.
Edit: I agree it would be nicer to solve many of the goals of HTTP2 at a lower level, and from the little I've heard about it, that solution already basically exists in the form of QUIC. But server push is a useful feature even with a perfect transport layer, so meh...
Heck, I can read a binary protocol as hex if I know it well enough - writing a tool which can do this is not a difficult task (it's actually way easier than text parsing!) and gives you more concise answers more quickly.
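As an illustration of how small such a tool can be, the sketch below decodes the fixed 9-octet HTTP/2 frame header (24-bit length, 8-bit type, 8-bit flags, one reserved bit and a 31-bit stream identifier, per RFC 7540 section 4.1). The function name and the dict it returns are my own choices, not part of any library, and it does not decode frame payloads.

```python
import struct

# Frame type codes from the HTTP/2 specification (RFC 7540, section 6).
FRAME_TYPES = {
    0x0: "DATA", 0x1: "HEADERS", 0x2: "PRIORITY", 0x3: "RST_STREAM",
    0x4: "SETTINGS", 0x5: "PUSH_PROMISE", 0x6: "PING", 0x7: "GOAWAY",
    0x8: "WINDOW_UPDATE", 0x9: "CONTINUATION",
}


def parse_frame_header(data: bytes) -> dict:
    """Decode the first 9 octets of an HTTP/2 frame."""
    if len(data) < 9:
        raise ValueError("an HTTP/2 frame header is 9 octets")
    # The 24-bit length is read as one byte plus one 16-bit word.
    length_hi, length_lo, frame_type, flags, stream = struct.unpack(
        "!BHBBI", data[:9]
    )
    return {
        "length": (length_hi << 16) | length_lo,
        "type": FRAME_TYPES.get(frame_type, f"UNKNOWN(0x{frame_type:02x})"),
        "flags": flags,
        "stream_id": stream & 0x7FFFFFFF,  # mask off the reserved bit
    }


# A HEADERS frame header: 12 bytes of payload, flags 0x05, stream 1.
header = bytes.fromhex("00000c010500000001")
print(parse_frame_header(header))
```

There is no tokenising, no line-ending handling and no case folding; the equivalent for an HTTP/1.1 request line plus headers needs all three.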
It also addresses the usual complaints, including the "binary" part.
The short version is that it is all about speed and security (by encrypting everything).
As to the issue of binary vs. text: implementing binary protocols (including the tools used for debugging them) is much simpler than implementing a text protocol. HTTP1 might look simple to human eyes but it's not an easy protocol to implement (fully and correctly). I know, I tried.
Furthermore, engineering is about tradeoffs. HTTP2 is not simple, but the complexity exists to address real problems in failure modes of the TCP stack and to maximize speed by both sending as little as possible (compression) and ensuring that one request doesn't block the other, as long as there is bandwidth to send both.
Maybe the purity and simplicity of the protocol is more important to you than improving the speed by a few %, but to me it makes perfect sense to trade simplicity for even modest gains in performance in the context of HTTP.
You only write the code once but it'll speed up every HTTP connection until the end of time. That's a lot of HTTP connections, which adds up to a lot of bandwidth and time savings.
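The "one request doesn't block the other" part can be sketched with a toy round-robin interleaver. This is only an illustration of the multiplexing idea: real HTTP/2 adds flow control and priorities on top, and the function name and frame size here are arbitrary assumptions.

```python
from collections import deque


def interleave(streams, frame_size):
    """Yield (stream_id, chunk) pairs, taking one frame per stream in turn.

    streams: dict mapping a stream id to the full response body (bytes).
    """
    queue = deque(
        (sid, memoryview(body)) for sid, body in streams.items() if body
    )
    while queue:
        sid, remaining = queue.popleft()
        yield sid, bytes(remaining[:frame_size])
        if len(remaining) > frame_size:
            # More data left on this stream: send it to the back of the line.
            queue.append((sid, remaining[frame_size:]))


# A large response on stream 1 and a small one on stream 3.
frames = list(interleave({1: b"A" * 10, 3: b"B" * 3}, frame_size=4))
print(frames)

# The receiver reassembles each stream from its own frames.
rebuilt = {}
for sid, chunk in frames:
    rebuilt[sid] = rebuilt.get(sid, b"") + chunk
print(rebuilt)
```

The small response on stream 3 is complete after the second frame instead of waiting behind the whole of stream 1, which is what head-of-line blocking on a single HTTP/1.1 connection would force.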
Keep in mind that we're talking about the protocol that currently powers much of the developed world as we know it and that mistakes will be extremely costly to correct. HTTP has its warts and quirks but for the most part we have worked our way around those.
Minor efficiency gains do not add up for me to a complete overhaul of the web and everything attached to it.
So what we'll get instead is this: a percentage of the web will transition (mostly the bigger players where that few % adds up to a higher bottom line), the rest will not care and wait for HTTP3 or whatever will come after HTTP2 to really address what's wrong with HTTP (and that's not that it isn't fast enough, that's mostly a problem with the underlying transports, HTTP is plenty fast).
I'm not up to speed on certificate verification for HTTP2. Have the IETF suggested a mechanism? If so it must be outside the draft-ietf-httpbis-http2-16.
HTTP/1.1 is nowhere near simple or elegant. The current spec has 6 RFCs, and a real world implementation is totally non-trivial. RFC1945 is HTTP/1.0, which nobody uses.
Unless someone proposes an acceptable better solution (Microsoft tried), the world needs to decide on something and move on.
Sure we can work around this using techniques like CSS sprites, merging JS files, CSS files etc. but these are hacks that come with their own tradeoffs - most notably caching but also sprites bring memory challenges on some devices too.
Using a single TCP connection opens up some issues where there's packet loss but it also means the TCP congestion window will grow at its maximum rate without the risks that opening multiple connections and domain sharding bring.
Multiplexing also offers interesting possibilities of partial resource download e.g. download part of a progressive JPEG, and then the rest later.
Push allows servers to send content that's in the browser's critical rendering path (e.g. CSS, fonts) before the client has even discovered it needs it.
Header compression reduces both the request and response overhead, and request overhead is often forgotten in web development, which is funny as most connections have asymmetric speeds.
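The basic idea behind header compression can be shown with a toy scheme: both ends keep a table of header pairs they have already seen, and a repeated pair is sent as a small index instead of the full text. To be clear, this is not HPACK (no static table, no Huffman coding, no eviction); the function names and the example headers are invented for the sketch.

```python
def encode(headers, table):
    """Replace already-seen (name, value) pairs with their table index.

    table: dict mapping (name, value) -> index, updated in place.
    """
    out = []
    for pair in headers:
        if pair in table:
            out.append(table[pair])
        else:
            table[pair] = len(table)
            out.append(pair)  # first occurrence travels as a literal
    return out


def decode(encoded, table):
    """Inverse of encode. table: list of (name, value), updated in place."""
    headers = []
    for item in encoded:
        if isinstance(item, int):
            headers.append(table[item])
        else:
            table.append(item)
            headers.append(item)
    return headers


request_1 = [(":method", "GET"), (":path", "/"),
             ("user-agent", "ExampleBrowser/1.0")]
request_2 = [(":method", "GET"), (":path", "/style.css"),
             ("user-agent", "ExampleBrowser/1.0")]

sender_table, receiver_table = {}, []
wire_1 = encode(request_1, sender_table)
wire_2 = encode(request_2, sender_table)
print(wire_2)  # only the new path is sent in full
print(decode(wire_1, receiver_table) == request_1)
print(decode(wire_2, receiver_table) == request_2)
```

On the second request the method and user-agent shrink to a single integer each, which matters most on the slow upstream half of an asymmetric link.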
Prioritisation allows the browser and server to schedule content download 'more intelligently' than our current browser heuristics.
In a world where most people are looking at HTTP using DevTools, WebPageTest and Wireshark rather than the native text format a binary protocol isn't a problem.
99% of the time I'm using Wireshark, wget, curl, or my own client or server code, which is always using an HTTP library. 1% of the time I'll telnet to the raw port and type in some incantation.
I don't expect the 99% cases to be different; by the time I'm dealing with it much in the wild, I expect the tools to be fine. And the telnet-by-hand case is still going to work; you'll just have to keep using HTTP 1.0 or 1.1. Which is going to be supported in practice for decades more.
With a text based protocol you at least stand a fighting chance; I've done my share of slogging through dumps and I'm not looking forward to a repeat. One of the main reasons I suspect HTTP caught on as fast as it did was because people could actually look under the hood and understand the basics and figure out where things went wrong without resorting to dumping the data and counting out which variable length header bit got it wrong this time.
HTTP isn't perfect, don't get me wrong (not specifying an end-of-line for the header with a single character was a mistake in my opinion, and there are a few quirks that make a much faster header parser impossible but that's minor stuff).
So, the telnet by hand case will still work, but that's not where the bugs will be; in fact, testing using 'telnet' will quite possibly show you a situation inconsistent with the one using the newer version of the protocol. Having two delivery methods for the same data under the hood is a bad idea to begin with.
Using SPDY only shows improvement over HTTPS. Over HTTP you get a huge performance hit!
The majority of data has no need for encryption. Some of it does. Where it does, SPDY certainly makes sense.
For example, HTTP2 will stop ISP header injection such as this: http://www.propublica.org/article/somebodys-already-using-ve...
Both still leave the original text based protocol and would stop ISP header injection.
Another defense against such trickery would be a legal one: make it illegal to tamper with data sent between two peers on the internet.
Imagine the postal service opening your mail and changing a couple of words in your mail just because they could.
Keep in mind that Hypertext as it was originally envisioned by Ted Nelson had nothing but two way links, doing one-way links broke with the view of the day in a pretty drastic manner (and actually made the whole thing possible), so I think some leeway here is allowed.
HTTP's popularity outside of the web is mainly cause it's easy to use as a simple wrapper around TCP with read and read/write semantics. Nearly all the rest of HTTP is unneeded fluff (proof: browsers didn't even do other methods besides GET/POST and applications were just fine).
https://www.varnish-cache.org/docs/trunk/phk/http20.html
previous discussion on HN:
It would have been so much nicer just to have HTTP 1.2 with header compression. Then they could propose a new transport protocol (or just use SCTP) and provide layer 7 implementation over TCP or UDP to ease the transition.
8. HTTP Message Exchanges
What the hell is the rest of the document about then?
Section 8 is specifically about the semantics of the 'conversation' between the client and the server. It's important to actually know how to form the messages themselves first!
Think of it as if you were writing a specification of the English language. You might think it's all about conversations, but first you need to know what the letters are, how you form them into words, then spelling, grammar, etc.
edit: *was
I haven't checked yet but I won't be surprised to find out that Chrome keeps one SPDY/HTTP2 connection open to google-analytics.com that's shared for all domains...
If you spend less than 4 minutes per page then that entire browsing session across all google-analytics pages is uniquely identified to google. Even if you have many people behind a NAT and they have 3rd party cookies disabled, each one's browsing is still a separate clickstream to Google.
Only thing making a person's browsing opaque to google is if they browse behind a proxy with a bunch of other people mixed in, and SPDY/HTTP2 conveniently enough make proxies very difficult by needing to trust the proxy's certificate on each client. So anybody needing a proxy will use HTTP1 and anybody using HTTP2 has their browsing sessions uniquely identified.
As a result, those companies have banded together to produce a bloated, multiplexed blob-fish of a protocol which they have vomited into the standards body for fast approval, over the objections of the people who will actually have to implement it.
Kinda like BMP format to PNG format (with compression and alpha channels). Yes, BMP is a lot simpler and you can find out the RGB of any pixel by looking at pixels[y*width+x] while with PNG you have non-trivial complexity with compression, etc. But the size efficiency is worthwhile.
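A small sketch of that trade-off: a raw pixel buffer gives direct indexing, while running it through DEFLATE (the compression PNG uses, available in Python as zlib) shrinks it at the cost of losing that direct access. The buffer below is a simplified stand-in for a BMP (top-down rows, no header, no row padding), and the striped test image is deliberately easy to compress, so the ratio it prints says nothing about real photos.

```python
import zlib

WIDTH, HEIGHT = 64, 64

# Raw 24-bit RGB buffer, row-major, top row first: red and blue stripes.
raw = bytearray()
for y in range(HEIGHT):
    colour = (255, 0, 0) if (y // 8) % 2 == 0 else (0, 0, 255)
    raw.extend(colour * WIDTH)


def pixel(buf, x, y, width=WIDTH):
    """Direct lookup: 3 bytes per pixel at index y * width + x."""
    offset = (y * width + x) * 3
    return tuple(buf[offset:offset + 3])


compressed = zlib.compress(bytes(raw))

print(pixel(raw, 5, 0), pixel(raw, 5, 8))
print(len(raw), "bytes raw,", len(compressed), "bytes after DEFLATE")
# To read a pixel from the compressed form you must decompress first.
print(pixel(zlib.decompress(compressed), 5, 8))
```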
Converting from uncompressed BMP to compressed PNG can easily save over half of the file size; even 10:1 compression is common for some images.
The bandwidth savings in HTTP2 are much smaller, and are probably only significant in aggregate.
Binary in Hyper Text Transfer will never seem right. I understand it is more performant, but it always creates more bugs; ask any game developer. Binary is needed, but it also means living on the edge of indexes, ordering and headers, and it is harder to debug. Indexing errors, overflows and incorrect implementations will follow.
Many of the advancements in HTTP2 are good but there are some steps backwards we'll have to re-learn again. It isn't all about performance when it comes to correct interoperability as standards lead to many interpretations, it is why XML then JSON won data transfer, it is easy to interoperate, yes binary is more efficient over the wire but not to interoperate. Should we go back to binary formats for data exchange on the network? The protocol level is lower level but still it has been beneficial in the current standards to spreading innovation with lower barriers to understanding.
HTTP2 is one of those 'version 2' rewrites of an app where some of the legacy genius, like simplicity, was lost and overlooked in the redesign. An engineer's job is to make something complex into something simple, and blackboxing data isn't simplifying it.
Calling HTTP1/1.1 genius sounds like an "intelligent design" argument (as opposed to "evolution"), and I think detracts from what makes it good.
What makes more sense to me is HTTP1/1.1 was invented and then we hacked/adapted/"evolved" on top of it to get it to do what we want. It wasn't the spec that was genius - it was the effort of countless engineers over time that crammed a genius, trillion dollar industry into an "okay" spec (the same way it was done for HTML/CSS/JS).
In that vein, the whole binary/plaintext header arguments seem a lot closer to "this is the way my father did it" rather than "this is the most efficient way". To counter your XML/JSON example - I would argue they won over binary formats because there is a huge need for humans to write & edit data exchange structures. OTOH, I can't remember the last time I sent/edited/created HTTP headers by hand. While JSON has tons of uses and is stored in countless places (config, user data, state data), HTTP servers are the only services that seem to care about HTTP headers.
Perhaps folks like 'cperciva would be kind enough to propose a single, simple TOML-based cert system that is extremely lightweight with the fewest of features. (Not that TLS/SSL would change without focused, sustained herculean effort immediately after yet another Heartbleed.)
There are so many tools supporting it properly, and many of them are extremely hard to change (especially HSMs) because there are actual dependencies, that I wouldn't bother.
The problems are mostly in SSL & TLS protocols, not in the certificates. We should get a new alternative that would be designed to be easily implementable, and it should get proper reference implementation (with proofs of correctness, which by the way are available for X509... That's the only part that is verifiable afaik).
Is this the inevitable path of any technology which has initial promise for enabling individual public expression?
Oh, wait... maybe that was a dream.
What I would like to see is the industry ask itself, can HTTP be retro-fitted to work for software over TCP or UDP? It is clear that HTTP is a fantastic protocol for sharing documents. But is it what we want when our goal is to offer software as a service?
I'll briefly focus on one particular issue. WebSockets undercuts a lot of the original ideas that Sir Tim Berners-Lee put into the design of the Web. In particular, the idea of the URL is undercut when WebSockets are introduced. The old idea was:
1 URL = 1 document = 1 page = 1 DOM
Right now, in every web browser that exists, there is still a so-called "address bar" into which you can type exactly 1 address. And yet, for a system that uses WebSockets, what would make more sense is a field into which you can type or paste multiple URLs (a vector of URLs), since the page will end up binding to potentially many URLs. This is a fundamental change, that takes us to a new system which has not been thought through with nearly the soundness of the original HTTP.
Slightly off-topic, but even worse is the extent to which the whole online industry is still relying on HTML/XML, which are fundamentally about documents. Just to give one example of how awful this is, as soon as you use HTML or XML, you end up with a hierarchical DOM. This makes sense for documents, but not for software. With software you often want either no DOM at all, or you want multiple DOMs. Again, the old model was:
1 URL = 1 document = 1 page = 1 DOM
We have been pushing technologies, such as Javascript and HTML and HTTP, to their limits, trying to get the system that we really want. The unspecified, informal system that many of us now work towards is an ugly hybrid:
1 URL = multiple URLs via Ajax, Websockets, etc = 1 document (containing what we treat as multiple documents) = 1 DOM (which we struggle against as it often doesn't match the structure, or lack of structure, that we actually want).
Much of the current madness that we see with the multiplicity of Javascript frameworks arises from the fact that developers want to get away from HTTP and HTML and XML and DOMs and the url=page binding, but the stack fights against them every step of the way.
Perhaps the most extreme example of the brokenness are all the many JSON APIs that now exist. If you do an API call against many of these APIs, you get back multiple JSON documents, and yet, if you look at the HTTP headers, the HTTP protocol is under the misguided impression that it just sent you 1 document. At a minimum, it would be useful to have a protocol that was at least aware of how many documents it was sending to you, and had first-class support for counting and sorting and sending and re-sending each of the documents that you are supposed to receive. A protocol designed for software would at least offer as much first-class support for multiple documents/objects/entities as TCP allows for multiple packets. And even that would only be a small step down the road that we need to go.
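For what "aware of how many documents it was sending" could look like at its most basic, here is a hypothetical framing: a document count, then each document prefixed with its own length. The wire layout and function names are invented for this sketch and do not correspond to any existing protocol; re-sending or reordering individual documents would need more than this.

```python
import json
import struct


def pack_documents(docs):
    """Serialise docs as: 4-byte count, then (4-byte length, JSON) per doc."""
    encoded = [json.dumps(d).encode("utf-8") for d in docs]
    parts = [struct.pack("!I", len(encoded))]
    for body in encoded:
        parts.append(struct.pack("!I", len(body)))
        parts.append(body)
    return b"".join(parts)


def unpack_documents(payload):
    """Read the count first, then exactly that many length-prefixed docs."""
    (count,) = struct.unpack_from("!I", payload, 0)
    offset = 4
    docs = []
    for _ in range(count):
        (length,) = struct.unpack_from("!I", payload, offset)
        offset += 4
        body = payload[offset:offset + length]
        docs.append(json.loads(body.decode("utf-8")))
        offset += length
    return docs


docs = [{"id": 1, "name": "first"}, {"id": 2, "tags": ["a", "b"]}]
payload = pack_documents(docs)
print(struct.unpack_from("!I", payload)[0], "documents on the wire")
print(unpack_documents(payload) == docs)
```

The receiver knows up front how many documents to expect and where each one starts, without having to parse the JSON to find the boundaries.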
A new stack, designed for software instead of documents, is needed.
I would have been happy if they simply let HTTP remain at 1.1 forever -- it is a fantastic protocol for exchanging documents. And then the industry could have focused its energy on a different protocol, designed from the ground up for offering software over TCP.
Waiting months/years for HTTP/2 support to appear in all the tools I use - :( ....