Interestingly, one of the big CDN providers did have controls in their UI for explicitly allowing/disallowing Vary header entries, but they disabled them for us at some point (i.e. they were still in the UI but greyed out). I assumed that once we hit a certain level of traffic it became too computationally expensive. Ever since, I've avoided any kind of fancy header/response variance in APIs just in case I end up in the same situation. It is rarely a necessity. IIRC, the only thing they continued to support variance-wise was gzip (i.e. Content-Encoding negotiated via Accept-Encoding).
It's also worth noting they were extremely conservative with query parameters too. Also to reiterate, this was very high traffic and high volume with expectations of low latency, so probably not applicable to most people using CDNs for static website assets.
Seems strange; AFAIK in e.g. Varnish, Vary just means you get more "stuff" tacked onto the buffer that gets built from the request and then hashed to create the cache key.
And actually, come to think of it, if memory for N concurrent in-flight requests is the concern, then you don't even need an actual (dynamically allocated) buffer, either; presuming you're using a streaming hash, you can feed each constituent field directly into the hasher, with only the hasher's (probably stack-allocated) internal static buffer for blockwise hashing required. (Which you're gonna need regardless of whether you're doing any Vary-ing.)
So it's really just a question of how many CPU cycles are being spent hashing. And it's likely just going to be a difference between hashing 300 bytes (base request — hostname, path, headers that are always implicitly Varied upon) and 350 bytes (those things, plus whatever you explicitly Varied) per request. Doesn't seem like too much of a win... (especially when hardware-accelerated hashing ops operate on blocks anyway, such that you only get stepwise cost increases for every e.g. 128 bytes.) I wonder why they bothered?
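To make the streaming-hash point concrete, here's a minimal sketch in Python. The function name `cache_key`, the choice of SHA-256, and the length-prefix framing are all my own assumptions for illustration; this is not how Varnish or any particular CDN actually builds its key.

```python
import hashlib

def cache_key(method, host, path, request_headers, vary_fields=()):
    """Hypothetical cache-key builder: feeds each field into a streaming hasher.

    No intermediate buffer holding the whole key material is built; the only
    state is the hasher's own internal block buffer.
    """
    hasher = hashlib.sha256()

    def feed(part):
        data = part.encode("utf-8")
        # Length-prefix each field so ("ab", "c") and ("a", "bc") differ.
        hasher.update(len(data).to_bytes(4, "big"))
        hasher.update(data)

    # Base request: the parts that are always implicitly varied upon.
    feed(method)
    feed(host)
    feed(path)

    # Explicit Vary fields: a few dozen extra bytes per request.
    lowered = {name.lower(): value for name, value in request_headers.items()}
    for name in sorted(field.lower() for field in vary_fields):
        feed(name)
        # Simplification: a missing header hashes the same as an empty one.
        feed(lowered.get(name, ""))

    return hasher.hexdigest()
```

Varying on one more header just means two more `feed` calls, which is the "50 extra bytes" case from above.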
I think that is probably why some CDNs have a single configuration for any HTTP headers you want to vary on (e.g. CloudFront allows you to specify a global configuration for a distribution that takes into account specific headers). This avoids both the per-resource and the inter-datacenter consistency problems that relying on the Vary header might cause.
> The Vary HTTP response header describes the parts of the request message aside from the method and URL that influenced the content of the response it occurs in.
In other words, if the server backend has a resource with representations that vary on header values {A,B,C,D}, and one client sends request headers {A,B}, then by the standard they should only be told `Vary: A, B`; if another client sends request headers {C,D}, they should only be told `Vary: C, D`. The Vary response header should not tell the client about request headers it didn't send.
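As a toy illustration of that reading (and it is only my reading of the quoted definition), the per-response Vary value would be computed something like this. `vary_for_request` is a made-up helper, not anything from a real server.

```python
def vary_for_request(resource_varies_on, request_headers):
    """Hypothetical: list only the varied-on headers this request actually sent."""
    sent = {name.lower() for name in request_headers}
    # Preserve the resource's own ordering of header names.
    return ", ".join(
        name for name in resource_varies_on if name.lower() in sent
    )
```

So two clients hitting the same URL can see two different Vary values, which is exactly what makes life hard for a cache sitting in between.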
So it's not just that you can wait for the backend to send a `Vary` response header, and then medium-term cache the value of that header in the cache-policy metadata for the cache key. Instead, on each response, you need to
1. collect any additional Vary fields from the response and add them to your cache-policy Vary set;
2. have some idea of what the "default header value" would be for each header that is in the active Vary set but wasn't sent, to use as a fallback when computing the cache key, so that you can dedup requests that explicitly send the header with value X against requests that don't send the header at all but where the default value would be X; and
3. ideally, have a library of normalization transforms for the value of each header used in Vary, to decrease cardinality (this approach takes up the majority of the page on the Varnish docs for Vary: https://varnish-cache.org/docs/3.0/tutorial/vary.html).
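Roughly, the three steps above could look like the following sketch. Everything here (`VaryPolicy`, `observe_response`, `key_fields`, the crude `normalize_accept_encoding`) is hypothetical naming of mine, and the defaults and normalizers are precisely the knowledge a middlebox would somehow have to be handed.

```python
class VaryPolicy:
    """Hypothetical per-resource cache policy tracking an accumulated Vary set."""

    def __init__(self, defaults=None, normalizers=None):
        self.vary_set = set()
        # Step 2: assumed default value for each header when it is not sent.
        self.defaults = {k.lower(): v for k, v in (defaults or {}).items()}
        # Step 3: per-header transforms that collapse equivalent values.
        self.normalizers = {k.lower(): f for k, f in (normalizers or {}).items()}

    def observe_response(self, vary_header):
        # Step 1: union newly seen Vary fields into the policy's set.
        for name in vary_header.split(","):
            name = name.strip().lower()
            if name:
                self.vary_set.add(name)

    def key_fields(self, request_headers):
        lowered = {k.lower(): v for k, v in request_headers.items()}
        fields = []
        for name in sorted(self.vary_set):
            value = lowered.get(name, self.defaults.get(name, ""))
            normalize = self.normalizers.get(name)
            if normalize is not None:
                value = normalize(value)
            fields.append((name, value))
        return tuple(fields)


def normalize_accept_encoding(value):
    # Crude on purpose: ignores q-values, only distinguishes gzip vs. not.
    return "gzip" if "gzip" in value.lower() else "identity"
```

With a default of `identity` for Accept-Encoding, a request that omits the header and one that explicitly sends `identity` land on the same key fields.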
And the knowledge required to do all this correctly is really... not knowledge that a middlebox has any good way of acquiring.
This is starting to feel like a design smell in HTTP. Maybe zero-RTT content negotiation is misguided?
What if we instead did content negotiation like this (which — correct me if I'm wrong — would be a mostly ecosystem-backward-compatible change):
- if a resource negotiates, then by default, the server will send a 406 error response for all attempts at retrieving the resource. It sends this because the client itself needs to prove it knows what fields the resource varies on — and, of course, it doesn't know (yet), because nobody's told it yet. This 406 response contains a novel "Should-Vary" response header, informing the client of what it should be sending.
- to actually fetch a resource representation, the client is then expected to make the same request again, but this time, sending an Expect-Vary request header, the value of which matches the Should-Vary header value it saw from the server. Note that unlike with the Vary response header, this Expect-Vary request header should include header names that aren't part of the set of headers it's sending. (And/or, this list should force the client to emit explicit headers with its choice of implicit-default values for any headers listed in its own Expect-Vary header.)
- Upon receiving a request for a resource that negotiates, where the request has the Expect-Vary header set, the server will first verify that the Expect-Vary header value matches the Should-Vary value it would return for the resource, and either matches or is a superset of the Vary value it would compute as the response header given 1. the resource and 2. the rest of the received request. If this verification fails, that's a 406 again, sending Should-Vary again. If the verification passes, and the rest of the HTTP state workflow goes through, then you get a 2xx response. This 2xx response has the old Vary header as part of the response, but it now only exists for ecosystem back-compat.
- If a client thinks it knows the right Expect-Vary header to send, it can try sending it as a request header in the initial request. After all, the worst that can happen is the same 406 error it'd get otherwise. As well, the observed Should-Vary response header value of a resource can be cached basically indefinitely by the browser in its Expect-Vary cache, since the next time it changes for a resource, the browser will try its cached value for Expect-Vary in the request, and get a 406 response that tells it the new Expect-Vary value it should be using instead.
- Optionally, for efficiency, there could be introduced an Others-Should-Vary response header with the value being a path pattern (similar to a Set-Cookie Path field), which specifies other path prefixes for the host that should all be assumed by default to have the same Should-Vary header value as the response does. Potentially, a Should-Vary response header could also be sent in OPTIONS responses, to set a fallback assumed Vary value for the HTTP origin as a whole. (Clients are already requesting OPTIONS for CORS anyway; may as well give them some more useful information while we've got them on the line.)
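Here's a rough sketch of the server side of that handshake, just to check that it hangs together. Should-Vary and Expect-Vary are the made-up headers from the proposal above (they don't exist in HTTP today), and `handle_request` / `parse_field_list` are hypothetical helpers. It skips the Others-Should-Vary and OPTIONS parts entirely.

```python
def parse_field_list(value):
    """Parse a comma-separated header-name list into a case-insensitive set."""
    return frozenset(
        name.strip().lower() for name in value.split(",") if name.strip()
    )


def handle_request(resource_varies_on, request_headers):
    """Hypothetical origin handler for the proposed Should-Vary/Expect-Vary flow.

    Returns (status, response_headers).
    """
    should_vary = ", ".join(resource_varies_on)
    lowered = {k.lower(): v for k, v in request_headers.items()}
    expected = lowered.get("expect-vary")

    # The client must prove it knows what the resource varies on.
    if expected is None or parse_field_list(expected) != parse_field_list(should_vary):
        return 406, {"Should-Vary": should_vary}

    # Legacy Vary, kept for back-compat: only the varied-on headers that were
    # actually sent. It is always a subset of Should-Vary here, so the
    # "superset of Vary" check from the proposal holds automatically.
    sent = [name for name in resource_varies_on if name.lower() in lowered]
    return 200, {"Vary": ", ".join(sent)}
```

A client would call this once, get the 406 with Should-Vary, then repeat the request with Expect-Vary set to that value.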
With this design, middleboxes could safely trust the client's Expect-Vary header and use it to build the cache key — as long as 406 responses aren't cached.
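And the middlebox side, again as a sketch under the proposal's assumptions: the cache keys purely off the client's Expect-Vary list and never stores a 406. `ExpectVaryCache` and the `fetch_from_origin` callback signature are hypothetical, and caching only 200s is my own conservative choice.

```python
class ExpectVaryCache:
    """Hypothetical cache that trusts the request's Expect-Vary header."""

    def __init__(self, fetch_from_origin):
        # fetch_from_origin(path, headers) -> (status, response_headers, body)
        self.fetch = fetch_from_origin
        self.store = {}

    def key_for(self, path, request_headers):
        lowered = {k.lower(): v for k, v in request_headers.items()}
        names = sorted(
            name.strip().lower()
            for name in lowered.get("expect-vary", "").split(",")
            if name.strip()
        )
        # The key is derived from the request alone; no per-resource
        # Vary metadata has to be stored or distributed between PoPs.
        return (path, tuple((name, lowered.get(name, "")) for name in names))

    def get(self, path, request_headers):
        key = self.key_for(path, request_headers)
        if key in self.store:
            return self.store[key]
        response = self.fetch(path, request_headers)
        # Never cache the 406 negotiation step (or any other non-200 here).
        if response[0] == 200:
            self.store[key] = response
        return response
```

If the client lies in Expect-Vary, the origin answers 406 and nothing gets stored, so a wrong guess can't poison the cache.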
Something for an RFC, maybe?
If this had been a necessity then I would have probably dug into the request more deeply. It was a "pick your battles" kind of thing. Extremely low cost on our side to change, no reason to bother if they claimed it would decrease problems on their side.
I think the best suggestion was in another thread by @johncolanduoni where he pointed out the difficulty of storing, distributing and retrieving the metadata per-resource that would be necessary for each PoP to correctly determine the Vary requirements at request time.