0 points · zoogeny · 2y ago · 0 comments

I haven't dealt with Cloudflare specifically, but I did deal with a number of big CDNs for large amounts of traffic. They were pretty adamant about NOT supporting arbitrary Vary header values. It broke some logic on a few of our systems and we eventually just decided to work around it instead of pushing our case.

Interestingly, one of the big CDN providers did have controls in their UI for explicitly allowing/disallowing Vary header entries, but they disabled them for us at some point (i.e. they were still in the UI but greyed out). I assumed that once we hit a certain level of traffic it became too computationally expensive? Ever since, I've avoided any kind of fancy header/response variance in APIs just in case I end up in the same situation. It is rarely a necessity. IIRC, the only thing they continued to support variance-wise was gzip (i.e. varying on Accept-Encoding).

It's also worth noting they were extremely conservative with query parameters too. Also to reiterate, this was very high traffic and high volume with expectations of low latency, so probably not applicable to most people using CDNs for static website assets.

9 comments · 2 top-level

derefr · 2y ago · 7 in thread

> I assumed once we hit a certain level of traffic it was too computationally expensive?

Seems strange; AFAIK in e.g. Varnish, Vary just means you get more "stuff" tacked onto the buffer that gets built from the request and then hashed to create the cache key.

And actually, come to think of it, if memory for N concurrent in-flight requests is the concern, then you don't even need an actual (dynamically allocated) buffer, either; presuming you're using a streaming hash, you can feed each constituent field directly into the hasher, with only the hasher's (probably stack-allocated) internal static buffer for blockwise hashing required. (Which you're gonna need regardless of whether you're doing any Vary-ing.)

So it's really just a question of how many CPU cycles are being spent hashing. And it's likely just going to be a difference between hashing 300 bytes (base request — hostname, path, headers that are always implicitly Varied upon) and 350 bytes (those things, plus whatever you explicitly Varied) per request. Doesn't seem like too much of a win... (especially when hardware-accelerated hashing ops operate on blocks anyway, such that you only get stepwise cost increases for every e.g. 128 bytes.) I wonder why they bothered?
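A minimal sketch of the streaming-hash idea described above, assuming SHA-256 from Python's `hashlib` stands in for whatever hash a real cache uses. The function name `cache_key` and its parameters are hypothetical; the point is only that each field is fed into the hasher directly, so no concatenated buffer is ever built, and the extra Vary fields add a few more `update` calls.

```python
import hashlib


def cache_key(method, host, path, headers, vary=()):
    """Hash request fields one at a time instead of joining them into a buffer.

    `vary` is an assumed, already-known list of header names to key on.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    hasher = hashlib.sha256()

    # Base request fields that are always part of the key.
    for part in (method, host, path):
        hasher.update(part.encode())
        hasher.update(b"\0")  # separator so "a"+"bc" and "ab"+"c" differ

    # Extra fields contributed by Vary, in a stable order.
    for name in sorted(n.lower() for n in vary):
        hasher.update(name.encode())
        hasher.update(b"=")
        hasher.update(lowered.get(name, "").encode())
        hasher.update(b"\0")

    return hasher.hexdigest()
```

Note that this only works when the `vary` list is known before the response arrives, which is exactly the assumption questioned in the reply below.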

johncolanduoni · 2y ago

Respecting vary headers is not this simple. Given a request, how do you calculate a cache key that includes only the Vary headers? You only get that list in actual responses from the server, so you need to actually look at some information derived from previous responses to determine what to hash on each request. This is called "partial match retrieval", and is much more complicated (and computationally intensive) than cases where you can calculate a hash key as a pure function of the request.
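To make the two-step lookup concrete, here is a rough sketch under simplifying assumptions: a single in-memory cache, request header names already lowercased, and only the most recent Vary list remembered per URL. The class `VaryCache` and its method names are made up for illustration and are not taken from any real cache.

```python
import hashlib


def _digest(*parts):
    hasher = hashlib.sha256()
    for part in parts:
        hasher.update(part.encode())
        hasher.update(b"\0")
    return hasher.hexdigest()


class VaryCache:
    def __init__(self):
        self.vary_by_url = {}  # URL -> header names learned from a response
        self.objects = {}      # secondary key -> cached body

    def _secondary_key(self, url, vary, req_headers):
        values = [f"{name}={req_headers.get(name, '')}" for name in vary]
        return _digest(url, *values)

    def store(self, url, req_headers, vary_header, body):
        # The Vary list is only known once the origin has responded.
        names = (n.strip().lower() for n in vary_header.split(","))
        vary = tuple(sorted(n for n in names if n))
        self.vary_by_url[url] = vary
        self.objects[self._secondary_key(url, vary, req_headers)] = body

    def lookup(self, url, req_headers):
        # Step 1: fetch metadata derived from earlier responses.
        vary = self.vary_by_url.get(url)
        if vary is None:
            return None  # nothing seen yet, so the full key cannot be computed
        # Step 2: only now can the real cache key be built.
        return self.objects.get(self._secondary_key(url, vary, req_headers))
```

The extra metadata read in `lookup` is the part that cannot be expressed as a pure function of the request.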

zoogeny (OP) · 2y ago

This isn't something I had considered, but it totally makes sense. Given that the Vary header is a per-resource value, you would have to propagate that through the network. For millions of resources that might become an issue. And since in a worst-case scenario the server could be changing the Vary header for a single resource across multiple requests, you have the additional problem of trying to keep it consistent across datacenters.

I think that is probably why some CDNs have a single configuration for any HTTP headers you want to vary on (e.g. CloudFront allows you to specify a global configuration for a distribution that takes specific headers into account). This avoids both the per-resource and the inter-datacenter consistency problems that relying on the Vary header might cause.

derefr · 2y ago

It now occurs to me that even what you're describing wouldn't be enough, because, as MDN says [emphasis mine]:

> The Vary HTTP response header describes the parts of the request message aside from the method and URL that influenced the content of the response it occurs in.

In other words, if the server backend has a resource with representations that Vary on header values {A,B,C,D}; and one client sends req headers {A,B} — then by the standard they should only be told `Vary: A, B`; while if another client sends req headers {C,D}, then they should only be told `Vary: C, D`. The client should not be told in the Vary response header, about request headers they didn't send.

So it's not just that you can wait for the backend to send a `Vary` response header, and then medium-term cache the value of that header in the cache-policy metadata for the cache key. Instead, on each response, you need to

1. collect any additional Vary fields from the response and add them to your cache-policy Vary set; and

2. have some idea of what the "default header value" would be, to use as a fallback value when computing the cache key, for each header that isn't sent, when it's part of the active Vary set, so that you can dedup requests that explicitly send the header with value X, with requests that don't send the header at all but where the default value would be X.

3. Also, ideally, you have a library of normalization transforms for the value of each header used in Vary, to decrease cardinality (this approach takes up the majority of the page in the Varnish docs for Vary: https://varnish-cache.org/docs/3.0/tutorial/vary.html).
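The three steps above can be sketched roughly as follows. Everything here is an assumption made for illustration: the `VaryPolicy` class, the two normalizers, and especially the `DEFAULTS` table, which stands in for knowledge about the origin that a middlebox would somehow have to obtain. The normalizers are deliberately crude (for example, they ignore q-values).

```python
def normalize_accept_encoding(value):
    # Collapse every variation into one of two buckets.
    tokens = [t.split(";")[0].strip().lower() for t in value.split(",")]
    return "gzip" if "gzip" in tokens else "identity"


def normalize_accept_language(value):
    # Keep only the primary subtag of the first listed language.
    first = value.split(",")[0].split(";")[0].strip().lower()
    return first.split("-")[0] or "en"


NORMALIZERS = {
    "accept-encoding": normalize_accept_encoding,
    "accept-language": normalize_accept_language,
}

# Assumed origin defaults for headers the client did not send.
DEFAULTS = {"accept-encoding": "identity", "accept-language": "en"}


class VaryPolicy:
    def __init__(self):
        self.vary_set = set()

    def observe(self, vary_header):
        # Step 1: accumulate header names from each response's Vary.
        names = (n.strip().lower() for n in vary_header.split(","))
        self.vary_set |= {n for n in names if n}

    def key_parts(self, req_headers):
        parts = []
        for name in sorted(self.vary_set):
            raw = req_headers.get(name)
            if raw is None:
                value = DEFAULTS.get(name, "")                    # step 2
            else:
                value = NORMALIZERS.get(name, str.strip)(raw)     # step 3
            parts.append((name, value))
        return tuple(parts)
```

With this policy, a request sending `accept-language: en-US,en;q=0.9` and one that omits the header entirely map to the same key parts, which is the dedup behaviour step 2 asks for.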

And the knowledge required to do all this correctly is really... not knowledge that a middlebox has any good way of acquiring.

This is starting to feel like a design smell in HTTP. Maybe zero-RTT content negotiation is misguided?

What if we instead did content negotiation like this (which — correct me if I'm wrong — would be a mostly ecosystem-backward-compatible change):

- if a resource negotiates, then by default, the server will send a 406 error response for all attempts at retrieving the resource. It sends this because the client itself needs to prove it knows what fields the resource varies on — and, of course, it doesn't know (yet), because nobody's told it yet. This 406 response contains a novel "Should-Vary" response header, informing the client of what it should be sending.

- to actually fetch a resource representation, the client is then expected to make the same request again, but this time, sending an Expect-Vary request header, the value of which matches the Should-Vary header value it saw from the server. Note that unlike with the Vary response header, this Expect-Vary request header should include header names that aren't part of the set of headers it's sending. (And/or, this list should force the client to emit explicit headers with its choice of implicit-default values for any headers listed in its own Expect-Vary header.)

- Upon receiving a request for a resource that negotiates, where the request has the Expect-Vary header set, the server will first verify that the Expect-Vary header value matches the Should-Vary value it would return for the resource, and either matches or is a superset of the Vary value it would compute as the response header given 1. the resource and 2. the rest of the received request. If this verification fails, that's a 406 again, sending Should-Vary again. If the verification passes, and the rest of the HTTP state workflow goes through, then you get a 2XX response. This 2xx response has the old Vary header as part of the response — but it now only exists for ecosystem back-compat.

- If a client thinks it knows the right Expect-Vary header to send, it can try sending it as a request header in the initial request. After all, the worst that can happen is the same 406 error it'd get otherwise. As well, the observed Should-Vary response header value of a resource can be cached basically indefinitely by the browser in its Expect-Vary cache, since the next time it changes for a resource, the browser will try its cached value for Expect-Vary in the request, and get a 406 response that tells it the new Expect-Vary value it should be using instead.

- Optionally, for efficiency, there could be introduced an Others-Should-Vary response header with the value being a path pattern (similar to a Set-Cookie Path field), which specifies other path prefixes for the host that should all be assumed by default to have the same Should-Vary header value as the response does. Potentially, a Should-Vary response header could also be sent in OPTIONS responses, to set a fallback assumed Vary value for the HTTP origin as a whole. (Clients are already requesting OPTIONS for CORS anyway; may as well give them some more useful information while we've got them on the line.)

With this design, middleboxes could safely trust the client's Expect-Vary header and use it to build the cache key — as long as 406 responses aren't cached.
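A toy model of the proposed handshake, to show how the pieces would fit. `Should-Vary` and `Expect-Vary` are the hypothetical headers from this comment and do not exist in HTTP; `handle`, `middlebox_key`, and the `select` callback are likewise invented for the sketch, and request header names are assumed to be lowercased.

```python
def parse_names(value):
    """Parse a comma-separated list of header names into a lowercase set."""
    return frozenset(n.strip().lower() for n in value.split(",") if n.strip())


def handle(req_headers, should_vary, select):
    """Origin-side check for the proposed handshake.

    should_vary: frozenset of header names the resource negotiates on.
    select: hypothetical callable returning (names_actually_used, body).
    """
    challenge = (406, {"Should-Vary": ", ".join(sorted(should_vary))}, b"")
    expected = req_headers.get("expect-vary")
    if expected is None or parse_names(expected) != should_vary:
        return challenge
    used, body = select(req_headers)
    if not used <= should_vary:
        return challenge
    # The legacy Vary header is still emitted for backward compatibility.
    return 200, {"Vary": ", ".join(sorted(used))}, body


def middlebox_key(url, req_headers):
    """Cache key computed from the request alone, trusting Expect-Vary."""
    names = sorted(parse_names(req_headers.get("expect-vary", "")))
    return (url,) + tuple((n, req_headers.get(n, "")) for n in names)
```

In this model the first request gets a 406 carrying `Should-Vary`, the retry echoes that value back as `Expect-Vary`, and a cache in between can build its key without consulting any stored per-resource metadata.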

Something for an RFC, maybe?

zoogeny (OP) · 2y ago

To be clear, I'm not trying to make their argument for them since we spent probably 1 day working around it. I'm just passing along an anecdote. One day, Vary header stopped working on one CDN and we had to fix it. When I spoke to our account rep (I literally had a weekly call with them due to our usage) he said they were phasing it out for performance reasons. Not long after we got notice from another CDN asking for similar consideration. I have no inside knowledge as to their infrastructure or systems that made this a requirement. I very much doubt it was the cost to hash, maybe more likely something to do with their network topology and how requests were routed from origin to regional tiers to PoPs? I'm totally speculating here.

If this had been a necessity then I would have probably dug into the request more deeply. It was a "pick your battles" kind of thing. Extremely low cost on our side to change, no reason to bother if they claimed it would decrease problems on their side.

phanimahesh · 2y ago

The cost of Vary headers is usually not in hashing the keys but in storing multiple entries per URL across an arbitrarily large combination of header values. I can imagine CDNs not wanting the hassle, though I don't like the outcome.
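As a back-of-the-envelope illustration of that storage concern: the worst-case number of entries per URL is the product of the distinct values seen for each varied header. The helper name and the counts below are made-up examples, not measurements.

```python
from math import prod


def worst_case_variants(distinct_values):
    """Upper bound on cache entries per URL, given distinct values per header."""
    return prod(distinct_values.values())


# Illustrative, assumed counts of distinct values observed per header.
example = {"accept-encoding": 3, "accept-language": 40, "user-agent": 5000}
variants = worst_case_variants(example)
```

Here `variants` is 3 × 40 × 5000 = 600000, which is why normalizing or restricting the varied headers matters far more than the cost of hashing them.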

naasking · 2y ago

I'm not sure that tracks. If those variants are used, then eliminating support for Vary means they'll just replace it with new endpoints that return the same thing, so the total number of cache entries remains unchanged.

zoogeny (OP) · 2y ago

It's worth noting that the cost of storage wasn't the issue in this case. They already had a system that allowed you to determine which headers in the Vary list would be respected and so you could calculate a worst-case storage load. I mean, it definitely was an issue in general and we were careful about avoiding the same content being stored multiple times but it wasn't the reasoning they communicated behind the change in the anecdote I related.

I think the best suggestion was in another thread by @johncolanduoni where he pointed out the difficulty of storing, distributing and retrieving the metadata per-resource that would be necessary for each PoP to correctly determine the Vary requirements at request time.

wbl · 2y ago

The problem with Vary is that it massively expands footprints and reduces cache efficiency when overused. In a CDN this can create noisy-neighbor-like issues.
