https://blog.nella.org/2016/01/17/seeking-http/
(Originally written for Advent of Go.)
Tangential, but any Free Software that uses `shared-mime-info` to identify files (GNOME, KDE, etc.) is unable to correctly identify Zip files by their EOCD, because there is no accepted syntax for defining search patterns based on negative file offsets. Please show your support on this issue if you would also like to see this resolved: https://gitlab.freedesktop.org/xdg/shared-mime-info/-/issues... (linking to my own comment, so no, this is not brigading)
Anything using `file(1)` does not have this problem: https://github.com/file/file/blob/280e121/magic/Magdir/zip#L...
[1] https://gildas-lormeau.github.io/zip.js/api/classes/HttpRang...
[2] https://github.com/gildas-lormeau/zip.js/blob/master/tests/a...
1) The format has limited and archaic support for file metadata - e.g. file modification times are stored as a MS-DOS timestamp with a 2-second (!) resolution, and there's no standard system for representing other metadata.
2) The single-level central directory can be awkward to work with for archives containing a very large number of members.
3) Support for 64-bit file sizes exists but is a messy hack.
4) Compression operates on each file as a separate stream, reducing its effectiveness for archives containing many small files. The format does support pluggable compression methods, but there's no straightforward way to support "solid" compression.
5) There is technically no way to reliably identify a ZIP file, as the end of central directory record can appear at any location near the end of the file, and the file can contain arbitrary data at its start. Most tools recognize ZIP files by the presence of a local file header at the start ("PK\x03\x04"), but that's not reliable.
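To make point 5 concrete, here is a rough sketch of how a reader might locate the end of central directory (EOCD) record by scanning backwards from the end of the data. The names `find_eocd` and `make_sample_zip` are made up for illustration, and the check that the comment ends exactly at end-of-file is an assumption of this sketch; real tools are often more lenient about trailing bytes.

```python
import io
import struct
import zipfile

EOCD_SIGNATURE = b"PK\x05\x06"
EOCD_MIN_SIZE = 22          # size of the fixed part of the EOCD record
MAX_COMMENT_SIZE = 0xFFFF   # the comment length is a 16-bit field


def find_eocd(tail: bytes) -> int:
    """Return the offset of the EOCD record inside `tail`, or -1 if none is found.

    `tail` is the whole file or just its last bytes. Assumption of this
    sketch: the archive comment ends exactly at the end of `tail`.
    """
    pos = len(tail) - EOCD_MIN_SIZE
    lowest = max(0, len(tail) - EOCD_MIN_SIZE - MAX_COMMENT_SIZE)
    while pos >= lowest:
        if tail[pos:pos + 4] == EOCD_SIGNATURE:
            # The comment length sits in the last two bytes of the fixed part.
            (comment_len,) = struct.unpack_from("<H", tail, pos + 20)
            if pos + EOCD_MIN_SIZE + comment_len == len(tail):
                return pos
        pos -= 1
    return -1


def make_sample_zip(comment: bytes = b"") -> bytes:
    """Build a tiny in-memory archive to try the scan on (illustration only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("hello.txt", "hello")
        zf.comment = comment
    return buf.getvalue()
```

Because the scan starts from the end, arbitrary data prepended to the archive does not affect it, which is exactly why start-of-file magic is not a reliable test.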
I think the general pattern - using the range header + prior knowledge of a file format to only download the parts of a file that are relevant - is still really underutilized.
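As a minimal sketch of that pattern: ask for only the last `n` bytes with a suffix range, then use `Content-Range` to learn where that chunk sits in the full file. The function names here are made up for illustration, and the code assumes the server answers a single-range request with 206 and a `Content-Range` header.

```python
import re
import urllib.request

_CONTENT_RANGE = re.compile(r"bytes (\d+)-(\d+)/(\d+|\*)")


def suffix_range(n: int) -> str:
    """Range header value asking for the last `n` bytes of a resource."""
    if n <= 0:
        raise ValueError("n must be positive")
    return f"bytes=-{n}"


def parse_content_range(value: str):
    """Parse 'bytes START-END/TOTAL' into (start, end, total).

    `total` is None when the server reports it as '*'.
    """
    m = _CONTENT_RANGE.fullmatch(value.strip())
    if m is None:
        raise ValueError(f"unexpected Content-Range: {value!r}")
    total = None if m.group(3) == "*" else int(m.group(3))
    return int(m.group(1)), int(m.group(2)), total


def fetch_tail(url: str, n: int):
    """Fetch the last `n` bytes of `url`; returns (data, start_offset, total_size)."""
    req = urllib.request.Request(url, headers={"Range": suffix_range(n)})
    with urllib.request.urlopen(req) as resp:
        if resp.status != 206:
            # A 200 here means the Range header was ignored; bail out
            # before reading what would be the whole file.
            raise RuntimeError(f"expected 206 Partial Content, got {resp.status}")
        start, _end, total = parse_content_range(resp.headers.get("Content-Range", ""))
        return resp.read(), start, total
```

For a ZIP file, the tail returned by `fetch_tail` would be where you look for the central directory, and `start_offset` lets you translate positions in the tail back to absolute file offsets.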
One small problem I see is that a server that does not support range requests would just try to send you the entire file in the first request, I think.
So maybe doing a preflight HEAD request first to see if the server sends back Accept-Ranges could be useful.
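Something like the following could do that preflight. It is only a sketch: `advertises_byte_ranges` and `probe` are made-up names, and a missing `Accept-Ranges` header does not prove that ranges are unsupported, since servers are not required to advertise it.

```python
import urllib.request


def advertises_byte_ranges(headers) -> bool:
    """True if the headers advertise byte ranges via Accept-Ranges.

    `headers` is anything with a dict-like .get(); note that a plain dict
    is case-sensitive, while urllib's response headers are not.
    """
    value = headers.get("Accept-Ranges", "")
    units = [unit.strip().lower() for unit in value.split(",")]
    return "bytes" in units


def probe(url: str) -> bool:
    """Send a HEAD request and report whether byte ranges are advertised."""
    req = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(req) as resp:
        return advertises_byte_ranges(resp.headers)
```

Even after a positive probe, it is probably still worth checking for a 206 status on the actual range request, since something between you and the origin may handle GET differently from HEAD.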
https://developer.mozilla.org/en-US/docs/Web/HTTP/Guides/Ran...
For static files served by CDNs or "established" HTTP servers I think support is pretty much a given (though e.g. Python's FastAPI only got support in 2020 [1]), but for anything dynamic, I doubt many devs would go to the trouble of implementing support if it wasn't strictly necessary for their use case.
E.g. the URL may point to a service endpoint that loads the file contents from a database or blob storage instead of the file system. Then the service would have to implement range support itself and translate range requests into the necessary storage/database calls (if those exist), etc. That's some effort you have to put in.
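To give a feel for that effort, here is a rough sketch of the translation step for a single byte range. `resolve_range` and `serve` are hypothetical names, an in-memory `bytes` object stands in for the database or blob store, and as a simplification the sketch answers 416 for anything it cannot satisfy (including multi-range requests and malformed headers, which a real server might prefer to ignore and answer with a plain 200).

```python
import re

_RANGE = re.compile(r"bytes=(\d*)-(\d*)")


def resolve_range(header: str, size: int):
    """Map a single-range Range header to inclusive (start, end) offsets.

    Returns None when the range cannot be satisfied for a resource of
    `size` bytes or is not a single byte range.
    """
    m = _RANGE.fullmatch(header.strip())
    if m is None:
        return None
    first, last = m.groups()
    if first == "" and last == "":
        return None
    if first == "":
        # Suffix form "bytes=-N": the last N bytes.
        length = int(last)
        if length == 0 or size == 0:
            return None
        return max(0, size - length), size - 1
    start = int(first)
    if start >= size:
        return None
    end = size - 1 if last == "" else min(int(last), size - 1)
    if end < start:
        return None
    return start, end


def serve(blob: bytes, range_header=None):
    """Return (status, headers, body) for a request against `blob`."""
    size = len(blob)
    if range_header is None:
        return 200, {"Content-Length": str(size)}, blob
    span = resolve_range(range_header, size)
    if span is None:
        return 416, {"Content-Range": f"bytes */{size}"}, b""
    start, end = span
    # With real storage, this slice would become a ranged read call.
    body = blob[start:end + 1]
    headers = {
        "Content-Range": f"bytes {start}-{end}/{size}",
        "Content-Length": str(len(body)),
    }
    return 206, headers, body
```

The slicing itself is trivial; the real work is that the storage backend needs an equivalent "read `length` bytes at `offset`" call, otherwise the service ends up loading the whole object anyway.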
Even for static files, there may be reverse proxies in front that (unintentionally) remove the support again. E.g. [2]
[1] https://github.com/Kludex/starlette/issues/950
[2] https://caddy.community/t/cannot-seek-further-in-videos-usin...
I'll dig up a link.