zstd is Pareto-better than brotli: it compresses better and faster.
brotli 1.0.7 args: -q 11 -w 24
zstd v1.5.0 args: --ultra -22 --long=31
file | original | zstd | brotli |
RandomBook.pdf | 15M | 4.6M | 4.5M |
Invoice.pdf | 19.3K | 16.3K | 16.1K |
I made a table because I wanted to test more files, but almost all the PDFs I had downloaded or stored locally were already compressed, and I couldn't quickly find a way to decompress them.
Brotli seemed to have a very slight edge over zstd, even on the larger PDF, which I did not expect.
I did my own testing where Brotli also ended up better than ZSTD: https://news.ycombinator.com/item?id=46722044
Results by compression type across 55 PDFs:
+------+------+-----+------+--------+
| none | zstd | xz | gzip | brotli |
+------|------|-----|------|--------|
| 47M | 45M | 39M | 38M | 37M |
+------+------+-----+------+--------+
Here's a table with the correct sizes, reported by 'du -A' (which shows the apparent size):
+---------+---------+--------+--------+--------+
| none | zstd | xz | gzip | brotli |
+---------|---------|--------|--------|--------|
| 47.81M | 37.92M | 37.96M | 38.80M | 37.06M |
+---------+---------+--------+--------+--------+
These numbers are much more impressive. Still, Brotli has a slight edge.
Also, worth testing zopfli since its decompression is gzip compatible.
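For anyone who wants to reproduce this kind of size comparison, here is a minimal Python sketch. It only uses codecs from the standard library (gzip and xz). zstd and brotli need third-party bindings, which are assumed rather than shown, and could be added to the dictionary as extra callables. The names `COMPRESSORS` and `compressed_sizes` are made up for illustration and are not taken from any of the tests in this thread.

```python
import gzip
import lzma
import sys
from typing import Callable, Dict

# Each entry maps a label to a function that takes raw bytes and returns
# compressed bytes. Only standard-library codecs are listed here; a zstd or
# brotli entry would need a third-party package (an assumption, not shown).
COMPRESSORS: Dict[str, Callable[[bytes], bytes]] = {
    "gzip": lambda data: gzip.compress(data, compresslevel=9),
    "xz": lambda data: lzma.compress(data, preset=9),
}


def compressed_sizes(data: bytes) -> Dict[str, int]:
    """Return the size in bytes of `data` under each configured compressor."""
    sizes = {"none": len(data)}
    for name, compress in COMPRESSORS.items():
        sizes[name] = len(compress(data))
    return sizes


if __name__ == "__main__":
    # Usage: python sizes.py a.pdf b.pdf ...
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            print(path, compressed_sizes(f.read()))
```

Measuring `len()` of the compressed bytes gives the apparent size directly, so the result does not depend on filesystem block rounding.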
pdftk in.pdf output out.pdf decompress
file | raw | zstd (%) | brotli (%) |
gawk.pdf | 8.068.092 | 1.437.529 (17.8%) | 1.376.106 (17.1%) |
shannon.pdf | 335.009 | 68.739 (20.5%) | 65.978 (19.6%) |
attention.pdf | 24.742.418 | 367.367 (1.4%) | 362.578 (1.4%) |
learnopengl.pdf | 253.041.425 | 37.756.229 (14.9%) | 35.223.532 (13.9%) |
For learnopengl.pdf I also tested the decompression performance, since it is such a large file, and got the following (less surprising) results using 'perf stat -r 5':
zstd: 0.4532 +- 0.0216 seconds time elapsed ( +- 4.77% )
brotli: 0.7641 +- 0.0242 seconds time elapsed ( +- 3.17% )
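To put the two timings side by side, here is a small Python sketch that turns them into a relative speed. The helper name `relative_speed` is hypothetical, and the calculation uses only the mean elapsed times, ignoring the reported run-to-run variation.

```python
# Mean elapsed times in seconds, copied from the 'perf stat -r 5' output.
zstd_seconds = 0.4532
brotli_seconds = 0.7641


def relative_speed(fast_seconds: float, slow_seconds: float) -> float:
    """Return the slower tool's speed as a fraction of the faster tool's speed."""
    return fast_seconds / slow_seconds


brotli_relative_speed = relative_speed(zstd_seconds, brotli_seconds)
print(f"brotli decompresses at {brotli_relative_speed:.0%} of zstd's speed")
print(f"brotli takes {brotli_seconds / zstd_seconds:.2f}x as long as zstd")
```

On these means, brotli comes out at roughly 59% of zstd's decompression speed for this one file.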
The conclusion seems to be consistent with what brotli's authors have said: brotli achieves slightly better compression, at the cost of a little over half the decompression speed.
that data in pdf files are noisy and zstd should perform better on noisy files?
The pdfs you have are already compressed with deflate (zip).
If brotli has a different advantage on small source files, you have my curiosity.
If you're talking about max compression, zstd likely loses out there. The answer seems to vary based on the tests I look at, but zstd seems to be better across a very wide range.
I don’t think you’re using that correctly.