Since it's not clear to me what the consequences for the modeling are here, I'd put this in category C. If lots of people start using it, it could move to B. The ideal would still be A, though.
They allow you to get away with much smaller word sizes than would otherwise be possible while preserving dynamic range (and precision!). This is what the "word size"/tapering discussion in the blog post is about. It's what makes 8-bit floating point work here as a drop-in replacement via round-to-nearest-even. Without either the Kulisch accumulator or the entropy coding, you have to change significantly more to get 8-bit FP to work, since you're forced into much harsher tradeoffs between precision and dynamic range.
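To make the precision/dynamic-range tradeoff concrete, here's a small sketch of my own (not a format from the paper): decoding an IEEE-style 8-bit float with a configurable exponent/fraction split, with no bit patterns reserved for inf/NaN. Shifting one bit from the fraction to the exponent buys orders of magnitude of range at the cost of halving the precision.

```python
def decode_minifloat(bits: int, exp_bits: int, frac_bits: int) -> float:
    """Decode an 8-bit value laid out as sign | exponent | fraction.

    IEEE-style with subnormals; no inf/NaN encodings (an assumption of
    this sketch, not a claim about any real 8-bit format).
    """
    assert 1 + exp_bits + frac_bits == 8
    sign = -1.0 if (bits >> 7) & 1 else 1.0
    exp = (bits >> frac_bits) & ((1 << exp_bits) - 1)
    frac = bits & ((1 << frac_bits) - 1)
    bias = (1 << (exp_bits - 1)) - 1
    if exp == 0:  # subnormal: no implicit leading 1
        return sign * (frac / (1 << frac_bits)) * 2.0 ** (1 - bias)
    return sign * (1 + frac / (1 << frac_bits)) * 2.0 ** (exp - bias)

def dynamic_range(exp_bits: int, frac_bits: int):
    """Smallest and largest positive *normal* values for a given split."""
    bias = (1 << (exp_bits - 1)) - 1
    max_exp = (1 << exp_bits) - 1  # usable since nothing is reserved here
    largest = (2 - 2.0 ** -frac_bits) * 2.0 ** (max_exp - bias)
    smallest = 2.0 ** (1 - bias)
    return smallest, largest

# More exponent bits buy range, fewer buy precision:
for e, m in [(4, 3), (5, 2)]:
    lo, hi = dynamic_range(e, m)
    print(f"e{e}m{m}: normals span [{lo:g}, {hi:g}], "
          f"{m} fraction bits of precision")
```

With a fixed split you only get to pick one point on this curve for the whole format; tapered/entropy-coded exponents instead spend fewer bits on exponents near 1.0 and more on extreme ones.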
"Users of floating point are seldom concerned simultaneously with loss of accuracy and with overflow" (or underflow, for that matter) [1]
The paper and blog post consider 4-5 different techniques, not all of which need be combined, and some of which can be considered completely independently. The paper is a little gimmicky in that I combine all of them, but that need not be the case.
(the log significand/fraction map (LNS), posit/Huffman/other entropy encoding, Kulisch accumulation, and the ELMA hybrid log/linear multiply-add as a replacement for pure log-domain arithmetic)
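Of these, Kulisch accumulation is the easiest to demonstrate in isolation. A rough software sketch of the idea (my illustration, not the paper's hardware design): every finite double is an integer multiple of 2^-1074, so any product of two doubles is an integer multiple of 2^-2148, and a wide fixed-point integer accumulator at that scale can sum such products with no rounding at all. Rounding happens exactly once, at the end.

```python
from fractions import Fraction

# Fixed-point scale: a product of two binary64 values is always an
# integer multiple of 2**-2148, so an integer accumulator at this scale
# is exact. (Real Kulisch hardware bounds the width similarly; Python's
# big ints just grow as needed in this sketch.)
SCALE = 2148

def kulisch_dot(xs, ys) -> float:
    acc = 0  # exact fixed-point accumulator
    for x, y in zip(xs, ys):
        # Fraction(float) is exact, so the product below is exact, and
        # its denominator divides 2**SCALE, so int() loses nothing.
        acc += int(Fraction(x) * Fraction(y) * (1 << SCALE))
    # Round once at the end (Fraction -> float rounds to nearest even).
    return float(Fraction(acc, 1 << SCALE))

xs = [1e16, 1.0, -1e16]
ys = [1.0, 1.0, 1.0]
naive = sum(x * y for x, y in zip(xs, ys))  # the 1.0 is absorbed: 0.0
exact = kulisch_dot(xs, ys)                 # no intermediate rounding: 1.0
print(naive, exact)
```

This is why it pairs so well with tiny formats: the individual 8-bit values can be coarse, but sums of their products never lose anything until the final conversion back.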
[1] Morris, Tapered floating point: a new floating point representation (1971) https://ieeexplore.ieee.org/abstract/document/1671767