Q^mSat,3^b:d+s+E,4Fri,3^u:h+k+u,6Thu,3^P:j+
If you are effectively going binary, do it. Use CBOR or Protobuf or any of a dozen other binary serializations that would be far more efficient. The author claims this is because of copy and pasting… cool, remind me what BASE64 is again?
Being able to copy/paste a serialization format is not really a feature I think I would care about.
Another thing: I put in a 400KB JSON and the REXC is 250KB, cool, but ideally the viewer should also tell me the compressed sizes, because that same JSON is 65KB after zstd, and I have no idea how well your REXC will compress.
Edit: I think I figured out that you can right-click "copy as REXC" on the top object in the viewer to get an output. I compressed it, and the same document that was 65KB as compressed JSON came out at 110KB, so this is not great... 2x the size of JSON after compression.
Unless, to read that correctly, it only has a text encoding as long as you can guarantee you don't have any unicode?
No human is reading much data regardless of the format.
What is the benefit over using for example BSON?
(Or to avoid using cat to read, whatever2json file.whatever | jq)
What might be interesting is to have a tool that processes full JSON data and creates a b-tree index on specified keys. Then you could run searches against the index that return byte offsets you can use for actual random access on the original JSON.
OTOH, this is basically just recreating a database, just using raw JSON as its storage format.
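To make the index idea above concrete, here is a minimal sketch, not anything RX or an existing tool does. It assumes the document is a top-level JSON array of objects, uses a sorted list with binary search as a stand-in for a real b-tree, and records string offsets (which equal byte offsets only for ASCII input). The names `build_offset_index` and `lookup` are made up for illustration.

```python
import bisect
import json


def build_offset_index(text, key):
    """Scan a JSON array of objects once; return sorted (key_value, start, end) entries."""
    decoder = json.JSONDecoder()
    entries = []
    pos = text.index("[") + 1
    while True:
        # Skip whitespace and commas between array elements.
        while text[pos] in " \t\r\n,":
            pos += 1
        if text[pos] == "]":
            break
        obj, end = decoder.raw_decode(text, pos)
        entries.append((obj[key], pos, end))
        pos = end
    entries.sort()  # sorted list stands in for a b-tree
    return entries


def lookup(text, index, value):
    """Binary-search the index, then parse only the matching slice of the original text."""
    i = bisect.bisect_left(index, (value,))
    if i < len(index) and index[i][0] == value:
        _, start, end = index[i]
        return json.loads(text[start:end])
    return None


doc = '[{"id": "b", "n": 2}, {"id": "a", "n": 1}, {"id": "c", "n": 3}]'
index = build_offset_index(doc, "id")
print(lookup(doc, index, "c"))
```

Building the index still costs one full parse, so this only pays off when the same file is queried many times, which is the "recreating a database" point.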
This lib keeps the compact representation at runtime and lets you read it without putting all the entities on the heap.
Cool!
It falls down if you have e.g. an array of 1 million small items, because you still need to skip over 999999 items to get to the last one. It looks like RX adds some support for indexes to improve that.
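A rough sketch of why that hurts and what an index buys you. The layout here (a 4-byte length prefix per item) is an assumption for illustration only; it is not RX's actual encoding, and the function names are made up.

```python
import struct


def pack_items(items):
    """Concatenate items, each preceded by a 4-byte little-endian length."""
    return b"".join(struct.pack("<I", len(it)) + it for it in items)


def get_by_skipping(buf, i):
    """Reach item i by hopping over the i items before it: O(i) steps."""
    pos = 0
    for _ in range(i):
        (length,) = struct.unpack_from("<I", buf, pos)
        pos += 4 + length
    (length,) = struct.unpack_from("<I", buf, pos)
    return buf[pos + 4 : pos + 4 + length]


def build_offsets(buf):
    """One pass over the buffer to record where each item starts."""
    offsets, pos = [], 0
    while pos < len(buf):
        offsets.append(pos)
        (length,) = struct.unpack_from("<I", buf, pos)
        pos += 4 + length
    return offsets


def get_by_index(buf, offsets, i):
    """Reach item i with a single jump through the offset table: O(1)."""
    pos = offsets[i]
    (length,) = struct.unpack_from("<I", buf, pos)
    return buf[pos + 4 : pos + 4 + length]
```

With a million small items, `get_by_skipping(buf, 999_999)` reads 999,999 length prefixes first, while `get_by_index` reads one table entry. The cost is storing the table alongside the data.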
I was in this situation where we needed to sparsely read huge JSON files. In the end we just switched to SQLite which handles all that perfectly. I'd probably still use it over RX, even though there's a somewhat awkward impedance mismatch between SQL and structs.
It's not like you can just tell them to move to protobuf.
If you are working with an end you don’t control, this “newer better” format isn’t in your cards either.
Think of it as a hybrid between JSON, SQLite, and generic compression. This format really excels for use cases where large read-only build artifacts are queried by worker nodes like an embedded database.
My one eyebrow raise is - is there no binary format specification? https://github.com/creationix/rx/blob/main/rx.ts#L1109 is pretty well commented, but you can't call it a JSON alternative without having some kind of equivalent to https://www.json.org/ in all its flowchart glory!
One old version that is meant to be more human readable/writable is jsonito
https://github.com/creationix/jsonito
I'll add similar diagrams and docs for the format itself here.
https://github.com/creationix/rx/blob/main/docs/rx-format.md
Railroad diagrams will come later when I have more time.
Even a technically superior format struggles without that ecosystem.
JSON has `null` values with string keys, but Lua doesn't have `null`. It has `nil`, but you can't have a key with a nil value. Setting nil deletes the key.
Lua tables are unordered. But JS and JSON are often ordered and order often matters.
RX, however, matches Lua/LuaJIT extremely well and should out-perform the JS Proxy based decoder using metatables. Since it's using metatables anyway due to the lazy parsing, it's trivial to do things like preserve order when calling `pairs` and `ipairs` and even include keys with associated null values.
You can round trip safely in Lua, which is not easy with most JSON implementations.
Sample output:
'fdiscovered,aextreme,7danger,6+1A+16;6level_range,b:QThe Heap ,d'th
Human unreadable, ascii output. Line up and get yours today!
At this point, probably, we have to think how to classify all the "JSON alternatives" cause it gets difficult to remember them all.
Is RX a subset, a superset or bijective to JSON?
I could technically add binary to the format, but then it would lose the nice copy-paste property. But with the byte-aware length prefixes, it would just work otherwise.
This did catch my eye, however: https://github.com/creationix/rx?tab=readme-ov-file#proxy-be...
While this is a neat feature, this means it is not in fact a drop-in replacement for JSON.parse, as you will be breaking any code that relies on that result being a mutable object.
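A loose Python analogy of the breakage, not the RX library's API: `json.loads` plays the role of `JSON.parse`, and `types.MappingProxyType` stands in for a read-only proxy result. The function names are hypothetical.

```python
import json
from types import MappingProxyType


def parse_mutable(text):
    """Ordinary parse: the caller owns a plain, mutable dict."""
    return json.loads(text)


def parse_read_only(text):
    """Stand-in for a proxy-backed decoder: the caller gets a read-only view."""
    return MappingProxyType(json.loads(text))


def bump_version(config):
    """Caller code written against the mutable result."""
    config["version"] += 1
    return config["version"]


print(bump_version(parse_mutable('{"version": 1}')))

try:
    bump_version(parse_read_only('{"version": 1}'))
except TypeError as exc:
    print("read-only result broke the caller:", exc)
```

Reads behave the same in both cases, so the difference only shows up in code paths that mutate the parsed value.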
XML has EXI (Efficient XML Interchange) for precisely the reason of getting wins over the wire but keeping the nice human readable format at the ends.
EXI looks useful. Now I just wish there was a renderer in the pugjs format, as I find that terse format much more readable than verbose XML. I also find that indentation-based syntax makes hierarchical structure easier to parse visually.
Does this duplicate the name of keys? Say if you have a thousand plain objects in an array, each with a "version" key, would the string "version" be duplicated a thousand times?
Another project a lot of people aren't aware of even though they've benefitted from it indirectly is the binary format for OpenStreetMap. It allows reading the data without loading a lot of it into memory, and is a lot faster than using sqlite would be.
Edit: the rust library I remember may have been https://rkyv.org/
Yes, the format allows for objects to be stored with a pointer to a shared schema (either an array of keys or another object that has the desired keys)
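To illustrate the shared-schema idea in general terms, here is a small sketch that stores each distinct key list once and has every row point at it by id. This uses plain JSON containers and is only an approximation of the concept; RX's real on-disk layout is different, and the function names are made up.

```python
import json


def encode_with_shared_keys(rows):
    """Store each distinct key tuple once; rows carry a schema id plus their values."""
    schemas, schema_ids, packed = [], {}, []
    for row in rows:
        keys = tuple(row)
        if keys not in schema_ids:
            schema_ids[keys] = len(schemas)
            schemas.append(list(keys))
        packed.append([schema_ids[keys], *row.values()])
    return {"schemas": schemas, "rows": packed}


def decode_with_shared_keys(doc):
    """Rebuild the original objects by zipping each row with its schema's keys."""
    return [dict(zip(doc["schemas"][sid], values)) for sid, *values in doc["rows"]]


rows = [{"version": i, "name": "pkg"} for i in range(1000)]
plain = json.dumps(rows)
shared = json.dumps(encode_with_shared_keys(rows))
print(plain.count('"version"'), shared.count('"version"'))
print(len(plain), len(shared))
```

In this sketch the string "version" appears once in the shared encoding instead of a thousand times, and the rows still round-trip to the original objects.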
The current implementation is pretty close to ideal when deciding to use this encoding.
Once you get the computational complexity advantage, you can make it as many times faster as you want just by growing the input. In these cases small instances matter for judging the constants, and for the average (mean?) user, mean instance sizes matter.
I'm not sure how to sell the advantage succinctly though. Maybe just focus on "real-world" scenarios, but there's no footnote with details on the comparison
The benchmark measures (or is supposed to measure) end-to-end parse + lookup.
JSON: 92 MB RX: 5.1 MB
Request-path lookup: ~47,000x faster
Time to decode a manifest and look up one URL path:
JSON: 69 ms REXC: 0.003 ms
Heap allocations: 2.6 million vs. 1
JSON: 2,598,384 REXC: 1 (the returned string)
sick is binary, rx is textual (this matters for tooling)
sick has size limits (65534 max keys, for example; I have real-world rx datasets reaching this size already). rx uses arbitrary-precision variable-length b64 integers. There are no size limits anywhere inherent in the format, just in implementations.
sick does not preserve object key order. rx preserves object key order, but still implements O(log2 N) lookups for object keys.
etc.
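Two of the points above can be sketched in a few lines. Both parts are illustrations under assumptions, not RX's actual implementation: the digit alphabet and its order are guesses, and the sorted side index is just one possible way to combine preserved key order with O(log2 N) lookup. All names are hypothetical.

```python
import bisect

# Assumed 64-character digit alphabet; the real RX alphabet and order may differ.
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ-_"


def encode_b64_int(n):
    """Encode a non-negative integer as base-64 digits, using as many as needed."""
    if n < 0:
        raise ValueError("only non-negative integers are handled in this sketch")
    digits = []
    while True:
        n, r = divmod(n, 64)
        digits.append(ALPHABET[r])
        if n == 0:
            break
    return "".join(reversed(digits))


def decode_b64_int(s):
    """Inverse of encode_b64_int; Python ints have no fixed width, so nothing overflows."""
    n = 0
    for ch in s:
        n = n * 64 + ALPHABET.index(ch)
    return n


class OrderedLookup:
    """Keeps pairs in written order, plus a sorted side index for binary search."""

    def __init__(self, pairs):
        self.pairs = list(pairs)
        order = sorted(range(len(self.pairs)), key=lambda i: self.pairs[i][0])
        self.sorted_pos = order
        self.sorted_keys = [self.pairs[i][0] for i in order]

    def keys(self):
        """Keys in their original order."""
        return [k for k, _ in self.pairs]

    def get(self, key):
        """O(log N) lookup through the sorted index."""
        j = bisect.bisect_left(self.sorted_keys, key)
        if j < len(self.sorted_keys) and self.sorted_keys[j] == key:
            return self.pairs[self.sorted_pos[j]][1]
        raise KeyError(key)


print(encode_b64_int(65534), decode_b64_int(encode_b64_int(2**200)) == 2**200)
obj = OrderedLookup([("zeta", 1), ("alpha", 2), ("mid", 3)])
print(obj.keys(), obj.get("alpha"))
```

The integer encoding grows one digit per factor of 64, which is why a fixed cap like 65534 never appears in the format itself, only in whatever integer type an implementation decodes into.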
Why is it called RX?
Maybe a better framing would be no-sql sqlite?
Douglas Crockford didn't design it — he said he "discovered" it. It was already there in JavaScript's object literal syntax, which itself traces back to Brendan Eich's 10-day sprint in 1995.
A data format that conquered the internet was a side effect of a language built under absurd time pressure.
Every attempt to replace it has to overcome that kind of accidental ubiquity, which is much harder than overcoming a technical limitation.
Docs are super unclear.
Is it versioned? Or does it need to be..
The viewer is cool, took me a while to find the link to it though, maybe add a link in the readme next to the screenshot.