This CBOR format is being proposed by the VPN Consortium - presumably there's some specific VPN interoperability application they have in mind for this. In the meantime, everybody else will continue to use compressed JSON, or protocol buffers, or whatever other standards have good library support and interoperability and - crucially - adoption in their domain.
-a lot of the time, a dearth of implementations of a new Thing is not because the new Thing is bad, but simply because people are change-averse and lazy, even in the face of an objectively better Thing, and
-I still consider this a quality submission; even if CBOR doesn't get adopted it's still neat to read. It's like watching one's government draft new legislation, except more relevant.
The length field for compound types (arrays and maps) specifies the length in "the number of items", not in bytes. This means that while processing, if I need to skip a compound type, I actually need to process it in its entirety. Not very "small device" friendly.
In practice, I have found far more utility in knowing the byte-length of a compound field in advance than the number of items it contains. If I am interested in the field, I am going to find out the number of items anyway, because I am going to process it. If I am not interested in the field, the number of items is useless to me, but the byte-length would have come in handy.
That seems like something that's going to come back and byte us.
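To make the complaint concrete, here's a rough sketch of a skip routine (my own toy code, definite lengths only, no tags). Strings can be skipped in one step because their heads carry a byte count, while arrays and maps force a full recursive walk:

```python
def read_head(buf, pos):
    """Decode a CBOR initial byte plus its argument; return (major, value, new_pos)."""
    ib = buf[pos]
    major, ai = ib >> 5, ib & 0x1f
    pos += 1
    if ai < 24:                       # argument packed into the initial byte
        return major, ai, pos
    n = {24: 1, 25: 2, 26: 4, 27: 8}[ai]
    return major, int.from_bytes(buf[pos:pos + n], "big"), pos + n

def skip_item(buf, pos):
    """Skip one data item, returning the offset just past it.
    Because array/map heads give item *counts*, not byte lengths,
    skipping means walking every nested item."""
    major, value, pos = read_head(buf, pos)
    if major in (2, 3):               # byte/text string: length IS in bytes
        return pos + value
    if major == 4:                    # array: must walk 'value' nested items
        for _ in range(value):
            pos = skip_item(buf, pos)
        return pos
    if major == 5:                    # map: must walk value key/value pairs
        for _ in range(2 * value):
            pos = skip_item(buf, pos)
        return pos
    return pos                        # ints, simple values: head only
```

Skipping `[1, [2, 3], "hi"]` (8 bytes on the wire) touches every one of those bytes, whereas a byte-length head would have let you jump straight past it.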
I'm not sure I understand the problem you describe, really.
Even if there are strings, just encode their lengths; and if you store a compound type, write the byte size whenever the size can vary.
Which is strange for a thing calling itself 'concise'.
[BSON] is a data format that was developed for the storage of JSON-
like maps (JSON objects) in the MongoDB database. Its major
distinguishing feature is the capability for in-place update,
foregoing a compact representation. BSON uses a counted
representation except for map keys, which are null-byte terminated.
While BSON can be used for the representation of JSON-like objects on
the wire, its specification is dominated by the requirements of the
database application and has become somewhat baroque. The status of
how BSON extensions will be implemented remains unclear.

I'm a little bit tired (well, more than a little tired) of standards that aren't couched in terms that are directly executable. English descriptions and pseudo-code are fine, but in the end I want to have some working code that implements an API for the stuff. It doesn't have to be an official API, but something usable shows me that (a) it is indeed usable, and (b) it will go a long way towards heading off other people's mistakes.
We don't do crypto without test vectors. I don't know why we think we can do other complex standards without test vectors, either. (I worked on NBS / NIST in the 70s on some verification suites. Have we lost that practice?)
I think that much of what is busted on the modern web can be traced back to loose English and a lack of reference code (even stuff with placeholders). CSS, HTML, etc., I'm looking at you... :-/
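For what it's worth, RFC 7049 does ship examples in Appendix A, and they're exactly what makes a toy encoder checkable. A minimal sketch (my own code, unsigned ints, text strings, and arrays only), verified against a few of those vectors:

```python
def cbor_uint(n):
    """Preferred (shortest) encoding of an unsigned integer, major type 0."""
    if n < 24:
        return bytes([n])
    for ai, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if n < 1 << (8 * size):
            return bytes([ai]) + n.to_bytes(size, "big")

def cbor_text(s):
    """Text string, major type 3: reuse the uint head and OR in the type bits."""
    b = s.encode("utf-8")
    head = cbor_uint(len(b))
    return bytes([0x60 | head[0]]) + head[1:] + b

def cbor_array(items):
    """Array of already-encoded items, major type 4."""
    head = cbor_uint(len(items))
    return bytes([0x80 | head[0]]) + head[1:] + b"".join(items)
```

Checking `cbor_uint(100) == 0x1864`, `cbor_text("IETF") == 0x6449455446`, and `cbor_array` of 1, 2, 3 against `0x83010203` (all from Appendix A) is a five-line test, which is roughly the point: test vectors make the "is my reading of the spec right?" question mechanical.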
Why? I can see the advantages of either one, but I don't see what having both gets you.
In my experience the implementation advantages of having length-prefixed lists disappear if you have to support indefinite lengths anyway.
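Concretely (a toy decoder sketch, my own naming): once you support the indefinite case, the definite case is just a second loop over the same per-item decoder, so length-prefixing buys the decoder very little:

```python
def decode_array(buf, pos, decode_item):
    """Decode an array head at pos. 'decode_item' is any callable with the
    shape (item, new_pos) = decode_item(buf, pos) for one nested item."""
    ib = buf[pos]
    assert ib >> 5 == 4, "not an array head"
    ai = ib & 0x1f
    pos += 1
    items = []
    if ai == 31:                              # indefinite: read until 0xff break
        while buf[pos] != 0xff:
            item, pos = decode_item(buf, pos)
            items.append(item)
        return items, pos + 1                 # consume the break byte
    if ai >= 24:                              # multi-byte count argument
        n = 1 << (ai - 24)
        count = int.from_bytes(buf[pos:pos + n], "big")
        pos += n
    else:
        count = ai
    for _ in range(count):                    # definite: counted loop
        item, pos = decode_item(buf, pos)
        items.append(item)
    return items, pos

def tiny_uint(buf, pos):
    """Toy item decoder: unsigned ints below 24 only."""
    return buf[pos], pos + 1
```

Both `83 01 02 03` (definite) and `9f 01 02 03 ff` (indefinite) come out as `[1, 2, 3]`; the only structural difference between the two code paths is "loop N times" versus "loop until break".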
- Passing small messages around
- Doing streaming of large content (occasionally)
I'm probably doing these over different pipes, but the data shares a lot of the same characteristics and I don't want to use two totally different APIs to get the job done.
"Large" can be "I need to transfer something on the order of megabytes using a 4K intermediate buffer."
The lack of the string "UUID" in the RFC is also cause for concern.
But, most importantly, using integers for datetime values hides type-level semantics. It's just integers, and you, the end user, not the deserializer, are responsible for handling the types.
I think it's quite inconvenient to do tons of `data["since"] = parse_datetime(data["since"])` all the time, for every model out there.
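For comparison, CBOR's tag 1 (epoch-based date/time) keeps that semantic with a one-byte prefix, so a tag-aware decoder can hand back a datetime instead of a bare integer. A toy sketch (my own code, happy path only), using the RFC 7049 example timestamp:

```python
from datetime import datetime, timezone

def encode_uint(n):
    """Major type 0, shortest form (sketch)."""
    if n < 24:
        return bytes([n])
    for ai, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if n < 1 << (8 * size):
            return bytes([ai]) + n.to_bytes(size, "big")

def encode_epoch(dt):
    """Tag 1 (0xc1) wrapping an unsigned epoch-seconds integer."""
    return b"\xc1" + encode_uint(int(dt.timestamp()))

def decode(buf):
    """Toy decoder: unsigned ints, plus tag 1 restored as a datetime."""
    if buf[0] == 0xc1:                    # tag 1: payload means epoch seconds
        return datetime.fromtimestamp(decode(buf[1:]), tz=timezone.utc)
    ai = buf[0] & 0x1f
    if ai < 24:
        return ai
    n = 1 << (ai - 24)
    return int.from_bytes(buf[1:1 + n], "big")
```

The `parse_datetime` bookkeeping moves into the codec once and for all, which is exactly what a bare-integer convention can't give you.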
Of course, in real-world implementations, the encoder and the decoder will have a shared view of what should be in a CBOR data item. For example, an agreed-to format might be "the item is an array whose first value is a UTF-8 string, second value is an integer, and subsequent values are zero or more floating-point numbers" or "the item is a map that has byte strings for keys and contains at least one pair whose key is 0xab01".
7 is 7 whether it's uint_8 or uint_32, right?
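Right, the argument width is pure encoding, not type. A quick check with a toy decoder (my own sketch): the same 7 in four different widths decodes identically, though the RFC's preferred serialization would only ever emit the shortest form:

```python
def decode_uint(buf):
    """Decode a major-type-0 unsigned integer of any argument width."""
    ai = buf[0] & 0x1f
    if ai < 24:
        return ai                  # value packed into the initial byte
    n = 1 << (ai - 24)             # 24 -> 1, 25 -> 2, 26 -> 4, 27 -> 8 bytes
    return int.from_bytes(buf[1:1 + n], "big")

# The number 7 in four widths: packed, 1-byte, 2-byte, and 4-byte arguments.
encodings = ["07", "1807", "190007", "1a00000007"]
values = [decode_uint(bytes.fromhex(h)) for h in encodings]
```

All four come back as the Python int 7; the wire width never leaks into the decoded value.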
For constrained
applications, where there is a choice between representing a specific
number as an integer and as a decimal fraction or bigfloat (such as
when the exponent is small and non-negative), there is a quality-of-
implementation expectation that the integer representation is used
directly.

I would like to see how this compares to other formats with respect to serialised size...
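Here's one crude data point against compact JSON (my own toy encoder, unsigned ints, strings, lists, and dicts only); obviously a real comparison needs a real corpus:

```python
import json

def head(major, n):
    """CBOR head: major type plus shortest length/value argument."""
    if n < 24:
        return bytes([major << 5 | n])
    for ai, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if n < 1 << (8 * size):
            return bytes([major << 5 | ai]) + n.to_bytes(size, "big")

def enc(x):
    """Very small CBOR encoder sketch."""
    if isinstance(x, int):
        return head(0, x)
    if isinstance(x, str):
        b = x.encode("utf-8")
        return head(3, len(b)) + b
    if isinstance(x, list):
        return head(4, len(x)) + b"".join(enc(v) for v in x)
    if isinstance(x, dict):
        return head(5, len(x)) + b"".join(enc(k) + enc(v) for k, v in x.items())

doc = {"name": "CBOR", "tags": ["binary", "ietf"], "count": 7}
cbor_len = len(enc(doc))
json_len = len(json.dumps(doc, separators=(",", ":")))
```

For this document CBOR lands at 36 bytes versus 50 for minified JSON, basically because every quote, colon, comma, and bracket collapses into the type heads. The gap grows for numeric-heavy data and shrinks for long strings, which dominate either way.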
In short, this is because JS does not treat NaN and Infinity as numerical constants but as pre-defined, mutable variables; that way, backward-compatible parsing of a hypothetical sane JSON with eval would be vulnerable to injection. Nevertheless, many JSON codecs have their own ideas about what to do with them, so this stuff can get really nasty.
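You can watch one codec's "own idea" directly with Python's stdlib json module (other codecs behave differently, which is exactly the problem):

```python
import json
import math

# The default encoder happily emits the non-standard JS tokens...
assert json.dumps(float("nan")) == "NaN"          # not legal JSON!
assert json.dumps(float("inf")) == "Infinity"

# ...strict mode refuses instead of emitting them:
try:
    json.dumps(float("nan"), allow_nan=False)
except ValueError:
    print("strict encoder rejects NaN")

# ...and the default parser accepts the tokens back, so round-trips
# quietly succeed even though no conforming JSON parser need accept them.
assert math.isnan(json.loads("NaN"))
```

So the same value round-trips through Python's codec but is a syntax error for a strictly conforming parser, which is the interop trap in a nutshell.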
I've only got the encoder so far (without major type 6, i.e. tagging), and the code is pretty messy and possibly not 100% correct, but it's true that the amount of code required is pretty minimal.