I actually like tnetstrings for backend messaging, but I don't see it used very often. JSON is pretty damn ubiquitous these days.
If you are decoding a JSON string, you run over the string twice: first to search for the terminating quote character so that you know how much memory to allocate, and second to copy the string.
Whereas with msgpack, it's more like Pascal-style strings, where you know the length of the string up front, so you can do it in one pass. You also get some structural hints that make more of the parsing a parallelizable problem, whereas with JSON it is very difficult to get any benefit out of more cores.
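To make the one-pass versus two-pass point concrete, here is a rough Python sketch. The two helpers (`read_quoted`, `read_prefixed`) are hypothetical names, the quoted reader ignores JSON escape sequences, and the 4-byte length header is a simplification, not msgpack's actual wire format (msgpack uses several header sizes).

```python
import struct

def read_quoted(buf, pos):
    """Delimiter-terminated string: the length is unknown until we scan."""
    assert buf[pos] == ord('"')
    end = buf.index(b'"', pos + 1)       # pass 1: search for the closing quote
    return buf[pos + 1:end], end + 1     # pass 2: copy the payload out

def read_prefixed(buf, pos):
    """Length-prefixed string: assumed 4-byte big-endian length, then payload."""
    (length,) = struct.unpack_from(">I", buf, pos)
    start = pos + 4
    return buf[start:start + length], start + length   # single copy, no scan

print(read_quoted(b'"hello"rest', 0))
print(read_prefixed(struct.pack(">I", 5) + b"hello" + b"rest", 0))
```

Both return the payload plus the offset where parsing should resume. With the prefixed form, a reader can also skip over a value without looking at its bytes, which is part of what makes splitting the work across cores easier.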
This may not make any difference if you are serving up an API request or three to x86 computers with broadband, but it makes a lot of difference if the data you're working with is large relative to the bandwidth or latency of the pipe or the CPU power available on the other side of it.
> For me the advantage of msgpack is simplicity / efficiency in parsing the response
If you like that, you might want to take a look at tnetstrings as a format.
Edit: here are my results, which lack the numbers for msgpack except a mention at the end (I hate linking to Posterous, but I haven't moved my blog yet): http://nitrogen.posterous.com/164964342
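For anyone who hasn't seen tnetstrings: every value is written as `<length>:<payload><type tag>`, so a reader always knows the size before touching the payload. Below is a minimal encoder sketch in Python. The function name `tnet_dumps` is made up for illustration, floats are left out, and this is not the reference implementation.

```python
def tnet_dumps(value):
    """Encode a small subset of Python values as a tnetstring (bytes)."""
    if value is None:
        payload, tag = b"", b"~"
    elif isinstance(value, bool):          # must be checked before int
        payload, tag = (b"true" if value else b"false"), b"!"
    elif isinstance(value, int):
        payload, tag = str(value).encode("ascii"), b"#"
    elif isinstance(value, bytes):
        payload, tag = value, b","
    elif isinstance(value, list):
        payload, tag = b"".join(tnet_dumps(v) for v in value), b"]"
    elif isinstance(value, dict):
        payload = b"".join(tnet_dumps(k) + tnet_dumps(v) for k, v in value.items())
        tag = b"}"
    else:
        raise TypeError("unsupported type: %r" % type(value))
    # The length prefix counts payload bytes only, not the trailing tag.
    return str(len(payload)).encode("ascii") + b":" + payload + tag

print(tnet_dumps({b"greeting": b"hello", b"count": 12345}))
```

Note that strings are plain byte blobs here, so the format itself takes no position on text encoding.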
Ok I gotta ask.... Why on earth would you do that? It seems like an exercise in masochism. I wasn't even aware that ruby would compile on embedded arches.
Mostly I wish people would agree on a schema and use ASN.1 PER, rather than choosing from the ever-growing list of binary type-length-value formats which put a redundant copy of the schema on the wire in every message (making them neither small nor readable). I've never had occasion to do anything useful with a message whose format wasn't already known when I wrote the code.
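A rough illustration of the "schema on the wire" cost, in Python. This is not ASN.1 PER: `struct` is only a stand-in for a fixed layout both sides have agreed on, and the field names and the `>Hf` layout are invented for the example.

```python
import json
import struct

reading = {"sensor_id": 7, "temperature": 21.5}

# Self-describing: the key names travel with every single message.
self_describing = json.dumps(reading, separators=(",", ":")).encode("utf-8")

# Hypothetical agreed schema: uint16 sensor_id, float32 temperature, big-endian.
SCHEMA = struct.Struct(">Hf")
schema_based = SCHEMA.pack(reading["sensor_id"], reading["temperature"])

print(len(self_describing), "bytes self-describing")
print(len(schema_based), "bytes with an out-of-band schema")
print(SCHEMA.unpack(schema_based))
```

A binary type-length-value format lands somewhere between the two: smaller than the JSON text, but it still carries the key names and type tags in each message.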
FWIW, I do very little "web work" (by which I assume you mean front-end?). Instead it is mostly api endpoints for mobile and server-to-server backend messaging. Lots of http transport stuff. If I were doing rpc, I would either not (and use a restful or type-2 api), or use something like protobuf or thrift.
Msgpack just doesn't seem, to me, like a really great fit for anything in particular.
I ran into this issue a few months ago, on a cross-platform project involving four languages that each take a distinctly different view about strings from the other three. Although this situation is a common objection to supporting strings in the issue thread, it took just a couple of hours to extend msgpack to support strings in a reasonable-enough-for-me way on each platform.
The proposals in the thread are a lot better than mine. And I suppose it's pretty antisocial / arrogant for me to just roll my own implementation without consulting anybody. But in three years[0] of talking about the problem, nothing had gotten done. Meanwhile, my code shipped a long time ago.
I do this a lot--fork people's projects to solve my problems and don't merge back changes--and I feel guilty for not being more participatory with the project maintainers. But the fact is that the expected cost of getting embroiled in a flamewar like this is high (whether it is over architecture, whitespace convention, "behavior by design", "Jim's already working on that", etc.), whereas the benefit to me of getting my changes merged upstream is essentially zero. So my antisocial behavior continues to be positively reinforced.
Does anyone else have this problem? Or do people just enjoy flamewars more than I do, or have the persuasive skills to avoid them?
A lot of the time the response is very welcoming. E.g. I recently provided a substantial patch to Beaneater (Beanstalkd client library for Ruby) and the maintainers were all over it immediately, and we got it merged in quickly.
The benefit of taking the effort is to be able to keep up with upstream without having to reapply patches. But that benefit is limited (often I will prefer to stay with an "old" known entity rather than tracking upstream, as long as security concerns don't force me to upgrade), and so I don't spend a lot of time pursuing it.
I strongly believe code speaks louder than words in this kind of situation, and often shipping code will be more likely to get acceptance than engaging in discussions.
Last week it was needing Redis slaves to handle 'SLAVEOF NO ONE' from the master without crashing. Needed to tell all read-slaves (hundreds) to stop trying to reconnect when taking the master down.
It's a fine balance though; you don't want to be stuck with too many forks to maintain.
The bigger problem is that Msgpack is advertised as being "like JSON, but fast and small." To me, that makes it sound like I can replace JSON messages with Msgpack messages and be done, and that's not at all the case, because I need to add a schema layer. I think the "like JSON" comparison is what is really causing this frustration with the format.
You might be misinformed re schema layers, as msgpack does convert to and from a dictionary in much the same way that JSON does, keeping all your dictionary keys (which are strings) intact. In fact, we originally used it as a drop-in replacement for JSON.
As for the data vs string issue, it was designed originally to be as conveniently similar to JSON as possible, which is why you don't get back a dictionary full of NSData objects which you then need to convert manually to NSStrings; it does that automatically for you. This was a convenience vs correctness tradeoff. People who say it's wrong are quite right. They're very welcome to fork it, or submit patches with options to return raw NSData, or create a new wrapper; it wouldn't take a competent dev very long to rewrite what we did.
Now, I've not used MessagePack in quite a while; I've simply found that gzipped JSON is usually almost as good.
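For reference, the gzipped JSON route needs nothing beyond the standard library. A small Python sketch follows; the sample records are made up, and how much you save depends heavily on how repetitive your payloads are, so measure against your own data.

```python
import gzip
import json

# Made-up payload: many records that repeat the same keys.
records = [{"id": i, "name": "user%d" % i, "active": i % 2 == 0} for i in range(200)]

raw = json.dumps(records, separators=(",", ":")).encode("utf-8")
packed = gzip.compress(raw)

# Round trip: decompress, then parse as ordinary JSON.
restored = json.loads(gzip.decompress(packed))

print(len(raw), "bytes of JSON ->", len(packed), "bytes gzipped")
```

The repeated key names are exactly what a general-purpose compressor is good at squeezing out, which is why the gap to a binary format tends to narrow on payloads like this.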
The conflict is unresolvable until the participants agree on which of these two distinct things msgpack should be.
* The people who use it to implement protocols already have to deal with types, e.g., expected a number but got a string. So one more type is not a big deal.
* The people who use it to create discoverable profiles will... use JSON no matter how good MessagePack gets.
That's not the direction I was headed when I started writing, but I don't think the first group you mentioned even exists.
Really, for a protocol that values minimal space usage, not defining a string type is probably a good thing. Use the one that produces the fewest bytes in your application - it may not be UTF-8.
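A quick Python check of the "it may not be UTF-8" point. The helper name `encoded_sizes` and the sample strings are just for illustration.

```python
def encoded_sizes(text):
    """Return the byte length of text under two common encodings."""
    return {enc: len(text.encode(enc)) for enc in ("utf-8", "utf-16-le")}

print(encoded_sizes("hello"))    # ASCII text: UTF-8 is half the size of UTF-16
print(encoded_sizes("日本語"))    # CJK text: UTF-16 is smaller than UTF-8
```

So the cheapest encoding depends on what text your application actually carries, and a raw byte type lets each application pick.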
Also:
>For instance, the objective C wrapper is currently broken because it tries to decode all raw bytes into high-level strings (through UTF-8 decoding) because using a text string (NSString) is the only way to populate a NSDictionary (map).
Well there's your problem: https://github.com/msgpack/msgpack-objectivec/blob/master/Me... It's a buggy wrapper that's trying to be convenient. And NSString keys are by no means the only way to populate an NSDictionary, and it doesn't look like the Objective-C wrapper requires this: https://github.com/msgpack/msgpack-objectivec/blob/master/Me...
The discussion is pointless if the objectives of the participants differ and none is willing to compromise.
EDIT: A common example of implicit and wrong handling of character encoding is when a file gets created with invalid characters, and your Linux file manager is unable to delete it. This can happen because the file manager assumes the file names it gets from the OS are text, which it then decodes incompletely. When it wants to delete the file it encodes the text back, but the result is different from the original file name bytes. The error happens because the file manager tries to decode the name as text too early: it should keep the original octets as a reference to the file, and only decode them when it needs to display a file name.
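The failure mode is easy to reproduce in Python without touching a real filesystem. The byte string below is an invented example of a Latin-1 encoded name, which is not valid UTF-8, and the helper names are hypothetical.

```python
raw_name = b"caf\xe9.txt"   # Latin-1 bytes for "café.txt"; invalid as UTF-8

def lossy_round_trip(raw):
    """Decode early with replacement characters, then encode back."""
    return raw.decode("utf-8", errors="replace").encode("utf-8")

def display_name(raw):
    """Decode only for display; the caller keeps raw as the real reference."""
    return raw.decode("utf-8", errors="replace")

# Early decoding loses the original octets, so a delete by name would miss.
print(lossy_round_trip(raw_name) == raw_name)   # False

# Keeping the bytes and decoding only for display avoids the problem.
print(display_name(raw_name), "refers to", raw_name)

# Python's own workaround: surrogateescape round-trips arbitrary bytes.
as_text = raw_name.decode("utf-8", errors="surrogateescape")
print(as_text.encode("utf-8", errors="surrogateescape") == raw_name)   # True
```

The same reasoning applies to a serialization format: if the wire type is raw bytes, the decision about when (and whether) to decode stays with the application.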