With this new-line-delimited JSON format all your clients HAVE to know about your new protocol. They have to stream the response bytes, split on new lines, unescape new lines in the payload (how are we doing that, btw?), etc. If a client doesn't care about streaming, it can't just sit on the response and parse it when it's done coming in. Or, how about if later on you upgrade the system so that the response is instant and streaming is no longer necessary? Then you move on to a new API and have to keep supporting this old streaming-but-not-really endpoint forever.
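To make the client-side cost concrete, here is a rough sketch of what every consumer would need, assuming the response arrives as arbitrary byte chunks. The name iter_ndjson is made up for illustration and is not from the article. It also answers the escaping question for Python's serializer: json.dumps writes a newline inside a string as the two characters \n, so compact output never contains a raw newline byte.

```python
import json
from typing import Iterable, Iterator


def iter_ndjson(chunks: Iterable[bytes]) -> Iterator[object]:
    """Yield one parsed value per newline-terminated line, as chunks arrive."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        # Everything before the last newline is complete; keep the rest buffered.
        *lines, buffer = buffer.split(b"\n")
        for line in lines:
            if line.strip():
                yield json.loads(line)
    if buffer.strip():
        # Trailing data without a final newline: treat it as the last record.
        yield json.loads(buffer)


# json.dumps escapes the newline inside the string, so the record is one line.
record = json.dumps({"note": "line one\nline two"})
wire = (record + "\n" + json.dumps({"id": 2}) + "\n").encode("utf-8")

# Simulate arbitrary network chunking by cutting the payload mid-record.
parsed = list(iter_ndjson([wire[:10], wire[10:25], wire[25:]]))
```

A client that doesn't care about streaming still has to run something like this over the whole body instead of making a single parse call.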
With line delimited logfile objects, it's easy to grep for a string of interest and then only parse the lines that match -- much more efficient than parsing an entire logfile to pick out 0.01% of the lines.
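A minimal sketch of that grep-then-parse pattern, assuming one JSON object per line. The function name and the sample log lines are invented for illustration.

```python
import json
from typing import Iterable, Iterator


def matching_records(lines: Iterable[str], needle: str) -> Iterator[dict]:
    """Parse only the lines whose raw text contains `needle`."""
    for line in lines:
        if needle in line:          # cheap substring test, like grep -F
            yield json.loads(line)  # full parse only for the candidates


log_lines = [
    '{"level": "info", "msg": "started"}',
    '{"level": "error", "msg": "disk full"}',
    '{"level": "info", "msg": "stopped"}',
]
errors = list(matching_records(log_lines, '"error"'))
```

The substring test can produce false positives (the needle might appear in a different field), so the caller may still want to check the parsed value.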
That's not the use case talked about in this article, but it is a use case that's important to me.
Yeah, and I agree with you 100%. But lots of things that are great for log storage aren't appropriate for an API.
Also, if you’re going to reinvent the wheel and make a custom framing format, why would you choose a delimiter that can legally appear in your content? Separating your JSON with newlines is complete madness. If you’re sending UTF-8 then you can trivially use a byte that cannot appear in the data, like 0xff, as the divider.
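A small sketch of the 0xff idea, assuming the payload is UTF-8 encoded JSON. The helper names frame and unframe are hypothetical. The byte 0xFF never occurs in well-formed UTF-8, so even pretty-printed JSON with raw newlines can be framed without any escaping.

```python
import json

DELIM = b"\xff"  # never occurs in well-formed UTF-8


def frame(records):
    """Join pretty-printed JSON documents, each terminated by the 0xFF byte."""
    return b"".join(
        json.dumps(r, indent=2, ensure_ascii=False).encode("utf-8") + DELIM
        for r in records
    )


def unframe(data):
    """Split on 0xFF and parse every non-empty part."""
    return [json.loads(part.decode("utf-8")) for part in data.split(DELIM) if part]


records = [{"text": "pretty\nprinted"}, {"text": "naïve café"}]
payload = frame(records)
roundtrip = unframe(payload)
```

The trade-off is that the stream as a whole is no longer valid UTF-8 text, so it is less pleasant to inspect with text tools.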
Just to fill in the picture here: it's because the built-in parser has a very good chance of being faster than anything you could write[1]. In addition, it's code that you don't own and don't have to maintain.
[1]: with the exception of https://news.ycombinator.com/item?id=16413917
*Thank you for NSBlog, it's great!
JSON hasn't been designed for chunked interpretation. How would the client know when to start interpreting the received message? How could you tell the difference between a valid chunk or malformed response?
>With this new-line-delimited JSON format all your clients HAVE to know about your new protocol
Yes. It's probably not a good option for a public API. It should be a complement to a more general API. I would put this in the bucket of micro-optimization for very specific use-cases. Regardless, if you publish your API and document it, other developers should be able to consume it just fine. It's not rocket science.
Neither has XML and yet we have plenty of working streaming XML parsers.
> How could you tell the difference between a valid chunk or malformed response?
Same as usual, when it breaks.
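With newline framing, "when it breaks" can be made a bit more precise: a line that is terminated but fails to parse is malformed, while bytes after the last newline are just not finished yet. A rough sketch, where classify is a made-up helper name:

```python
import json


def classify(buffer: bytes):
    """Split a receive buffer into parsed records, bad lines, and an unfinished tail."""
    *complete, tail = buffer.split(b"\n")
    good, bad = [], []
    for line in complete:
        if not line.strip():
            continue
        try:
            good.append(json.loads(line))
        except ValueError:  # covers JSONDecodeError and bad UTF-8
            bad.append(line)
    return good, bad, tail


good, bad, tail = classify(b'{"a": 1}\n{"b": \n{"c": 3}\n{"d"')
```

Here the second line is reported as malformed, and the trailing fragment is held back until more data (or the end of the stream) arrives.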
It's important to remember why we use JSON. It's not because it's well suited to transmission. It's because it's easy to reason about. If you want to move to a streaming format that is not easy to reason about, you may as well move to a binary format.
The other thing was that to get the full benefits of something like this (not necessarily about streaming parsers vs not) you had to rearrange the way your /whole/ stack works, streaming all the way from the database through all the backend layers to the frontend. It's satisfying when it works, but definitely a non-trivial amount of change.
Which is to say I don't have an answer for why streaming JSON isn't valid in this case, but I can also say that if it was up to me I would never use it in an application that mattered. It is much too expensive for many (most?) applications.
Another reason might be that when working with hand-written files, the delimiter adds a bit of redundancy. This makes errors like unbalanced parentheses easier to diagnose.
Right now, it takes over 5 seconds(!!) for this page to load because of all the freaking JavaScript it has to download! With JS off, the page loads almost immediately. With a keep-alive connection, subsequent loads over HTTPS are not particularly long, unlike what this article seems to think. (Hacker News is one of the FASTEST sites I can access, for example. Even on my crappy connection, pages load nearly instantly.)
Simply letting me type, press enter, and wait 0.1~0.3 seconds for a new page response would not be a significantly worse experience -- however, due to the way the site is written, search doesn't work AT ALL with JS disabled.
So, lots of engineering effort (compared to just serving up a new page) for little to no actual speed improvement, and a more brittle website that breaks completely on unusual configurations... Yeah. Please don't do this!
It uses the streaming API, and will work well for you.
Most likely the page loads instantly because the server is not doing any real work, which is offloaded to the client/js.
Looking at the behavior with curl and wireshark, what I see is that a full, new connection to the service does spend most of its time in DNS lookup and HTTPS handshake. It takes about 0.1~0.3s for the actual data to transfer.
What the article is recommending is basically, don't make a request per JSON object. (It streamed back 69 objects for the query I tried.) Using one connection to transfer the information saves a lot of overhead -- and I don't have a problem with that part of the advice.
What I mean is, instead of using JS at all to do this (and consequently triggering a 5 second initial page load, etc etc.), have the server build the page the traditional way, and send that -- that's still one connection for that data transfer, and with a light page design and keep alive connection, the page load time does not seem like it would be significantly different here (most of the time is going to be in that 0.1~0.3s for the query to execute regardless) -- but the initial page load time would be significantly faster on slow connections.
If your queries do actually take many seconds, sure, maybe there would be a benefit there, but I'm not seeing the value on a page like this, and I really don't want people to take away the idea here that they should redesign their sites to use AJAX to "reduce latency on mobile" by default as it won't help, and in fact, tends to make things worse.
Are you saying that at one point the servers would crash relatively often, which would leave socket clients hanging, unless some complicated client-side code was written - whereas without sockets, a load-balancer could automatically switch clients to functional servers, without extra coding, and mitigate the issue? Isn't the problem the crashy servers?
JSON.parse() only accepts strings.
The library that the article recommends also uses XMLHttpRequest with strings. [2]
The reason I'm asking is the maximum string length in 32-bit Chrome.
[1]: https://developer.mozilla.org/en-US/docs/Web/API/XMLHttpRequ...
[2]: https://github.com/eBay/jsonpipe/blob/master/lib/net/xhr.js
[0]: http://oboejs.com/examples#loading-json-trees-larger-than-th...
For instance, their iOS app weighs 888.8 KB! When it's common for simple apps to be 50 MB monsters, it's very refreshing to use something that has been developed with proper care.
IMHO the most reliable way to get data from point A to point B is likely to have a client actively poll for data, using a strict socket timeout. Data should be delivered at least once. If JSONS is to be called anywhere near as "reliable" as periodic polling, it should at least have a strict timeout (not mentioned in the article) for receiving the next newline, and it should handle replaying of non-acked messages. Otherwise I would call it far from "reliable".
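For the timeout half of that, a minimal sketch of a reader that gives up if the next newline does not arrive in time. It assumes a plain blocking socket; read_records and the 0.2 second value are invented for illustration, and replaying non-acked messages is not covered here.

```python
import json
import socket
import time
from typing import Iterator


def read_records(sock: socket.socket, timeout_s: float) -> Iterator[object]:
    """Yield one record per line; raise TimeoutError if a newline is overdue."""
    buffer = b""
    deadline = time.monotonic() + timeout_s
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise TimeoutError("no complete line within the deadline")
        sock.settimeout(remaining)
        try:
            chunk = sock.recv(4096)
        except socket.timeout:
            raise TimeoutError("no complete line within the deadline") from None
        if not chunk:
            return  # peer closed the connection
        buffer += chunk
        *lines, buffer = buffer.split(b"\n")
        for line in lines:
            if line.strip():
                yield json.loads(line)
        if lines:
            # Only a completed line resets the clock; partial data does not.
            deadline = time.monotonic() + timeout_s


# Demo: the sender delivers one full record, then stalls mid-record.
a, b = socket.socketpair()
a.sendall(b'{"seq": 1}\n{"seq": 2')
received, timed_out = [], False
try:
    for rec in read_records(b, 0.2):
        received.append(rec)
except TimeoutError:
    timed_out = True
finally:
    a.close()
    b.close()
```

Using a deadline rather than a bare per-recv timeout means a sender that trickles bytes without ever finishing a line still trips the timeout.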
I think streaming would be useful only if the responses are stateful and it's hard to share it across requests.
$ curl -H "Accept-Encoding: gzip" --trace - "https://instantdomainsearch.com/services/vanity/apple?hash=8...
The use-case was we had a slow database query for basically map pins. The first pins came back in milliseconds, but the last ones would take seconds. The UI was vastly improved by streaming the data instead of waiting for it all to finish, and the server code was easy to implement.
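Roughly what the server side can look like, as a sketch rather than the code we actually ran. slow_query is a hypothetical stand-in for a database cursor that yields rows lazily, and the field names are made up.

```python
import json
from typing import Iterable, Iterator


def stream_pins(rows: Iterable[dict]) -> Iterator[bytes]:
    """Encode each row as one newline-terminated JSON line as soon as it is available."""
    for row in rows:
        yield json.dumps(row, separators=(",", ":")).encode("utf-8") + b"\n"


def slow_query() -> Iterator[dict]:
    # Hypothetical stand-in for a database cursor that yields rows lazily.
    yield {"id": 1, "lat": 37.77, "lng": -122.42}
    yield {"id": 2, "lat": 40.71, "lng": -74.01}


chunks = list(stream_pins(slow_query()))
```

Most web frameworks can send a generator like this as a chunked response body, so each pin goes out as soon as the query produces it.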
A different delimiter would have worked, but newlines are easy to see in a debugger.
At one job I had several years ago we came up with the same idea and used \n-separated JSON elements as a streaming response. We also tossed around the idea of using WebSockets to stream large responses between services.