That said, if you want your static JSON objects to have comments, just pipe the JSON through a minifier that strips comments before parsing.
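A minimal sketch of that pipeline (the regexes below are naive and assume comments never appear inside string values, so a proper minifier like JSMin is the safer choice):

```javascript
// Naive comment stripper: removes /* */ block comments and whole-line
// "//" comments before handing the text to JSON.parse. NOTE: this
// regex approach will mangle comment-like sequences inside string
// values, so treat it as a sketch, not production code.
function parseWithComments(text) {
  const stripped = text
    .replace(/\/\*[\s\S]*?\*\//g, '')  // block comments
    .replace(/^\s*\/\/.*$/gm, '');     // whole-line comments
  return JSON.parse(stripped);
}

const config = parseWithComments(`{
  /* database settings */
  // the port our dev server listens on
  "port": 8080
}`);
console.log(config.port); // 8080
```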
'A recent (and short) IEEE Computing Conversations interview with Douglas Crockford about the development of JavaScript Object Notation (JSON) offers some profound, and sometimes counter-intuitive, insights into standards development on the Web.'
http://inkdroid.org/journal/2012/04/30/lessons-of-json/
{ Thank you Douglas for your vision :) }
https://plus.google.com/118095276221607585885/posts/RK8qyGVa...
This is horrific design reasoning. It's an authoritarian, presumptuous, "punish everyone in the classroom because one child misbehaves" mentality.
Comments would be useful in JSON because comments are useful in code, and JSON is code. For example, I might have a config file that I'm typing in that I want to leave a documentation trail for.
Don't tell me I can do a silly thing like redefine a field, as if it's "neat". It's an abomination that I have to resort to such things. And guess what: by resorting to such things I can still do precisely what Crockford claims he was trying to prevent. So his rationale is not only insulting to one's intelligence, it's sheer stupidity.
Which is pretty much what a specification is.
It's one or more people saying "This is how things are if you call them X".
> presumptuous
Presumptuous? It was in response to the feature being abused!
> "punish everyone in the classroom because one child misbehaves" mentality
No more than creating laws is. A significant subset of the population is misusing it in a way that could cause widespread damage. It is a minor inconvenience to the 'law abiding people' (particularly given that any comments would be removed if read in and spat out by any program). There are workarounds ("field_comment":"some comment") or if that's not enough, use another format. Use one that allows comments, there are many.
> Don't tell me I can do a silly thing like redefine a field, as if it's "neat". It's an abomination that I have to resort to such things
It's also completely unreliable, it's a terrible solution and nobody should use it. I think we're fully in agreement here.
> And guess what: by resorting to such things I can still do precisely what Crockford claims he was trying to prevent. So his rationale is not only insulting to one's intelligence, it's sheer stupidity.
No you can't. The point was to stop people adding pre-processing commands or other such things to json, which would be in random formats and invisible to some parsers (as comments should be), visible and important to others. You don't want to pass a valid piece of JSON through a parser and end up with two different outcomes dependent on something in a comment, do you? Or have to use parser X or Z because Y doesn't understand directive A, but it does understand directive B and C, and while Z understands C, and X knows B, Z doesn't, so I have to use the version from a pull request from DrPotato which I think supports...
What I'm saying is that there is a benefit in simple standards.
JSON is data. It appears to be JS code, but JSON is data. Data is not code ( http://www.c2.com/cgi-bin/wiki?DataAndCodeAreNotTheSameThing ). That's why the idea of data holding parsing directives is silly. If you want to do that, then embed that in the data (hold a MsgType key in the data records). There's no need for comments unless you are trying to use it for something other than raw data.
I do not presume to know who you are, or what you have accomplished, but there are few people whose professional and academic background qualifies them to call Douglas Crockford "stupid".
1. In my experience JSON is frequently output programmatically, and taken in programmatically. Comments are not useful in these cases.
2. The only time comments could be perceived as useful then would be when parsing JSON by eye or hand. However, it is not difficult to parse JSON and understand it unless the keys have used obfuscated names. If key naming is obfuscated, comments aren't really the correct solution.
3. "An object is an unordered set of name/value pairs", as mentioned by jasonlotito and others earlier. There is no guarantee that a JSON parser will give you the right value if there are two of the same keys in the same scope.
In fact, reading the RFC:
> The names within an object SHOULD be unique.
I'm pretty sure an implementation could refuse to parse the form altogether.
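For example, here's what the ECMA-262 behaviour looks like in Node or a browser; other parsers are free to do otherwise:

```javascript
// ECMA-262's JSON.parse keeps the LAST value for a duplicated key,
// which is what the "comment via duplicate key" trick relies on.
// RFC 4627 leaves this undefined: another parser may keep the first
// value instead, or refuse to parse the document at all.
const text = '{"url": "this is a comment", "url": "http://example.com"}';
console.log(JSON.parse(text).url); // "http://example.com" in V8 -- but don't count on it
```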
I know there is a lot of JSON handling that happens behind-the-scenes, but there is also a non-trivial amount of JSON that I have manually created and/or altered, and have to share with a team.
It's a blessing and a curse, these modern NodeJS projects -- it's awesome that I can simply create/modify a .json file with a few properties, run a command, and magic happens. However, if I want to try and communicate out the intent of the values to my team of 20+, it becomes really convoluted. The projects all magically work by looking for foo.json, but if I comment that file then it breaks.
So I have to create another foo.comments.json, add another script that will remove the comments and then call the original instructions. Then I need to create additional documentation instructing the team to ignore the developer's docs regarding native use, and to run the application with our own homebrew setup.
It also can make testing a pain in the ass, because now I can no longer comment out values, I have to remove them completely. Not a huge deal, annoying nonetheless.
For the past few years, I've generally been using either apache-style via http://p3rl.org/Config::General or some sort of INI derivative (git is proof that ini is good enough for a lot more things than you might expect).
For the future, ingy and I have been working on http://p3rl.org/JSONY which is basically "JSON, but with almost all of the punctuation optional where that doesn't introduce ambiguity" - currently there are perl and ruby parsers for it, javascript will hopefully be next.
Admittedly, we -haven't- got round to defining a format for comments yet, but my point is more "JSON wasn't really designed for that, let's think about something better".
The advantage I see in this way of commenting is that the comment becomes accessible inside the program instead of being stripped off by the parser. For the human reader it's also more obvious.
Unfortunately, it's not possible to attach a comment to anything other than an object. But the same limitation applies to the OP's proposal.
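For example (the field names here are made up), the annotation survives parsing and is reachable from code, unlike a real comment:

```javascript
// The "<key>_comment" convention: the note is ordinary data, so the
// parser keeps it and the program can read (or simply ignore) it.
const config = JSON.parse(`{
  "timeout_comment": "milliseconds; raised after the CDN migration",
  "timeout": 10000
}`);
console.log(config.timeout);         // 10000
console.log(config.timeout_comment); // the documentation trail
```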
Putting comments into JSON in this way is a hack and shouldn't be used by anybody who has any interest in writing maintainable software. Relying on ambiguities in an RFC and someone saying "JSON parsers work the same way" is a good way to end up with a really obscure bug in the future.
It still does not feel right.
The parsing behavior for JSON is not defined at all in RFC 4627, actually. Browsers (and Node, since it's using a browser js engine) use the parsing specification in ECMA-262 edition 5 section 15.12.2.
Note that ES5 section 15.12 in general is much stricter than RFC 4627, as it explicitly points out if you read it.
JSON is like duc(k|t) tape. It's really easy to stick two things together with it. That doesn't mean you always should. It's the simple thing that gets the job done so you can focus on what matters.
You shouldn't pick JSON for your config files and then hold it up as good design. "Look at me, I'm daring and _not using XML_!" Using JSON is crap design, but good engineering means sometimes picking something crappy and not wasting effort on things that don't matter in the end.
If your configuration files become both complicated and important enough that you need comments, then you should stop using JSON. If your duck tape job starts needing additional reinforcement, then you should probably just get rid of the duct tape and do it right.
If one of your requirements is a sufficiently trendy yet commentable config language, look into YAML. Also, gaffer tape. The white kind is easier to write on.
Actually, I'm 100% playing the devil's advocate here. I'll even flip-flop to prove it. Regarding the article, I doubt that every JSON parser will let this slide. To me that's an even better reason to avoid this practice.
If someone uses undefined behaviour in config files for the sake of storing a comment, I reserve the right to hunt them down if I have to maintain their code.
The names within an object SHOULD be unique.
SHOULD is defined in RFC 2119 (http://www.ietf.org/rfc/rfc2119) as:

> 3. SHOULD. This word, or the adjective "RECOMMENDED", mean that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course.

The salient point is that you would need to ensure you are only using JSON parsers that tolerate duplicate names (and use the last value). To drive this home a bit more forcefully: it requires knowing the behaviour of your parser where it is marked as "undefined" in the spec.
If that isn't enough to stop you, DON'T USE JSON. A patch level change in a library could break your code in a non-obvious way and it would be your fault. If you want comments, DON'T USE JSON, JSON DOESN'T HAVE THEM.
JSON works great for on-the-fly communication with frontends that are running JavaScript, or for communication between JavaScript processes like Node.js servers. But for configuration files and other things that need comments, YAML is many times better, both for its clean, Markdown-reminiscent structure and its native comment support.
Node.js has a great module called js-yaml (https://github.com/nodeca/js-yaml) which automatically registers handlers for .yml and .yaml files, allowing you to require them in your Node.js code just like you can with JSON files.
It also comes with a YAML parser for the browser side of things, so if you want you could even communicate YAML directly from the server to the client side, although frankly I don't see much advantage to sending YAML over the wire instead of JSON. (And as others have mentioned below untrusted YAML sources could insert malicious objects in YAML, so I wouldn't recommend this technique.)
You can even use YAML for your package.json in a Node program: (https://npmjs.org/package/npm-yaml)
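A hypothetical example of what that buys you: the same kind of config in YAML, with inline comments and much less punctuation (all values here are made up for illustration):

```yaml
# server settings for the dev environment
server:
  host: localhost
  port: 8080        # dev port; production uses 443
dependencies:
  - js-yaml         # lets us require .yaml files directly
```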
There's the famous Rails vulnerability due to YAML. Python needed to add 'yaml.safe_load'.
YAML is a little too rich. It's always one poorly thought out convenience feature away from disaster.
It has parsers for nearly every language, I wrote one for js: http://npmjs.org/package/tomljs
YAML is easy to type, even with the whitespace. So is INI. And as verbose as XML is, it's easier, ime, to type than JSON. Of those four, JSON is the hardest to write by hand; certainly it's the one I make the most mistakes with, to the extent that I have a particular technique for writing it out (prefixing the commas). As a result JSON as a config file format is tedious, verbose, and error prone; its sweet spot is a machine interchange format that a human can debug/read if needed.
Rails RCE, sup
But I do like the Rails convention of using YAML format and have adopted that in my own code as much as possible.
Also, many of the security holes in YAML come from its use as a serialization format which can represent native classes. I wish the YAML parsers had more explicit support for simple data schemas which would reduce the security risk and be sufficient for most configuration files.
For -configuration- you want a simpler format; INI is worth considering, as is http://p3rl.org/JSONY which is ingy's implementation of a vision we thrashed out for a more sysadmin-friendly config format.
+1 re YAML
Even with indentation problems, the time saved in not typing curly brackets, extra quotation marks, and commas, and the time saved in not having to visually parse these when reading YAML, more than make up for the occasional data structure bug caused by bad indentation.
This seems like a bad idea. It seems heavily reliant on edge case behavior. But hey, might work well for the original author.
Nope, parsers are perfectly in their rights to do whatever they want with multiple keys. They could read them backwards, sort them, whatever. The behaviour in the instance of multiple keys is undefined.
> This seems like a bad idea.
It is an astonishingly bad idea. I'm concerned by it being so high on the page.
> But hey, might work well for the original author.
Depends on their parser. It's undefined behaviour according to the spec. It might work now, but I'd argue it doesn't work well, as a patch level change could bork this.
* the fact that parsers work from top to bottom of the text

AND

* the fact that assigning the same key many times with different values updates the key with the last value.

Your quote regards the order in which the different keys are saved.
It's ridiculous that I can't document notes on dependencies in my NPM package.json, or add a little reminder to my Sublime Text configuration as to why I set some value, because we're using JSON parsers that can't handle the concept of ignoring a line with a couple slashes prefixing it.
IMO - either we add comments to JSON, or we stop using it for hand-edited configuration.
Why not have
{ "keyname" : "aldkjfhaldhfa",
  "keyname_comment" : "asdfjnad" }
If that's not enough, use something other than JSON. Adding comments will just result in it being valid in some parsers and not others. Regardless, of course, people add metadata to JSON already - there's zero reason you can't add "_type": "int". It's a completely arbitrary reason.
Bing, Bing, Bing. We have a winnar!!!
XML sucks in large part not because of XML but because people used it for everything, everywhere, in places it was highly ill-suited. Don't fuck up JSON the same way.
s/^#.*//g
or yaml.safe_load(json_file_with_comments)

This guy is fast. Especially nice considering we do not know each other at all.
[1] http://www.jslint.com/ - JS checking tool from the inventor of JSON
He responded that he was getting annoyed by everybody asking for this, so it was going to cost me $100K to obtain such a license.
I responded that I only asked for that license in order to annoy him (and thanks for the confirmation that it worked), because his immature license clause is annoying everybody else.
{
"myvalue_comment": "This is a comment",
"myvalue": 42
}

It is already hard to read as is, and this makes it worse and confusing; if some big service started using this, you would have to know about this 'hack', otherwise you'd have to look up what the hell is going on.
Also, this is the same information for each call and thus redundant, makes your messages larger when an advantage of JSON is that it's generally a small message.
Switch to a different JSON parser: does it still work? Probably, but I wouldn't bet much on it.
If I were implementing a JSON parser, might I throw an error on a duplicate key? Maybe. Maybe I would just print a warning?
If I were ever going to give someone advice, it would be to never do this.
Also, it's not defined in the JSON standard in which order an implementation needs to parse the JSON fields/keys. So you could end up with potentially wrong results!
Please don't do this. There's almost certainly some parsers out there currently that don't work like this, and if not, there likely will be one day.
{
"#": "this is a comment for the next line",
"url": "http://foo.bar"
}
Simple.

It does break the json parser in the Go standard library, in a totally nonobvious way: http://play.golang.org/p/BsDd47vWna
I would be surprised if it doesn't break many parsers, especially json parsers in static languages. If you want that sort of behavior, don't use json.
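And even where it does parse, the trick falls apart as soon as an object needs more than one comment, because the duplicate "#" keys collapse (shown here with JavaScript's JSON.parse; other engines may differ):

```javascript
// Two "#" comment keys in one object: JSON.parse keeps only the last
// value for the duplicated key, so one comment silently disappears.
const annotated = `{
  "#": "comment for url",
  "url": "http://foo.bar",
  "#": "comment for port",
  "port": 80
}`;
const obj = JSON.parse(annotated);
console.log(Object.keys(obj).length); // 3, not 4 -- one comment vanished
console.log(obj["#"]); // "comment for port"
```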