The thing where you can leave quotes off strings makes me nervous, especially the example where the value is HTML with its own embedded double quotes for attribute values.
Not requiring quotes on strings like that looks like an obvious vector for injection attacks. I guess Hjson isn't designed to be generated automatically, but I'd prefer a format that is easy to generate safely.
What I really want is JSON plus comments plus multi-line strings plus relaxed rules on trailing commas... While maintaining as simple and unambiguous a parsing model as possible.
foo: 4,
...produce the value 4, "4", or "4,"?
While I have been hoping for "JSON plus comments" to be a real and common thing for quite a while now, one of the strengths of JSON is right there on json.org. See it? A set of five simple syntax diagrams that entirely and virtually unambiguously define the JSON syntax.
It’s tough to know when to stop when simplifying a syntax. (For an extreme example, see Stylus, which was like Sass but so extreme that mixins and properties became ambiguous for each other.) I, too, would like to see the return of quotes for string values, for increased clarity.
> When you omit quotes the string ends at the newline.
> Preceding and trailing whitespace is ignored as are escapes.
>
> A value that is a number, true, false or null in JSON is
> parsed as a value. E.g. 3 is a valid number while 3 times is
> a string.
(edit: formatting)
The site even has an ambiguous example:
three: 4 # oops
Well, is that "4 # oops" or 4? I saw no rule about ending comments, and they say that three: 3 times is a string.
I have seen no formal specification of the grammar, so we already have a lot of ambiguity. Good luck to implementers...
Seriously, removing the "no quotes needed" rule would greatly improve the format. If you want to include HTML with double quotes literally, just use the multiline string format and be done.
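A minimal Python sketch of one literal reading of the quoted rule: trim whitespace, accept a JSON number/true/false/null, otherwise treat the whole line as a string. This is not Hjson's actual parser, and parse_quoteless is a hypothetical name. It only shows how much the answer depends on choices the rule leaves open, such as whether a trailing comma or a # ends the value.

```python
import json


def _reject_constant(name):
    # Python's json module accepts NaN/Infinity by default; strict JSON does not.
    raise ValueError(name)


def parse_quoteless(raw):
    """Toy reading of the quoted rule, NOT the real Hjson algorithm."""
    text = raw.strip()  # "preceding and trailing whitespace is ignored"
    try:
        value = json.loads(text, parse_constant=_reject_constant)
    except ValueError:
        return text  # not a JSON scalar, so the whole line is a string
    if isinstance(value, (str, list, dict)):
        return text  # quoted strings and containers are out of scope here
    return value


print(repr(parse_quoteless("3")))         # 3
print(repr(parse_quoteless("3 times")))   # '3 times'
print(repr(parse_quoteless("4,")))        # '4,'
print(repr(parse_quoteless("4 # oops")))  # '4 # oops'
```

Under this reading both "4," and "4 # oops" come out as strings. A parser that strips the comma or the comment first would return 4 instead, which is exactly the ambiguity being complained about.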
> Numbers can include Infinity, -Infinity, NaN, and -NaN.
E.g.
x: yes
>>> str(yaml['x'])
yes
>>> bool(yaml['x'])
True
Learn from Perl. The quote operator is your friend (and I frequently lament its omission in Bash). You could simplify it by not using the matching enclosures ({ and }, [ and ], etc). It's easy to parse, and if you keep the quoting character somewhat rare, it's not hard to read.
E.g.
{
"string" : "A string without inner quotes",
"quotes1" : q!A string "with" inner quotes!,
"quotes2" : q|A string "with" inner quotes|,
"quotes3" : q@A string "with" inner quotes@,
"quotes4" : qTA string "with" inner quotesT,
}
Edit: To be clear, I wish JavaScript had a quote operator, and JSON started with it. :/
1: http://perldoc.perl.org/perlop.html#Quote-and-Quote-like-Ope...
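To make the "easy to parse" claim concrete, here is a rough Python sketch of reading one such literal, assuming the simplified form proposed above (the character right after q is the delimiter, no matching brackets, no escapes). read_q_string is a hypothetical helper, not part of Perl, JSON or Hjson.

```python
def read_q_string(source):
    """Read a single q<delim>...<delim> literal (simplified, no escapes)."""
    if len(source) < 3 or source[0] != "q":
        raise ValueError("expected a q-quoted literal")
    delim = source[1]             # whatever follows 'q' is the delimiter
    end = source.find(delim, 2)   # the first repeat of it closes the string
    if end == -1:
        raise ValueError("unterminated q-quoted literal")
    return source[2:end]


print(read_q_string('q!A string "with" inner quotes!'))  # A string "with" inner quotes
print(read_q_string('qTA string "with" inner quotesT'))  # A string "with" inner quotes
print(repr(read_q_string("qTA Tall storyT")))            # 'A '
```

The last call shows the maintenance cost of the scheme: once the delimiter appears in the text, the string is silently cut short unless the writer picks a different delimiter.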
This, on the other hand, is a 'solution' to escaping quotes that is completely mad. Using non-standard quotes, especially mixing and matching them, is a disaster for readability and maintainability (using a T in your string now? need to change the quotes!). Triple quotes are just fine if you want to avoid escapes, and Hjson seems to support them.
Trailing commas and JSON comments are already supported in the newer browsers (try the Chrome console for instance).
Fortunately quoteless strings or optional-commas/newline-separator as proposed in Hjson will never fly. They are brittle and ambiguous. Who knows what this will get parsed as:
{
a: hello's and hi's have
'misplaced' apostrophes
b: ball: a round # and # bouncy object
c: cakes and
candy: both have sugar
# but how do I include a hash at the start of a multiline-unquoted string?
}
Version 49.0.2623.112 (64-bit)
> JSON.parse('{"foo": "bar",}')
> VM124:1 Uncaught SyntaxError: Unexpected token }
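The same split between a strict JSON parser and the host language's own literal syntax shows up outside the browser. A small Python sketch, as an analogy only (this is Python's json and ast modules, not Chrome's JSON.parse):

```python
import ast
import json

SOURCE = '{"foo": "bar",}'


def accepts_as_json(text):
    """Return True if Python's strict JSON parser accepts the text."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# The JSON parser rejects the trailing comma...
print(accepts_as_json(SOURCE))   # False

# ...while the language's own literal syntax is happy with it.
as_literal = ast.literal_eval(SOURCE)
print(as_literal)                # {'foo': 'bar'}
```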
JavaScript object literals != JSON. JSON is a restricted subset of JS object literals (and not actually a strict subset: a JSON string can contain unescaped U+2028 "LINE SEPARATOR" and U+2029 "PARAGRAPH SEPARATOR" codepoints, a JavaScript string cannot).
Take a look at example 2.11 in the YAML spec [2], for example, and see if you can make heads or tails of it.
[1]: https://pairlist6.pair.net/pipermail/markdown-discuss/2011-A...
"YAML expresses structure through whitespace. Significant whitespace is a common source of mistakes that we shouldn't have to deal with."
Primary goals were to remove as much syntax as possible and make it play well with line-based diffs (with the hopes that someone who knows nothing about the language could resolve conflicts without getting tripped up by surrounding quotes, trailing comments, etc).
Granted, if the number of conflicts which cannot be automatically resolved is reduced by enough, then it might not matter in the grand scheme of things. However, I'd be worried that this would make "accidental" automatic resolution of semantic conflicts more common. That may be an unfounded/irrational fear, I don't know.
> Both HOCON and YAML make the mistake of implementing too many features (like anchors, substitutions or concatenation).
Also this claims to not need escapes, but it's also not clear how this format handles a comma or a newline in strings without escaping, do they act as a comma to separate properties or do they act as natural commas/newlines?
That's why I remove those features.
Significant whitespace is a normal complaint for beginners in python too, but most people prefer it in the end.
I like how nice config files can look with YAML, and JSON being a subset of it makes it even more convenient.
JSON spec: http://www.ecma-international.org/publications/files/ECMA-ST...
If you haven't read the JSON spec and you use JSON I recommend doing so. It takes five minutes. My personal favorite line: "Because it is so simple, it is not expected that the JSON grammar will ever change."
If you want to annotate JSON in documentation, I say "go ahead and just use //". Any programmer reading it will understand that those lines are taken to be comments and they shouldn't type them in their final request.
{
"__comment": "The following config does...",
"key": "value"
}
But I agree that it is not very intuitive.
[0]: http://json5.org/
It's not a superset, so no extra add-on features need extra doc, and it's less of a subset than JSON so the rules of what's disallowed are much simpler.
But, if comments really are needed, another easy way to have them is a file that rides alongside the JSON files or docs. Sometimes we use a markdown/text file next to the data file (file.json -> file.json.md / file.json.txt) to describe it overall, or a file.meta.json that has comments per key. This is only needed sometimes for physical files. If the JSON comes from the server, commenting can be done there or in docs if needed.
JSON and YAML are interchange formats, not configuration formats. Rather than hacking up an interchange format, it's probably better to use something designed for configuration formats, like TOML.
As for TOML, it's a good replacement for mostly-flat INI-style files but the syntax is really awkward for the kinds of places you'd normally use YAML, especially nesting lists/maps.
What It Is: YAML is a human friendly data serialization standard for all programming languages.
JSON might often be too rigid, but I think it's important to note that "easier" (in that you don't need to learn the syntax) isn't always better.
BTW. it is very simple to do comments in JSON :) You can just add "comment1" : "This is my comment", to any object, it will be ignored by software that processes your file.
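"Ignored by software" only holds if the consumer looks up known keys and never iterates over everything. Where that is not guaranteed, a loader can drop the comment entries explicitly. A rough Python sketch, assuming a convention (made up here for illustration) that any key starting with "__comment" or "comment" is a comment; strip_comment_keys is a hypothetical helper:

```python
import json

COMMENT_PREFIXES = ("__comment", "comment")  # assumed naming convention


def strip_comment_keys(node, prefixes=COMMENT_PREFIXES):
    """Recursively drop dict entries whose key looks like a comment."""
    if isinstance(node, dict):
        return {
            key: strip_comment_keys(value, prefixes)
            for key, value in node.items()
            if not key.startswith(prefixes)
        }
    if isinstance(node, list):
        return [strip_comment_keys(item, prefixes) for item in node]
    return node


raw = '{"__comment": "The following config does...", "key": "value"}'
print(strip_comment_keys(json.loads(raw)))  # {'key': 'value'}
```

The obvious downside of the convention is that a real setting can never be called "comment" or anything starting with it.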
For an interchange format, JSON does the job very well. Small, simple, human readable, easy to implement.
For a configuration format, JSON leaves a lot to be desired. It's almost there, but has enough warts to be annoying.
You're not going to get a one-size-fits-all format.
[ "a", "b", "c", ] // the silicon valley comma
[
  "an"
, "array"
, "of"
, "things"
]
Because I have an irrational hatred of that style. (Yes, I know the purported benefits when diffing files, I don't care :P)
SVC allows you to both prepend and append without adding extra diffs
I know it's supposed to be a config format, but it only seems to make any sense for INI-like configs that are little more than a flat key-value map.
The places I see people using JSON/YAML/etc for config are much more likely to have nested structures that would be extremely awkward to represent in TOML. I think YAML was on the right track, and if you ignore the messier parts of the spec it works pretty well.
I personally don't mind YAML all that much either, although the spec is pretty large.
toml requires you track down a toml parser, at the very least
Maybe that's an argument for languages to start adding some configuration format other than XML into their standard libs.
- It works great, lots of people start using it
- People start adding features to fix annoying things with the format, add support for binary data, comments, schemas, add more metadata etc..
- Many versions proliferate, people start writing converters and verifiers
- A standards committee is formed and writes an 800 page spec and 80kloc reference implementation
- Eighteen different libraries wrap or reimplement the reference implementation
- Someone gets fed up with this nonsense and converts their app to save their data in a new simple text format.
- The circle of life continues.
I love this idea and wish json had comments, too, but if you start hitting the point where JSON is not expressive or fluid enough, that's a hint that it's probably not the right thing for what you're doing. This variant puts a lot of work into human-friendly json, but if you're doing a lot of hand-editing of a file, it should probably not be JSON.
And now you can't roundtrip the comments if for some reason your JSON parser needs to change something.
If you showed JSON to someone on the street they could probably understand the gist of it (if pretty printed). Good luck asking them to write it.
But I'm a scala developer so I might be biased.
Because of this eventually you will need to convert your HJSON to JSON prior to deploying, and that would make things slower. You will be dealing with 2 formats instead of one.
Then, do you really believe that adding all these syntactic "features" (overhead) will make it less error prone? It will make it more error prone because it has more things to consider!
It's going to be parsed essentially once---startup.
comments are the most important factor of 'human-readability' by far. without them you can't e.g. explain what a particular key does, what its default value is (if any), or perhaps the most important thing - you can't even put a link to the documentation!
It's not meant to transmit context (i.e. it's useless and that's what documentation is for).
For me the biggest problem with JSON is lack of full floating point support, i.e. NaN, +-Infinity, -0.
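For what it's worth, here is how one implementation deals with those values. The sketch below uses only Python's standard json module, so other parsers may well behave differently. Note that the grammar on json.org does admit -0 and -0.0 as number tokens; whether the sign survives is up to the parser.

```python
import json
import math


def dumps_strict(value):
    """Serialize, refusing floats that JSON has no token for."""
    return json.dumps(value, allow_nan=False)


# By default Python emits tokens that are not valid JSON at all.
lenient = json.dumps([float("nan"), float("inf")])
print(lenient)  # [NaN, Infinity]

# In strict mode the same values are an error instead.
try:
    dumps_strict(float("nan"))
except ValueError as exc:
    print("rejected:", exc)

# Negative zero round-trips here when written with a fraction part.
negative_zero = json.loads(json.dumps(-0.0))
print(math.copysign(1.0, negative_zero))  # -1.0
```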
This seems pedantic, I agree, but thus is the world we live in...
Might I suggest "Human Readable JSON."
In short: a joke.
I'm quite happy using a preprocessor like [0], which keeps the great simplicity of JSON and just allows comments.
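For readers wondering what such a preprocessor has to do: the only subtle part is not touching // inside strings (URLs, mostly). A minimal Python sketch of that idea, not the tool linked above; strip_line_comments is a hypothetical name and only // line comments are handled:

```python
import json


def strip_line_comments(text):
    """Remove // comments that sit outside JSON string literals."""
    out = []
    in_string = False
    escaped = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False       # this char was escaped, take it as-is
            elif ch == "\\":
                escaped = True        # next char is escaped
            elif ch == '"':
                in_string = False     # closing quote
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            end = text.find("\n", i)  # skip to end of line, keep the newline
            i = len(text) if end == -1 else end
        else:
            out.append(ch)
            i += 1
    return "".join(out)


commented = """{
  // where to fetch updates from
  "url": "http://example.com/feed"  // the slashes in the URL must survive
}"""
print(json.loads(strip_line_comments(commented)))
# {'url': 'http://example.com/feed'}
```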
I don't, actually, I use preprocessors too¹, but since they're not always an option, I'd rather recommend yaml than have yet another pointless config file language needlessly fragment the market.
¹: Thankfully it's rather trivial in python: https://github.com/creshal/yspave/blob/master/yspave/pave.py...
Take YAML, it looks pretty natural at first sight, but has a virtually infinite list of gotchas.
That's why I wrote this:
https://github.com/crdoconnor/dumbyaml
YAML is far better with explicit typing and flow style, tag tokens and node anchors/references removed.
jsonnet gives all the benefits of hjson - but also provides more powerful templating features
There are not many secure transportation formats, JSON being one of the few. And JSON can be parsed much faster and easier than YAML, with its types, cyclic references and classes.
YMMV, but if you're aiming for a format that's edited / maintained by humans things like YAML's anchors and substitution are exactly the features I'd want...
The only reason JSON is popular is because of Javascript. And the only reason Javascript is popular is because of the browsers and their history.
i.e. something with aspects of clang-format (which tries hard not to change the meaning of your code even if it's broken), and the aggressive autocorrection necessary to make typing on a touchscreen work?
I suppose there are converters from this to json, though, so maybe this is just a better specified way of converting keypresses from monkeys into something with well defined structure...
This parser handles YAML, JSON and XML. Interestingly, many of the features HJSON has, this has, by virtue of it being easier to implement during the parsing stage.
The part I'd draw your attention to - and the part that I think warrants the most discussion - is the resulting data structure. I mostly can't tell what the structure is of the HJSON C# object - it looks like it does most of what I wanted to change about the existing C# JSON parsers, but maybe not all?
This can't even be parsed natively by major JavaScript implementations, so is it really JSON at all? Actually, I think that's the root of my complaints, that it's associating itself with JSON while clearly diverging from what was important originally in JSON. At this point it's just some incompatible format leveraging the JSON name. I think most of my criticisms would be ameliorated if it was just some other JSON-similar format with a different name.
We've been looking for a replacement configuration format over our ancient ini files and had rejected JSON for TOML because TOML allows comments (and man, can comments be useful in configuration files). This looks like a nice medium-long term alternative.
It is meant to be generated by a machine and not created by hand, nor should it be readable by humans, only parse-able by a computer.
Treating your data interchange/serialization/configuration/markup formats as languages that should be human readable/writable is a cardinal sin of any person or company that engages in such practices.
In my eyes pretty much the perfect configuration library and syntax. Nginx-alike, number suffixes (1min, 2gb, ..), macros, variables, includes with priority, etc. Boom! Problem solved.
...except you just made it significant.
This is JSON.
{"a":1,"b":2,"c":3}
This is Hjson after applying the mods:
{a:1b:2c:3}
....oh, it turns out Hjson actually does have significant whitespace.
> YAML expresses structure through whitespace. Significant whitespace is a common source of mistakes that we shouldn't have to deal with.
since every code editor ever used will take care of this for you.
Help me with my list of trendy things that took or are taking way too long to get a grammar: docopt, semver...
YAML references also proved useful in my use case.
abc "abc" abc, "abc",
How does increasing the scope simplify things? Defining correct as "crashing less often" is a really bad idea, data formats _should_ be strict.
This will end in tears.
> , less mistakes,
Does someone have a rather subtle sense of humour, or is that a genuine mistake?! (fewer*)