Douglas: The first time I saw JavaScript when it was first announced in 1995, I thought it was the stupidest thing I’d ever seen. And partly why I thought that was because they were lying about what it was.
A bigger, more interesting thing, though, is how his company failed, in part, because they used hand-rolled JSON for messaging. Douglas: And some of our customers were confused and said, “Well, where’s the enormous tool stack that you need in order to manage all of that?”
“There isn’t one, because it’s not necessary”, and they just could not understand that. They assumed there wasn’t one because we hadn’t gotten around to writing it. They couldn’t accept that it wasn’t necessary.
Adam: It’s like you had an electric car and they were like, “Well, where do we put the gas in?”
Douglas: It was very much like that, very much like that. There were some people who said, “Oh, we just committed to XML, sorry, we can’t do anything that isn’t XML.”
I started my career during the peak XML craze, and while I liked parts of it at the time, the number of things it was used for was quite insane. I had to maintain a system once where a major part of it was XSLT, when it could have just been a simple imperative algorithm with some config settings. Anyhow, hope you like the episode!
Every time the topic comes up I feel the need to say that I loved XSLT. It was so nice. XML frankly was kind of simple, too. It had elements and attributes and that was it. And it had XPath, which offered, among other things, a parent axis, so you could walk the node tree upwards.
In JSON you can't get to the parent from the child. And walking down a tree is unintuitive, because nodes can be of different types, and if you want to maintain the order, or use successive instances of the same things (that would have the same name) you need to use arrays, and arrays of arrays of arrays look bad. Schemas are an afterthought.
JavaScript is cool -- it has mostly eaten the world anyway. But JSON is not so good IMHO.
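For what it's worth, the missing parent axis can be recovered in code on either side. Here is a stdlib Python sketch (the element names and documents are invented for illustration):

```python
import json
import xml.etree.ElementTree as ET

doc = ET.fromstring("<order><items><item>widget</item></items></order>")

# ElementTree has no parent pointers either, but a one-line parent map
# restores an XPath-style parent axis:
parent = {child: p for p in doc.iter() for child in p}
item = doc.find(".//item")
assert parent[item].tag == "items"  # walk upward from the child

# With JSON you must carry the path along yourself while descending:
data = json.loads('{"order": {"items": ["widget"]}}')

def walk(node, path=()):
    """Yield (path, leaf) pairs; the path is the only way 'back up'."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from walk(value, path + (key,))
    elif isinstance(node, list):
        for i, value in enumerate(node):
            yield from walk(value, path + (i,))
    else:
        yield path, node

assert list(walk(data)) == [(("order", "items", 0), "widget")]
```

Neither workaround is hard, but in XPath the parent axis comes for free, which is the point being made above.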
XSLT was (and still is) great for transforming documents. Want that recipe collection as HTML? Easy.
- If you are describing hierarchical data, JSON is great
- If you are describing text with markup, especially extensible markup, for machine generation and consumption, XML is great.
- If you are describing a graph, neither has a broadly accepted standard, so you are kinda on your own.
Depending on your requirements, a recipe collection might be better in XML or in a flavor of markdown. A comprehensive data schema and software support for recipes could be challenging/limiting, compared to marked-up text.
You can still do XSLT in the browser. You can serve arbitrary XML and transform it. As an example, Atom feeds on my website (such as <https://chrismorgan.info/blog/tags/meta/feed.xml>) render just fine in all mainstream browsers, thanks to this processing instruction at the start of the file:
<?xml-stylesheet type="text/xsl" href="/atom.xsl"?>
But working with it is not particularly fun, because XML support in browsers has been only minimally maintained for the last twenty or so years. Error handling is atrocious (e.g. largely not giving you any stack trace or equivalent, or emitting errors only to stdout), documentation is lousy, some features you’d have expected from what the specs say are simply unsupported (and not consistently across engines), and there are behavioural bugs all over the place. E.g. in Firefox, loading any of my feeds that also fetch resources from other origins will occasionally just hang, and you’ll have to reload the page to get it to render; and if you reload the page, you’ll have to close and reopen the dev tools for them to continue working.

JSON only competes with XML. XSLT, XPath, and XSD are just as much an afterthought in that they are completely separate from XML and are entirely optional. The engines written around those are where the powers to walk the tree and validate come from, not XML itself. There's a wide range of tools to get the same benefits for JSON sources, and they usually handle XML and other data sources too, because it shouldn't matter. The reason the X* tools have fallen out of favor is that they're unnecessarily tied to a single type of source data.
Same here. XML was going to save the world! Remember XML data islands with data embedded in page source and displayed via XSLT?
The craziest thing I had to build was a tool to manage the dozens to hundreds of XML configuration files that powered our product. The tool allowed editing and deploying the files, complete with validation and even input suggestion based on associated XSD for each XML file.
I was sad to hear that Crockford is not aiming to be the author of "the next language" anymore, but I wonder how sincere that really is. His thoughts on actor-based languages are interesting.
Crockford's thoughts on actors are really interesting. I tried to pull them apart but I didn't get very far and ended up not including them in the episode.
What he is envisioning is not exactly like Erlang but not exactly like Scheme. He said that Carl Hewitt had a lot of ideas and they were hard to unpack.
If you're interested though, I would reach out to him. He is very approachable and excited to talk to people with ideas for new ways of making things simple.
The closest thing we have right now I think is Spritely Goblins, though that is Scheme. (Not coincidentally, one of the other Electric Communities co-founders is also a Spritely Institute co-founder: https://spritely.institute/about/)
More innocent times.
Apparently Philip Wadler was the person who told them they needed it, because the future was XML.
(Wadler is a big Haskell/PL person.)
Around that time it was pretty nice passing around XML, as I was forced to work with VB.Net which also had an XML literal syntax on the backend and Flash/AS3 on the UI.
I had built a POC with E4X that was VERY similar to React/Redux over a decade before React, but the other browser vendors didn't have it... At the time IE and Chrome were shifting towards JSON.
Not so with XML: all the parsers were insanely complex, with namespacing and whatnot feature support and possible external URLs and everything else... and as a result no XML library was ever adequate to interface with anything. On multiple occasions the best way to build XML for something was to take a working copy and then glue text together so you would exactly replicate whatever that specific application wanted, rather than trying to use anyone's library for it.
> Like with the original J2EE spec, which sought to complicate the basic mechanics of connecting databases via HTML to the internet, this new avalanche of specifications under the WS-* umbrella sought to complicate the basic mechanics of making applications talk to each other over the internet. With such riveting names as WS-SecurityPolicy, WS-Trust, WS-Federation, WS-SecureConversation, and on and on ad nauseam, this monstrosity of complexity mushroomed into a cloud of impenetrable specifications in no time. All seemingly written by and for the same holders of those advanced degrees in enterprisey gibberish.
https://world.hey.com/dhh/they-re-rebuilding-the-death-star-...
It sounds a bit like someone paved a garden path for you by that point. One of the reasons for the "enormous tool stack" wasn't just depth of tools needed ("tool X feeds tool Y which needs tool Z to process namespace A, but tool B to process namespace C, …"), but also the breadth. I recall there were at least six types of parsers to choose from with all sorts of trade-offs in memory utilization, speed, programming API: a complicated spectrum from forward-only parsers that read a node at a time very quickly but had the memory of a goldfish through to HTML DOM-like parsers that would slowly read an entire XML document all at once and take up a huge amount of memory for their XML DOM but you could query through the DOM beautifully and succinctly. (ETA: Plus or minus if you needed XSD validation at parsing time, and if you wanted the type hints from XSD to build type-safe DOMs, etc.)
A lot of XML history was standards proliferation in the xkcd 927 way: https://xkcd.com/927/
XPath tried to unify a lot of mini-DSLs defined for different DOM-style XML parsers.
XSLT tried to unify a bunch of XML transformation/ETL DSLs.
The things XPath and XSLT were designed to replace lingered for a while after those standards were accepted.
Eventually quite a few garden paths were paved from best practices and accepted "best recommended" standards, and greenfield projects started to look easy: a small number of well-coordinated tools. But do enough legacy Enterprise work and you can find all sorts of wild, brownfield gardens full of multiple competing XML parsers using all sorts of slightly different navigation and transformation tools.
However I think by now we've seen that a lot of that "unnecessary" XML complexity was not, in fact, entirely unnecessary. These days we use JSON for everything, but now we've got JSON Schema, Swagger/OpenAPI, Zod, etc etc. It's not really simpler and there's a lot of manual work - we might as well be using XML, XSD & SOAP/WSDL.
It wasn't until about a decade later when I finally got to use XML "for real". At my academic publishing job. One of my first real projects was having a set of academics analyze documents in a web application I built. Prior to that they were analyzing them by hand, were converted to SGML somewhere in Korea, and we would use omnimark to move them to XML and eventually a library application.
The XML community, the ones who haven't retired or passed on, have been more welcoming of the competition too. They went from XML is everywhere, to being able to return JSON from an XSLT. I am in a small shop, and so I wear many hats. But I am always satisfied when I get to work with XML, or craft an xsl/xq script that does exactly what I need. Additionally, the community as a whole is very helpful, and a bit more grey. Meaning, they are less likely to fall for trends and bullshit.
A bit disjointed, but, in short, XML is awesome. Now if only they would move Balisage back to Montreal. I'm no fan of DC or virtual conferences.
Such a document is essentially as simple as the equivalent JSON.
Writing a conformant XML parser is a HUGE undertaking in comparison.
I could get most places to give me the time to write a JSON parser in whatever language if it didn’t have one. I couldn’t do that with XML.
Because of this, every common language (and most uncommon ones) has a JSON parser while XML parsers are less common (and fully conformant ones are even more rare).
As a human in a REPL, I appreciate JSON's balance of readability between XML, which uses a larger set of syntactical characters, and YAML, which uses fewer.
I also appreciate JSON's ontological simplicity over XML. This primarily boils down to the lack of attribute nodes and explicit difference between objects (lists of key-values) and arrays (lists of values).
Very well put. And we could lower the baseline substantially towards simplicity, even from JSON.
It's pretty clear that a lot of people think this way. Some even seriously try to figure out what such a baseline of simplicity would look like.
There are lots of simple indentation-based designs (similar to YAML) such as NestedText[0], Tree Notation[1], StrictYAML[2], or even @Kuyawa's Dixy[3] linked in this thread.
There seem to be fewer new ideas based around nested brackets, the way S-expressions are. Over the years, I have developed a few in this space, most notably Jevko[4]. If there ever will be another lowering of the simplicity baseline, I believe something like Jevko is the most sensible next step.
[0] https://nestedtext.org/en/stable/ [1] https://treenotation.org/ [2] https://hitchdev.com/strictyaml/ [3] https://news.ycombinator.com/item?id=35469643 [4] https://jevko.org/
All the optional complexity that can go on top, though, is probably better specified for XML. Transformation is well defined for XML (XSLT) but not at all for JSON (I guess, you write your own code to manipulate native objects).
Schemas are basically a native feature for XML. Not so much for JSON.
All sorts of specialised vocabularies are defined for XML. A few are defined for JSON, too.
At first, XML namespacing sounds simple. Each tag and attribute will have an optional URI attached to it; no big deal, right?
From reading through the specification, one could be forgiven for assuming that the prefixes are just arbitrary mappings that a processor can ignore, or automatically remap to alternate prefixes.
For example, it is true that <abc:a xmlns:abc="https://example.com/xyz" xmlns:def="https://example.com/xyz"><def:b>5</def:b></abc:a> (notice both namespaces are the same url) is equivalent to: <a xmlns="https://example.com/xyz"><b>5</b></a>.
Unfortunately, the data model also allows for content to reference the namespaces by prefix, and therefore every general xml processor that supports namespaces must keep around an application accessible mapping from the prefixes to namespaces, as the application may need to be able to access that information to interpret attributes or content. The only exception to this would be if the general XML processor insisted on having schema information for every namespace it might come across. In that scenario it would be able to tell if an attribute value of "abc:b" is really a string literal, or a reference to a namespace identifier (QNAME data type), where the namespace is whatever the current "abc" prefix is bound to, and the identifier is "b".
But obviously we don't want to add full schema support for a simple implementation, so we need to keep the mapping information around, just in case the application needs it. We also cannot easily offer nice features like changing a document to use preferred prefixes for certain namespaces, unless we also keep any prefixes that are used in values that could be interpreted as QNAMES, just in case they actually are, but our processor does not know, because it has omitted schema support for simplicity (or perhaps it included schema support, but does not have a schema available for some namespace).
And that is just the complexity that stems from one fairly small quirk in how XML works.
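The prefix loss is easy to see with Python's stdlib parser. A small sketch (the document and URIs are made up):

```python
import io
import xml.etree.ElementTree as ET

xml_src = ('<abc:a xmlns:abc="https://example.com/xyz">'
           '<abc:b>maybe-a:qname</abc:b></abc:a>')

# ElementTree throws prefixes away, expanding names to {uri}local form:
root = ET.fromstring(xml_src)
assert root.tag == "{https://example.com/xyz}a"

# So if text or attribute values might hold QNames, you have to capture
# the prefix -> uri mapping yourself via the "start-ns" parser events:
nsmap = {}
for event, (prefix, uri) in ET.iterparse(io.StringIO(xml_src),
                                         events=("start-ns",)):
    nsmap[prefix] = uri

assert nsmap == {"abc": "https://example.com/xyz"}
```

Without that second pass, the processor cannot tell whether `maybe-a:qname` in the element text is a plain string or a reference to whatever `maybe-a` was bound to, which is exactly the quirk described above.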
You also have no idea if an element's content needs to preserve whitespace or not if you don't know the schema and don't happen to have an xml:space attribute present. Thus, if you want to re-indent arbitrary XML for readability safely, you could end up with something like this:
<abc
><def
>5</def
></abc
>
I.e., the benefits of simplicity have a limit.
JSON is so much more ergonomic than XML as the lingua franca because I can actually read it. That being said I still have my share of problems with JSON.
Me? Schemas are a requirement in areas where you need to integrate over different technology / with different implementations. JSON Schema is in those contexts a bit of a kids toy compared to what XML can do.
We’re not using anything else from Prisma, but if we had to implement something else in JS to talk to a database, that would be a contender for our database interface layer (there are only a couple of others that are even remotely usable, having suffered through the disaster of a Sequelize implementation). We’re more likely to use Elixir and Ecto.
I don't know that I can lay the blame on either one of them directly, mind. But the industry definitely suffered from the bad faith cooperation of those companies.
First, it really depends what you're deserializing with. There is a lot of code out there that just does JSON.parse and then starts accessing the data and then you have an "undefined" get passed deep into the call stack where maybe it explodes or maybe the program just misbehaves. So if you're using a language like JavaScript or Python, then a JSON schema can be used to validate input right away. Think of it like enforcing a pre-condition.
It's also useful in cases where JSON is being used for configuration files. At my company we have quite a few places where JSON files checked-in to a git repo are our source-of-truth which then get POST'ed to an API. We can enforce the schema of those files using pre-commit hooks so no one even wastes time opening a PR that will fail to POST to the API. The same JSON schema is also used by the API to ensure the POST'ed data is correct.
I disagree, this example is just sloppy programming. Passing unvalidated data deep into a program is bad, I'm not arguing for that. What I'm saying is that you should be converting your unvalidated serialized data into a structured type right on the edge. Your data type/type system should __be__ your schema/validator.
> So if you're using a language like JavaScript or Python, then a JSON schema can be used to validate input right away. Think of it like enforcing a pre-condition.
This is what I do with python+pydantic:

    from pydantic.dataclasses import dataclass
    import json

    @dataclass
    class Foo:
        bar: int

    foo = Foo(**json.loads(json_buff))
I'm not the biggest fan of pydantic here because you'll have to handle an exception for invalid data instead of an Option or Result in a better type system. But w/e.

> It's also useful in cases where JSON is being used for configuration files. At my company we have quite a few places where JSON files checked-in to a git repo are our source-of-truth which then get POST'ed to an API. We can enforce the schema of those files using pre-commit hooks so no one even wastes time opening a PR that will fail to POST to the API. The same JSON schema is also used by the API to ensure the POST'ed data is correct.
You can easily do with serdes and a type library as well.
---
I guess schemas may be useful for crossing language boundaries, but you're going to need language specific types/objects at some point so why use schemas directly even then? (I think gRPC may have code gen tools for this purpose).
{
  "someSetting": true,
  "comment": "TODO change to false when ready"
}
Though really, text-based protobufs are better for config. In reality people insert those meta-processing instructions in other ways.
But you still should have the option to at least ignore them while reading. That would make JSON config files so much better to work with.
It is simpler than XML/XSD. Without the schema, you never know if a certain element should be treated as being part of a list or not. When interoperating with anything other than XML, that matters.
I can remember hardcoding and manipulating a bunch of non-sense legacy fields just to get a ticket created via their SOAP enterprise service bus. Not to mention all the operations that made no clear sense.
Consuming SOAP/WSDL from languages other than the one it's published in isn't fun. Man, some of the PHP implementations were beyond horrible... well-defined REST/RPC + JSON is generally much easier in the end.
I disagree. I think personal hygiene is very important for in-office coworking.
Well, I'm about to take a shower now, and shame on you.
- generic concepts like arrays and maps
- lack of opportunity to invent names
Every XML schema is a potential DSL that reinvents things you already know.

Other than that, it's true that the XML era was just addressing a lot of important stuff early. I guess it was only compatible with the big-corp mindset and not early web dynamic/fluid/small-scale apps (a bit like how PHP started to write PSRs to avoid dynamic code/effects in libs, formalization, etc.).
For this JSON:
{
"part_numbers": [1, 2, 3, 4, 5]
}
You have two main ways to represent these in XML:

<!-- repetition = array -->
<order>
<part_number>1</part_number>
<part_number>2</part_number>
<part_number>3</part_number>
<part_number>4</part_number>
<part_number>5</part_number>
</order>
<!-- wrapped repetition -->
<order>
<part_numbers>
<part_number>1</part_number>
<part_number>2</part_number>
<part_number>3</part_number>
<part_number>4</part_number>
<part_number>5</part_number>
</part_numbers>
</order>
Is this better than JSON? No, not particularly. But it’s no less clear than the JSON, and it compresses pretty well (it compresses better for larger documents, obviously).

The larger problem with XML is that the tooling is often lacking outside of Java and C#/.NET, and none of the tooling is well-built for the sort of streaming manipulation that `jq` does (it exists, but IMO one of the least usable ideas from the XML camp is XSLT), and JSON support is pretty universal everywhere, even if the advanced things like JSONPath and JSON Schema aren’t.
I also think that there’s a problem when you have to choose between SAX and DOM parsing early in your process. Most JSON usage is the equivalent of using a DOM parser because the objects are expected to be relatively small, but many XML systems are built for much larger documents, and therefore need to parse the stream because the memory use otherwise would be unacceptable. The use of a JSON streaming parser is much rarer, IME.
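For contrast, SAX-style streaming is available even in stdlib Python via `iterparse`; a small sketch with an invented document:

```python
import io
import xml.etree.ElementTree as ET

big = "<orders>" + "<order><id>1</id></order>" * 3 + "</orders>"

# Handle each <order> as soon as its end tag arrives, then discard it,
# so memory use stays flat no matter how large the document grows.
count = 0
for event, elem in ET.iterparse(io.StringIO(big), events=("end",)):
    if elem.tag == "order":
        count += 1
        elem.clear()  # drop the subtree we are finished with

assert count == 3
```

The JSON equivalent (an incremental parser fed chunk by chunk) exists in third-party libraries, but as noted above it is rarely reached for, because most JSON payloads fit comfortably in memory.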
The hate I have for XML is the high markup overhead. Anybody who has configured a turn-of-the-century product with XML config files knows what I mean; the screen is usually 2/3 XML tags, which means 1/3 closing tags, which add nothing semantically.
Uh... do we? I've never used any of those. Plain JSON has always worked fine for me.
You don't have to use any of those.
> The good thing about reinventing the wheel is that you can get a round one.
I mean... with charity I can see the context and get it. But. What!?
Overall fun read through history, even if definitely from Doug's perspective only. (As evidenced by JavaScript being an originator of lambdas...) I do find the idea that JSON was as novel as history says it was kind of odd. I remember inlining JavaScript objects years before "JSON" was a thing. Making it a subset of what JavaScript could already do seems straightforward and a good execution. Getting rid of comments feels asinine to me. (I'll also note that the plethora of behaviors you get from JSON parsers shows that it is effectively CSV. Sure, there may be a "standard" out there, but by and large it is a duck-typed one.)
I'm also a bit in the camp that XML is better than JSON. Being able to have better datatypes, for a start. Schemas that allow autocompletion. It is also easier to see as a markup language (per the name). That said, they clearly went too far with entities, and despite making sense for markup, attributes versus children are more than a touch awkward.
I also recall that what killed XML and WSDL files in general, was the complete shit show that was getting a single document to work with both MS and non-MS clients.
And I don't make any real defense of some of the darker corners of XML. In particular, I already criticized entities being a bit too much. Namespaces are also something that, while I can see the desire, the implementation is way too much for most of us.
JSON schema is going to be cursed for a long time. Just the odd treatment of it will be a problem. (In particular, that it is a subset of the numbers that javascript itself supports is... awkward.)
I also confess, though; that I'm not clear why I would want a null in the middle of a string? That feels like a gun loaded and aimed squarely at a foot.
>The best thing we can do today to JavaScript is to retire it. Twenty years ago, I was one of the few advocates for JavaScript. Its cobbling together of nested functions and dynamic objects was brilliant. I spent a decade trying to correct its flaws. I had a minor success with ES5. But since then, there has been strong interest in further bloating the language instead of making it better. So JavaScript, like the other dinosaur languages, has become a barrier to progress. We should be focused on the next language, which should look more like E than like JavaScript.
- https://evrone.com/douglas-crockford-interview
One of the traits that makes Douglas great is being willing to say the obvious even if it is politically unpopular.
E had some really cool ideas, it's sad that it doesn't seem to be that well known!
1. You've got to keep JS around for backwards compatibility for the billions of websites already using it.
2. You will need two engine teams, one to maintain JS and one for the new language.
3. Now you have a whole new vector for security issues. You've made the threat surface much broader. So, you will probably need to hire additional people.
4. You need to coordinate with all the other browser makers so everyone rolls out their new engines more or less concurrently. Other than experiments, nobody is going to start using it unless it works on all the major browsers and platforms.
If we went to a scheme dialect as originally intended, we could have just ONE language for all the things.
Legacy JS? Just compile it into Scheme and run it.
HTML? Use S-expressions and support legacy HTML syntax by compiling it into them. Now you get all the power people want from template languages, but baked right into the main language itself.
CSS? No more weirdness like adding sin() or calc() to make up for shortcomings. Once again, you get the power of the full Scheme language right there.
What makes XML so unergonomic to ingest is 1) attributes, which don't map cleanly to a basic data structure that you might find in a programming language, and 2) namespaces, which are extremely, extremely tedious to program against.
Programmers are going to use the format that's the easiest to ingest and manipulate. JSON wins in that regard, hands down. Every time I need to write logic to ingest a namespaced XML document I heave a deep sigh and brace myself for another long week of fighting with LXML. But with JSON it's as easy as `json_decode($str)` and move on with your life.
Abandoning XML was the web's biggest mistake.
Very unfortunately for everyone, XML came up at the same time as peak "Enterprise" moat building. No design pattern went unused; everything was built with mind-numbing "configuration". XML got used heavily in that space because it allowed massive "Enterprise Objects" (local branding varies) to be serialized in a way another system might have a chance to read.
Meanwhile the features you mention got thrown out with the bath water because everyone hated Enterprise style architectures. While I don't love, for instance, everything about XSLT it's built directly into browsers as native code. How many person hours, megabytes of JavaScript, and wasted CPU cycles have been spent reinventing client side templating using JSON? XSLT is already right there and will happily convert serialized data to your presentation format. You also get the ability to have comments in the data and a built in schema validation.
On my current project I'd much rather be emitting and consuming XML than JSON. But alas, everyone hated Enterprise XML, so we're stuck with JSON and the inability of some parsers to handle trailing commas and ambiguous definitions of numerics and not a comment to be found.
Have we though? Earlier, the article even has Douglas saying:
> It turns out it, well, it’s a multi paradigm language, but the important paradigm that it had was functional. We still haven’t, as an industry, caught up to functional programming yet. We’re slowly approaching it, but there is a lot of value there that we haven’t picked up yet.
I do love the very ending:
Adam: What do you think is the XML of today?
Douglas: I don’t know. It’s probably the JavaScript frameworks.
They have gotten so big and so weird. People seem to love them. I don’t understand why.
For a long time I was a big advocate of using some kind of JavaScript library, because the browsers were so unreliable, and the web interfaces were so incompetent, and make someone else do that work for you. But since then, the browsers have actually gotten pretty good. The web standards thing have finally worked, and the web API is stable pretty much. Some of it’s still pretty stupid, but it works and it’s reliable.
And so, when I’m writing interactive stuff in browsers now, I’m just using plain old JavaScript. I’m not using any kind of library, and it’s working for me.
And I think it could work for everybody.
------
Earlier in the interview, where they were talking about how people behind XML and SOAP wanted complexity and were upset by the simplicity of JSON, I was thinking that this resonated with me and how I feel about how complex web development has become with babel/webpack, transpiling, react/vue, etc. It feels like complexity for complexity's sake.
If only this were true.
https://medium.com/r3d-buck3t/insecure-deserialization-with-...
On the other hand, XML External Entities are part of the XML standard, so any standard-compliant XML implementation has to support them. This is why the XXE attack applies to many languages.
JSON is simpler and easier for many cases, but then you lose the interoperability. Go try to make an app right now dealing with Federal government systems or finance, you're going to end up translating JSON<->XML which isn't fun.
There's not going to be a silver bullet solution to this problem, it's not completely solvable.
Not fun? It's not even possible in the general sense.
If you have XML that looks like:
<meal type="breakfast">
<eggs count="3">
<topping>cheese</topping>
</eggs>
</meal>
How would you convert that to JSON without knowing how the JSON-consuming application expects it to be formatted? Where do you put the "breakfast" and "count" attributes?

You'd need to manually write a translator for each potential translation.
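To make the point concrete, here is a toy Python converter that picks one arbitrary convention; every choice in it is exactly the kind of decision a real translator has to hardcode, and a different consumer could reasonably expect any other convention:

```python
import xml.etree.ElementTree as ET

def to_json(elem):
    """One arbitrary convention: attributes -> '@'-prefixed keys,
    text -> '#text' (or a bare string for leaves),
    repeated children -> lists."""
    out = {"@" + k: v for k, v in elem.attrib.items()}
    for child in elem:
        value = to_json(child)
        if child.tag in out:  # second occurrence: promote to a list
            if not isinstance(out[child.tag], list):
                out[child.tag] = [out[child.tag]]
            out[child.tag].append(value)
        else:
            out[child.tag] = value
    text = (elem.text or "").strip()
    if text and out:
        out["#text"] = text
    elif text:
        return text  # leaf element: collapse to a plain string
    return out

doc = ET.fromstring('<meal type="breakfast"><eggs count="3">'
                    '<topping>cheese</topping></eggs></meal>')
assert to_json(doc) == {
    "@type": "breakfast",
    "eggs": {"@count": "3", "topping": "cheese"},
}
```

Note that even this sketch cannot decide whether a single `<topping>` should be a string or a one-element list without a schema, which is the list-ambiguity problem mentioned earlier in the thread.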
Yep, therein lies the “not fun”. You write a bunch of super complex, brittle code.
Unfortunately because XML is entrenched in certain domains, you have to decide between writing these converters or doing everything in XML which also sucks, especially if you’re trying to write a modern app with a modern stack.
I'm leaving it here because it will never be used for anything, but at least it may inspire somebody to design a better format with simplicity in mind.
Other problems to ponder: Is 0 different from 00? Is "1, 2, 3, 4" different from "1,2,3,4"? Is "a: b" different from "a : b" and "a:b"?
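For comparison, JSON's answers to those questions, checked against Python's stdlib parser:

```python
import json

# Whitespace around separators is insignificant:
assert json.loads("[1, 2, 3, 4]") == json.loads("[1,2,3,4]")
assert json.loads('{"a": "b"}') == json.loads('{"a":"b"}') == json.loads('{"a" : "b"}')

# Leading zeros are a syntax error, so "0" and "00" can never disagree:
try:
    json.loads("00")
except json.JSONDecodeError:
    print("00 rejected")
```

A new format has to nail down each of these somewhere, either in the grammar (as JSON does) or in prose, or every parser will answer differently.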
It's like the man never tried. Try a Java enabled browser: https://www.wikihow.com/Enable-Java-in-Firefox
Just as a reminder Minecraft (the most sold game in history) started out as an Applet.
Applets were not horrible because of the underlying technology, they were horrible because people made bad things with it, just like J2EE was a bad thing people made with J2SE.
But sometimes, rarely, people would make beautiful things with J2SE and J2ME and those are now removed from history forever under the banner of security like everything else that is good in life.
> Douglas: For me, the most difficult thing was raising money. You’re constantly going to Sand Hill and calling on people who don’t understand what you’re doing, and are looking to take advantage of you if they can, and they’re going to do that, but you have to go on your knees anyway.
> I found that stuff to be really hard, although some of them I really liked. And sometimes I’d be sitting in those meetings and I’d be thinking, “I wish I was rich enough to sit on the other side of the table, because what they’re doing right now looks like a lot more fun than what I’m doing right now.” And it was even more difficult raising money then, because at this point, the dot-com bubble had popped and all VCs had been hurt really badly by that. So they were only funding sure things at that time, in late 2001, early 2002.
> And I thought we were a fairly sure thing, because we had already implemented our technology. And by this point, Chip and I understood the problem really well. And we had a new server and JavaScript libraries done in just a few months. And we had demonstrations. We could show the actual stuff. So it wasn’t like we were raising money so that we could do a thing. We had already done the thing, we needed the money so that we could roll it out. And that wasn’t enough for them. They wanted to see that we were already successfully selling it. And I was like, “If we could do that, we wouldn’t need you.”
Only they hadn't. They had built a demo of what we would later call a Web 2.0 app. It wasn't even an application that solved a business problem or did anything specific; it was just showing the concept. That's not a product and that's not a business. The VCs' point was: show us proof that this idea has tangible benefits people will pay for.
The biggest misconception about VCs is that you raise money to "successfully sell" something you've built. You don't. You raise VC money to scale something that has value. So you need to communicate the business value, and ideally have proof points (either in the form of sales or data) that demonstrate it.
Of course Douglas found raising money difficult. But he doesn't seem to have the self-awareness to see that this was probably due to him, and not the rich suits on the other side of the table.
1. Parsing JSON doesn't require adding new firewall rules
2. There are no comments, so nobody will try to invent their own meta-format or annotations in comments; instead they will put data in the JSON as they should
3. (When compared to JS) someone finally had the balls to pick one type of quotes, which makes writing a parser so much simpler.
XML supports comments and I have not seen a single use of comment directives in it ever.
I have seen plenty of comment directives in programming languages, HDLs and so on. But they are usually used as hints, e.g. to linters or to control compiler warnings, and they work perfectly well and cause no problems at all in my experience.
You might say that Crockford didn't anticipate JSON being used for config files. Fair enough. But now that it is, it should support comments.
My recommendation is to use JSON5 since it has a distinct file extension and fixes some other things about JSON too (e.g. trailing commas, hex constants) without being full on YAML insane.
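For illustration, a hypothetical config file (say, `settings.json5`) using the features mentioned; comments, unquoted keys, hex constants, and trailing commas are all legal JSON5:

```json5
{
  // comments live in the config itself, no fake "comment" keys needed
  retries: 3,              // unquoted keys are allowed
  flags: 0x2f,             // hex constants
  hosts: [
    "a.example.com",
    "b.example.com",       // trailing comma is fine
  ],
}
```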
It also means it's a worse format for configs, where you sometimes need to annotate a few nodes with comments.
"comment": keys get littered across the JSON... or temporary changes are copied and the original property name is invalidated with a prefix. The simple structure is gone, replaced with ad hoc workarounds.
Similarly, when you want to use a type not supported by JSON, such as a datetime or binary data, you might end up with "type":"binary" and base64 or whatever in the value (shoehorning attributes in) - when it really needs a schema to follow during parse and stringify. Or OpenAPI, which is hardly lightweight and really doesn't match the simplicity of JSON.
Local schemas, not crazy remote schemas.
Or some sort of way to bless an "official" schema format.
Even C# just punts on this issue and won't emit valid XML if a string you serialize happens to have a null character in it.
A human won't be able to read it (Unless you're crazy and have learned to read Base64), but the application still can easily. You'll just have to add a Base64 translation step before/after serialization/deserialization.
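A minimal sketch of that translation step in Python; the "type"/"value" envelope here is just an illustrative convention, not a standard:

```python
import base64
import json

payload = b"\x00\x01\xffraw bytes"  # data a JSON string cannot carry directly

# Before serialization: wrap the bytes in a tagged base64 envelope
doc = json.dumps({"type": "binary", "value": base64.b64encode(payload).decode("ascii")})

# After deserialization: unwrap back to the original bytes
node = json.loads(doc)
restored = base64.b64decode(node["value"])
assert node["type"] == "binary" and restored == payload
```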
The other two premier XML use cases I can think of are
1. RSS: Last time I did this, ironically I built the payload with a JSON-API'd lib that deals with the XML drama for me. Worked fine.
2. Configs. Rarely are these done in XML anymore. Human readability matters for configs. But there are also better options than JSON for this.
Then I had to live through the whole SOAP-drama, and Java EE; and ended up promising myself to never touch it again.
It has too many degrees of freedom for its own good, the C++ of data formats.
JSON is in many ways the other end of the spectrum; simple but underspecified and painful to deal with in anything but JS.
I often dream of something in-between.
- This message brought to you by TOML gang
I’ll take edn over any of “em. https://github.com/edn-format/edn
Comments and time stamps allowed, arbitrary nesting of data structures, make your own tagged literals if you need them. And commas are whitespace, mostly unnecessary.
Come join the dark side where we enjoy the wonders of binary formats such as avro and protobuf.
Though for something where you want human readability it's hard to beat TOML in my opinion.
apiVersion = "v1"
current-context = ""
kind = "Config"
[[clusters]]
name = "my-cluster"
[clusters.cluster]
certificate-authority-data = "LS0tL..."
server = "https://example.com"
[[contexts]]
name = "context0"
[contexts.context]
cluster = "my-cluster"
user = "my-user"
[[contexts]]
name = "context1"
[contexts.context]
cluster = "my-cluster"
user = "my-user"
[[users]]
name = "my-user"
[users.user]
[users.user.exec]
apiVersion = "client.authentication.k8s.io/v1beta1"
args = ["eks", "get-token"]
command = "aws"
At least use a native TOML file as an example.
Also, if I were handwriting that, I would probably make more use of dotted property names implying dictionaries, like so; though it has a little more repetition in property names, it seems easier to read:
apiVersion = "v1"
current-context = ""
kind = "Config"
[[clusters]]
name = "my-cluster"
cluster.certificate-authority-data = "LS0tL..."
cluster.server = "https://example.com"
[[contexts]]
name = "context0"
context.cluster = "my-cluster"
context.user = "my-user"
[[contexts]]
name = "context1"
context.cluster = "my-cluster"
context.user = "my-user"
[[users]]
name = "my-user"
user.exec.apiVersion = "client.authentication.k8s.io/v1beta1"
user.exec.args = ["eks", "get-token"]
user.exec.command = "aws"
If k8s had been designed with TOML in mind, it probably would have been structured differently, such that "contexts", for example, might be just a dictionary mapping names to an object holding the values from the "context" property. (The existing pattern - an array of objects where each object has a name but stores most of its properties in a property whose name matches the object's type - is already weird, but doesn't look terrible in YAML.) Such a schema, redesigned to be more TOML-friendly, would then look like this:
apiVersion = "v1"
current-context = ""
kind = "Config"
[clusters.my-cluster]
certificate-authority-data = "LS0tL..."
server = "https://example.com"
[contexts.context0]
cluster = "my-cluster"
user = "my-user"
[contexts.context1]
cluster = "my-cluster"
user = "my-user"
[users.my-user.exec]
apiVersion = "client.authentication.k8s.io/v1beta1"
args = ["eks", "get-token"]
command = "aws"
Somebody should add a json entry to "the ascent of ward" [0]. Of course, it will be longer than all the previous versions combined, and the fields will appear in random order because dictionary.
Choose the right tool for the job at hand. Sometimes json is the right choice, sometimes xml is. Not everything is a webapp.
Are you saying you think JSON shouldn't exist and everyone should use XML for everything?
Tooling around XML was certainly more established, but man there was a lot of complexity built up around it.
I use both extensively, and for bigger objects and definitions, XML is a very clear winner.
I'm a big believer in a horses-for-courses approach, and my personal gripe is the push to replace one thing with another. These data formats can coexist and be used where they shine. XML can be read and written stupidly fast, so it's way better as an on-disk file format if people are going to touch that file.
YAML and JSON are not the best fit for configuration files. JSON is good as an on-disk serialization format if humans aren't going to touch it. XML is the best format for carrying complex and big data around. TOML is the best format for human-readable, human-editable config files.
It's still a text-bound serialization format; you still have to parse a tree for it.
Is it just particularly mature libraries?
What broke me were plain-string and empty-node handling.
Here is a fun quiz: which of these two documents is valid - one, both, or neither? With explanation, of course.
Yaml#1
:
Yaml#2 :
In 1996 I was at some of the initial XML meetings. The participants' anger at HTML for "corrupting" content with layout was intense. Some of the initial backers of XML were frustrated SGML folks who wanted a better, cleaner world in which data was pristinely separated from presentation. In short, they disliked one of the great success stories of software history, one that succeeded because of its limitations, not despite them. I very much doubt that an HTML that had initially shipped as a clean layered set of content (XML), layout rules (XSLT), and formatting (CSS) would have had anything like the explosive uptake.
https://adambosworth.net/2004/11/18/iscoc04-talk/
At this level they are both about equal in complexity: JSON has data types that XML doesn't, and XML has attributes and CDATA that JSON doesn't. JSON syntax is more succinct, but XML syntax is more regular.
Even if XHTML fell by the wayside, HTML is IMHO a stereotypical example of where XML is a good fit. Most of the complexity has valid use cases, and it's mostly obvious what should be an attribute and what should be the content of the tag. And at least in HTML 4 you even had a doctype declaration filling the role of specifying the schema used. Of course, SVG is a better showcase for some other aspects of XML, with every editor putting its own metadata in, nicely partitioned into separate namespaces.
I think it's not so much about readability but about complexity. XML is meant to represent complex data, like complex rich text or nested vector graphics. That makes XML complex, conceptually, visually, and in implementation. If you use it to represent something that could have been a csv you're going to have a bad time (as everyone had in the 90s).
Indeed, this was what XML was created for. From W3C's XML specification:
> The Extensible Markup Language (XML) is a subset of SGML that is completely described in this document. Its goal is to enable generic SGML to be served, received, and processed on the Web in the way that is now possible with HTML.
Honestly, what's absurd is GP comment's cluelessness.
Also, it's way better at transferring/storing big, complex, intricate data like 3D objects.
Curious how come?
Having the same tags repeated many times means the file compresses nicely, and its being XML means it can be verified independently with a schema (and the schema can be referenced at a remote location over HTTP if need be), too.
You can always store data more efficiently with binary formats, but XML DOM parsers allow instant access to arbitrary parts of the tree, so working with it is both easy and fast at the same time.
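As a small illustration with Python's stdlib ElementTree (the scene/mesh element names here are made up): once the document is parsed into a DOM, any node is directly addressable by path, with no streaming required.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<scene>"
    "<mesh name='cube'><vertex x='0' y='0' z='0'/></mesh>"
    "<mesh name='sphere'><vertex x='1' y='2' z='3'/></mesh>"
    "</scene>"
)

# Jump straight to an arbitrary part of the tree:
vertex = doc.find("./mesh[@name='sphere']/vertex")
print(vertex.get("z"))  # prints: 3
```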
> Adam: [...] He also wanted people to use JavaScript properly – use semicolons, use a functional style, don’t use eval, use JSLint and so on.
They could have done the same with XML, i.e. define a simple-XML subset without schema, CDATA, entities, etc. Instead they built it on top of another language that is so infamous that they felt the need to write JSLint.
> Adam: The thing they came up with, Doug’s idea for sending JavaScript data back and forth, they didn’t even give it a name. It just seemed like the easiest way to talk between the client side and the backend, a way to skip having to build an XML parser in JavaScript.
So the original reason was that they could use eval(jsonstr)? Given the security implications, they'd have been better off writing a JSON parser. At that point, is it any better than writing a simple-XML parser? At least that would have saved them from the "it's not a standard" discussions.
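The hazard of eval-as-parser is the same in any dynamic language; a contrived Python analogue of the problem:

```python
import json

untrusted = '__import__("os").getcwd()'  # code arriving where data was expected

# eval() happily executes it...
result = eval(untrusted)  # runs arbitrary code; here, returns the current directory
print(type(result))       # <class 'str'>

# ...while a real parser rejects it as malformed data:
try:
    json.loads(untrusted)
except json.JSONDecodeError:
    print("rejected as invalid JSON")
```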
Not so different from today. That quote is about HyperCard, not JS, by the way.
The current state of JSON generation/validation is simpler than the XML ecosystem, but a bit hackish.
We can have a much better stack.
Seems politeness goes a long way when you're facing federal charges