It is plain wrong to make a standard easier for machine-parsing at the expense of humans who are typing it in.
EDIT: Another example. I write some HTML in a text-editor/textarea and send it across to someone. If I missed a </LI>, should the parser reject it? If not, the standard should be accommodating enough so that this is valid.
But the problem is that you now have a specific behaviour that depends on the tag name. DIV tags don't need to close before the next DIV, but LI tags do. So you've gone from a simple tag parser to one that needs to know the intricacies and rules surrounding every element type.
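To make that concrete, here is a rough sketch of what "knowing the rules for every element type" looks like in code. It is not how any real browser parser is written: the `IMPLIED_END` table is a made-up, tiny subset of rules, and `TagTracer` and `trace` are hypothetical names. It sits on top of Python's standard `html.parser` tokenizer.

```python
from html.parser import HTMLParser

# Hypothetical, heavily simplified rule table (an assumption for illustration):
# an open element named by the key is implicitly closed when one of the
# start tags in the value shows up. DIV has no entry, so DIVs simply nest.
IMPLIED_END = {"li": {"li"}, "p": {"p", "div", "ul"}}


class TagTracer(HTMLParser):
    """Records open/close events, including closes the author never typed."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open elements
        self.events = []  # (kind, tag) pairs in document order

    def handle_starttag(self, tag, attrs):
        # The tag-specific part: consult the table before opening anything.
        if self.stack and tag in IMPLIED_END.get(self.stack[-1], ()):
            self.events.append(("implied-close", self.stack.pop()))
        self.stack.append(tag)
        self.events.append(("open", tag))

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray end tag: ignored in this sketch
        # Close any still-open descendants, then the element itself.
        while self.stack:
            top = self.stack.pop()
            self.events.append(("close", top))
            if top == tag:
                break


def trace(markup):
    parser = TagTracer()
    parser.feed(markup)
    parser.close()
    return parser.events
```

Feeding it `<ul><li>one<li>two</ul>` records an implied close for the first LI, while `<div><div></div></div>` just nests. A parser without the table could not tell those two cases apart.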
Personally, I feel that two extra characters per <BR/> tag is worth it.
Nowadays I just close everything because my OCD outweighs my laziness.
Given the landscape of HTML (lots of producers, comparatively few parsers), this shift seems reasonable to me.
Humans throw some HTML-like stuff at the browser and the browser tries hard to make sense out of it. If the browser misinterprets, you have a hard time finding out what went wrong.
Whereas with XML and XHTML you get told immediately what's wrong and you don't have to hope that every browser implementation works the same way.
It's also a bit strange to argue about "easier to write manually" in this day and age of Markdown, HAML, etc.
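For what the "told immediately" part looks like in practice, here is a small sketch using Python's standard `xml.etree.ElementTree`. The helper name `check_well_formed` is made up for this example, and it only checks XML well-formedness, not validity against any XHTML DTD.

```python
import xml.etree.ElementTree as ET


def check_well_formed(markup):
    """Return None if markup parses as XML, else a message with the error position."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as err:
        # ParseError carries the (line, column) where the parser gave up.
        line, column = err.position
        return f"line {line}, column {column}: {err}"
    return None


print(check_well_formed("<ul><li>one</li><li>two</li></ul>"))  # well-formed
print(check_well_formed("<ul><li>one<li>two</ul>"))            # unclosed <li>
```

The second call reports a parse error with a position instead of guessing what was meant, which is the behaviour being argued for here.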
This is what HTML validators are for.
Browsers should do their best to interpret the page author's intention and actually display a page. The developer doesn't always have 100% control over the page markup (think of user-generated content, ads, etc.).
Humans throw some HTML-like stuff at the browser and the browser tries hard to make sense out of it. If the browser misinterprets, you have a hard time finding out what went wrong.
I haven't seen any modern browser misinterpret HTML's (simple) closing rules. As for being harder to debug, I haven't seen any real evidence of that either.
Whereas with XML and XHTML you get told immediately what's wrong and you don't have to hope that every browser implementation works the same way.
Again, do you have any evidence of such incompatibility with current or recent browsers?
It's also a bit strange to argue about "easier to write manually" in this day and age of Markdown, HAML, etc.
Way more HTML is written by hand than Markdown and HAML. The issue isn't just saving keystrokes. The point is that whenever possible, technology should accommodate simple mistakes people make.
But, by that logic, isn't it also strange to argue about making HTML's syntax more consistent if we should be using Markdown/HAML/etc to generate it anyway?
BTW, I do agree with you that a more consistent syntax is better than a syntax that aims to save a few keystrokes at the expense of adding special rules. As a user, I find it more difficult to remember which cases are special than to read or write a more consistent syntax. I just don't see how your comment on Markdown/HAML helps the case for a simpler HTML syntax ;)
But in the end the output is HTML and having a consistent syntax makes it easier to generate, read, and debug it.
Its syntax doesn't need to be dumbed down for casual users because casual users have other options. In that sense I think my comment is in support of a better HTML syntax. :)
I don't think any of the things you mention actually make HTML considerably more legible or easier to write for a person. They just make it harder to parse for a machine.
I would rather have a strict language and solid parsers that can thoroughly and decisively reject improper markup and help stop people from making mistakes while writing it, rather than trying to interpret what they really meant after the fact.
The main push behind stricter document control is making it easier to make all the browsers render documents consistently.
Also, in the age of XHTML, manual document editing was seen as dead - tools like DreamWeaver were popular, and XSLT was touted as the answer to server-side templates. The web refuses to be anything but a pile of dirty hacks upon dirty hacks though, which, while frustrating, may have a hand in its popularity :)
That is, every browser whose engine has been updated for HTML5 and which also implements the parser specification correctly and without bugs. Which is most, but not all, of them.
My personal preference is to include the optional closings, because XHTML has been around a lot longer and therefore a larger proportion of browsers have been coded to handle it properly and have had more time to work out bugs.
I do like that HTML5 browsers can work around invalid markup in a well-specified way, which is much better than XHTML browsers just showing an error. It's the best of both worlds, especially when the browser's Developer Tools give you warnings about the invalid markup too so you don't need to use an external validator to find them.
It is far worse to sacrifice consistency of the mental model behind the language for the sake of not typing two extra characters.
Core XML is much easier to read than modern HTML, because you can read it without knowing the context of what you're looking at and without memorizing tons of exceptions. It's easier to parse for the same reason.
Also, the savings on avoiding " /" are offset by the need to needlessly close some of the HTML5 tags.
The only really stupid things in XML I remember are the need to write checked="checked" and people using namespace prefixes on every tag. It's pretty obvious how to fix the former. The latter is entirely avoidable if you have a fully working parser.
(ul (li "one") (li "two"))
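As a rough illustration of how directly that s-expression maps onto markup, here is a sketch that uses nested Python tuples as a stand-in for the s-expression. The `render` function is a made-up helper and ignores attributes and void elements entirely.

```python
from html import escape


def render(node):
    """Render a (tag, child, child, ...) tuple tree to fully closed markup."""
    if isinstance(node, str):
        return escape(node)  # text node: escape &, <, > and quotes
    tag, *children = node
    inner = "".join(render(child) for child in children)
    return f"<{tag}>{inner}</{tag}>"


print(render(("ul", ("li", "one"), ("li", "two"))))
# prints: <ul><li>one</li><li>two</li></ul>
```

Every element gets an explicit close, so the generator needs no per-tag knowledge at all.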
Although I hadn't heard of the SHORTAG feature before: <strong>Hell yea</>
It's a slightly more verbose version of s-expression.
No, it isn't - especially not when the intended use case is for literally every viewer of the document to use "machine-parsing" to read it, doubly so when a significant fraction of the users will actually BE machines...
Unless you have a nested list.