jeswin12y ago

I prefer HTML over XHTML, because it is easier to write. I don't get the reasoning behind closing tags. LIs close before the next LI, or the UL. <BR> saves two characters over <BR /> and causes no harm. XHTML feels like trying too hard to make the machine overlord happy.

It is plain wrong to make a standard easier for machine-parsing at the expense of humans who are typing it in.

EDIT: Another example. I write some HTML in a text-editor/textarea and send it across to someone. If I missed a </LI>, should the parser reject it? If not, the standard should be accommodating enough so that this is valid.

25 comments · 10 top-level

untog12y ago· 6 in thread

> LIs close before the next LI, or the UL.

But the problem is that you now have a specific behaviour that depends on the tag name. DIV tags don't need to close before the next DIV, but LI tags do. So you've gone from a simple tag parser to one that needs to know the intricacies and rules surrounding every element type.

Personally, I feel that two extra characters per <BR/> tag is worth it.
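
A rough sketch of what that tag-specific knowledge can look like, using Python's standard `html.parser`. The names `VOID`, `IMPLIED_END`, `OutlineParser` and `outline` are made up for illustration, and the rule tables are deliberately tiny; the real HTML rules cover far more elements than this.

```python
from html.parser import HTMLParser

# Illustrative, deliberately incomplete rule tables; the real HTML rules
# cover many more elements and cases.
VOID = {"br", "hr", "img", "input"}      # elements that never take an end tag
IMPLIED_END = {"li": {"li"}}             # open tag -> start tags that close it


class OutlineParser(HTMLParser):
    """Records start/end events, adding the end tags the author left out."""

    def __init__(self):
        super().__init__()
        self.stack = []    # currently open elements, innermost last
        self.events = []   # normalized sequence of start and end tags

    def _close_innermost(self):
        self.events.append(f"</{self.stack.pop()}>")

    def handle_starttag(self, tag, attrs):
        # Tag-specific rule: a new <li> closes an <li> that is still open.
        if self.stack and tag in IMPLIED_END.get(self.stack[-1], ()):
            self._close_innermost()
        self.events.append(f"<{tag}>")
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray end tag (or a void element such as <br/>): ignore
        # Close everything still open inside this element, then the element.
        while self.stack[-1] != tag:
            self._close_innermost()
        self._close_innermost()


def outline(markup):
    parser = OutlineParser()
    parser.feed(markup)
    parser.close()
    return parser.events


print(outline("<ul><li>one<li>two</ul>"))
# ['<ul>', '<li>', '</li>', '<li>', '</li>', '</ul>']
print(outline("<div><div>x</div></div>"))
# ['<div>', '<div>', '</div>', '</div>']
```

The DIV case needs no entry in either table, while LI and BR each need one. That per-element bookkeeping is the cost being weighed against the two extra characters.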

enraged_camel12y ago

In addition, not having to close some tags might make it easier to write HTML, but it makes it more difficult to learn it. I remember back in the day I had to keep looking up which tags need to be closed and which ones do not.

Nowadays I just close everything because my OCD outweighs my laziness.

__david__12y ago

Yes, it makes parsers harder. Or more accurately, it shifts some of the complexity from producers to parsers.

Given the landscape of HTML (lots of producers, comparatively few parsers), this shift seems reasonable to me.

jonhohle12y ago

Wouldn't having a reasonable schema specification solve this? It's not like you can invent arbitrary tags (there are arbitrary attributes, but an expressive enough schema language could capture those as well).

callahad12y ago

FWIW, inventing arbitrary tags is coming with Web Components / Custom Elements :) http://w3c.github.io/webcomponents/spec/custom/

anonymouz12y ago

But then you need a schema to be able to parse the file. XML files can be checked for well-formedness and parsed without a schema file.
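
To make the schema-free check concrete, here is a minimal sketch with Python's standard `xml.etree.ElementTree`. The helper name `well_formedness_error` is hypothetical; the parser is given no schema or DTD, only the text.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(text):
    """Return None if text is well-formed XML, else the parser's message."""
    try:
        ET.fromstring(text)  # no schema involved, only the XML grammar
    except ET.ParseError as err:
        return str(err)      # the message includes a line and column
    return None


# Every element is closed: well-formed, whatever the tag names mean.
print(well_formedness_error("<ul><li>one</li><li>two</li></ul>"))

# The HTML-style omitted </li> is rejected without knowing what "li" is.
print(well_formedness_error("<ul><li>one<li>two</ul>"))
```

The same call accepts `<foo><bar/></foo>` just as readily, since well-formedness says nothing about which tags are allowed.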

untog12y ago

Sure, but presumably that schema will change with time. Then you'll have a parser built around HTML5, but another new release for HTML6, so on and so forth. It just needlessly complicates things.

bhaak12y ago· 5 in thread

It doesn't only make the machine overlord happy; it also helps humans when they make a mistake.

Humans throw some HTML-like stuff at the browser and the browser tries hard to make sense out of it. If the browser misinterprets, you have a hard time finding out what went wrong.

Whereas with XML and XHTML you get told immediately what's wrong and you don't have to hope that every browser implementation works the same way.

It's also a bit strange to argue about "easier to write manually" in this day and age of Markdown, HAML, etc.

danbee12y ago

> If the browser misinterprets, you have a hard time finding out what went wrong.

This is what HTML validators are for.

Browsers should do their best to interpret the page author's intention and actually display a page. The developer doesn't always have 100% control over the page markup (think about user-generated content, ads, etc.).

talmand12y ago

Why can't the browser be the validator itself? You load the page and the browser tells you what's wrong with the code.

jeswinOP12y ago

What I'm saying is that it is "not a mistake" to omit the closing tag in some cases. There are things that are hard when it comes to parsing HTML, but when to close open tags is not one of them. Rules for closing tags are trivial to implement and well documented. (Add: also, omitting unnecessary tags such as <html> and <body>)

> Humans throw some HTML-like stuff at the browser and the browser tries hard to make sense out of it. If the browser misinterprets, you have a hard time finding out what went wrong.

I haven't seen any modern browser misinterpret HTML's (simple) closing rules. As for being harder to debug, I haven't seen any real evidence of that either.

> Whereas with XML and XHTML you get told immediately what's wrong and you don't have to hope that every browser implementation works the same way.

Again, do you have any evidence of such incompatibility with current or recent browsers?

> It's also a bit strange to argue about "easier to write manually" in this day and age of Markdown, HAML, etc.

Way more HTML is written by hand than Markdown and HAML. The issue isn't just saving keystrokes. The point is that whenever possible, technology should accommodate simple mistakes people make.

epidemian12y ago

> It's also a bit strange to argue about "easier to write manually" in this day and age of Markdown, HAML, etc.

But, by that logic, isn't it also strange to argue about making HTML's syntax more consistent if we should be using Markdown/HAML/etc to generate it anyway?

BTW, I do agree with you that having a more consistent syntax is better than having a syntax that aims to save a few keystrokes at the expense of adding special rules. As a user, I find it more difficult to have to remember which cases are special than to read or write a more consistent syntax. I just don't see how your comment on Markdown/HAML helps the case for a simpler HTML syntax ;)

bhaak12y ago

There is less need nowadays for writing HTML by hand. HTML is more often generated from other formats, and Markdown, HAML, and other lightweight markup languages have helped with that.

But in the end the output is HTML and having a consistent syntax makes it easier to generate, read, and debug it.

Its syntax doesn't need to be dumbed down for casual users because casual users have other options. In that sense I think my comment is in support of a better HTML syntax. :)

mkohlmyr12y ago· 2 in thread

Personally I see saving a few characters here and there as a completely inadequate reason for making the spec less consistent.

I don't think any of the things you mention actually make html considerably more legible or easier to write for a person. Just harder to parse for a machine.

I would rather have a strict language and solid parsers that can thoroughly and decisively reject improper markup and help stop people from making mistakes while writing it, rather than trying to interpret what they really meant after the fact.

recursive12y ago

How many HTML parsers do you need to write on average?

mkohlmyr12y ago

You're only addressing half my argument. Just because we can invent the wheel once doesn't mean it's a good idea to make the problem it solves overly complicated for no good reason.

kalleboo12y ago· 2 in thread

The tradeoff is that allowing quirks like that means that either you need to make a massive specification to deal with each way someone might goof in their code (edit: which is what HTML5 tries to do), or you end up with each browser engine reacting differently.

The main push behind stricter document control is making it easier to make all the browsers render documents consistently.

Also, in the age of XHTML, manual document editing was seen as dead - tools like DreamWeaver were popular, and XSLT was touted as the answer to server-side templates. The web refuses to be anything but a pile of dirty hacks upon dirty hacks though, which while frustrating, may have a hand in its popularity :)

zimbatm12y ago

That's why HTML5 specifies the parser, so that every browser extracts the same DOM tree from the same input. The specification is strict in the sense that any parser has to behave the same while also allowing for human error.

DougWebb12y ago

> ... so that every browser extract the same DOM tree from the same input

That is, every browser whose engine has been updated for HTML5 and which also implements the parser specification correctly and without bugs. Which is most, but not all, of them.

My personal preference is to include the optional closings, because XHTML has been around a lot longer and therefore a larger proportion of browsers have been coded to handle it properly and have had more time to work out bugs.

I do like that HTML5 browsers can work around invalid markup in a well-specified way, which is much better than XHTML browsers just showing an error. It's the best of both worlds, especially when the browser's Developer Tools give you warnings about the invalid markup too so you don't need to use an external validator to find them.

romaniv12y ago

> It is plain wrong to make a standard easier for machine-parsing at the expense of humans who are typing it in.

It is far worse to sacrifice consistency of the mental model behind the language for the sake of not typing two extra characters.

Core XML is much easier to read than modern HTML, because you can read it without knowing the context of what you're looking at and without memorizing tons of exceptions. It's easier to parse for the same reason.

Also, the savings on avoiding " /" are offset by the need to needlessly close some of the HTML5 tags.

The only really stupid things in XML that I remember are the need to write checked="checked" and people using namespace prefixes on every tag. It's pretty obvious how to fix the former. The latter is entirely avoidable if you have a fully working parser.

mcv12y ago

You've got a system with lots of exceptions and special behaviour. I don't see that as easier for humans at all. (The machine doesn't care; it parses anything you can put in rules. But the more complex the rules are, the harder it is for you to understand the error message. XML is really easier on humans.)

6cxs2hd612y ago

You'd like s-expressions. As in Lisp.

    (ul (li "one") (li "two"))

Although I hadn't heard of the SHORTAG feature before:

    <strong>Hell yea</>

It's a slightly more verbose version of s-expression.
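
For what it's worth, the correspondence is easy to sketch. Assuming the s-expression is modelled as nested Python tuples (a stand-in for real Lisp lists), a hypothetical `render` helper can emit the equivalent fully closed markup:

```python
from html import escape


def render(node):
    """Render a nested tuple such as ("ul", ("li", "one")) as markup."""
    if isinstance(node, str):
        return escape(node)       # text node: escape <, > and &
    tag, *children = node         # first item is the tag, the rest are children
    inner = "".join(render(child) for child in children)
    return f"<{tag}>{inner}</{tag}>"


doc = ("ul", ("li", "one"), ("li", "two"))
print(render(doc))  # <ul><li>one</li><li>two</li></ul>
```

Every closing parenthesis maps to exactly one end tag, which is the consistency the XML side of this thread is arguing for.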

al2o3cr12y ago

"It is plain wrong to make a standard easier for machine-parsing at the expense of humans who are typing it in."

No, it isn't - especially not when the intended use case is for literally every viewer of the document to use "machine-parsing" to read it, doubly so when a significant fraction of the users will actually BE machines...

keeperofdakeys12y ago

The two main advantages were XML parsing performance, and the ability to embed XML directly in the XHTML. For phones of the era, the performance benefits are obvious. As for XML embedding, it'd give you the ability to embed SVG, MathML, and any other XML language directly. This avoids a second retrieval/parsing step, and allows extensibility without changing the XHTML spec.
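
As a small illustration of the embedding point, here is a sketch using Python's standard `xml.etree.ElementTree`. The page content is invented for the example; the two namespace URIs are the standard XHTML and SVG ones.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
SVG = "http://www.w3.org/2000/svg"

# A made-up XHTML page with an SVG drawing embedded inline.
page = f"""<html xmlns="{XHTML}">
  <body>
    <p>A circle:</p>
    <svg xmlns="{SVG}" width="20" height="20">
      <circle cx="10" cy="10" r="8"/>
    </svg>
  </body>
</html>"""

# One parse covers both vocabularies; namespaces keep them apart.
root = ET.fromstring(page)
circles = root.findall(".//svg:circle", {"svg": SVG})
paragraphs = root.findall(".//x:p", {"x": XHTML})

print(len(circles), circles[0].get("r"))  # 1 8
print(paragraphs[0].text)                 # A circle:
```

The parser needs no SVG-specific knowledge to do this, which is the extensibility argument: a new XML vocabulary can be dropped in without changing the host format.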

joesb12y ago

> LIs close before the next LI, or the UL.

Unless you have a nested list.
