Regex is fine for grabbing something in a page that you have looked at yourself.
When you are parsing millions of pages you don't have that option. You need something robust: a tool that is flexible, that doesn't barf out too many errors, and that is quick.
The SO Meta community is such a bunch of bullies that I now hardly go there (even though I recently found two bugs which I did not bother to post). In contrast, the regular channels are pragmatically helpful (pragmatically because you still need to offer some sacrifices to the gods, called "what effort have you put into the question", and suffer some psychotic downvoters). It is interesting to see that both populations are composed of the same individuals, who seem to have a personality flip when switching channels.
I would be interested someday to learn about the dynamics of such groups. There are plenty of places on the Internet populated by mentally deranged participants (cowards hiding behind the Internet), but the SE Meta ones are, I believe, more educated / intelligent on average and, sometimes, more traceable.
I'd even complained about the reps being separate, pointing out how this gave power on meta to people who didn't even necessarily contribute on the main site. A bunch of high meta-rep users descended, simultaneously shooting down the idea of merging the reps while admitting that the main reason they liked the status quo was that they didn't want to lose their precious karma. I was pretty surprised when SO eventually fixed this, but it's kind of too little, too late. I don't bother with meta anymore despite now having a high rep on it. Too many bad memories.
Anyway, I find it sad that they are losing some possibly useful feedback in the name of self-adoration, particularly because SE is a fantastic source of knowledge. Just reading the Hot Topics made me learn about subjects I had never looked at.
While you obviously can't match arbitrary HTML with a regex (because arbitrary levels of nested elements require a stack-based parser), can you not match HTML tags with a regex? It seems to me that it should be possible, since you always have the pattern '<' followed by the name of the tag, followed by zero or more "key=quoted-val" attributes, and finally a '>' token.
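To make that pattern concrete, here is a minimal Python sketch of matching one open tag. It assumes exactly the simplified grammar described above: every attribute is written as key="value" or key='value'. Real HTML also allows unquoted values, valueless attributes, comments and so on, which this deliberately ignores. The names OPEN_TAG, ATTR and parse_open_tag are made up for illustration.

```python
import re

# '<', a tag name, zero or more key=quoted-val attributes, then '>' (or '/>').
OPEN_TAG = re.compile(
    r"""<([A-Za-z][A-Za-z0-9]*)   # tag name
        ((?:\s+[A-Za-z_:][-\w:.]*\s*=\s*(?:"[^"]*"|'[^']*'))*)   # attributes
        \s*/?>""",
    re.VERBOSE,
)

# One key=quoted-val pair; group 2 holds a double-quoted value, group 3 a single-quoted one.
ATTR = re.compile(r"""([A-Za-z_:][-\w:.]*)\s*=\s*(?:"([^"]*)"|'([^']*)')""")


def parse_open_tag(text):
    """Return (name, attrs) if text starts with a simple open tag, else None."""
    m = OPEN_TAG.match(text)
    if m is None:
        return None
    attrs = {}
    for a in ATTR.finditer(m.group(2)):
        # Exactly one of the two value groups took part in the match.
        attrs[a.group(1)] = a.group(2) if a.group(2) is not None else a.group(3)
    return m.group(1).lower(), attrs
```

Because the values are matched as whole quoted strings, a '>' inside an attribute value does not end the tag early. Anything outside the assumed grammar, such as `<a href=x>`, is simply rejected with None rather than guessed at.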
So, if the question is limited to just how to parse a single open token, then it seems like all of the answers have just decided to echo what they've heard in the past, which is "don't use regular expressions to parse HTML", when the truth is that a real HTML lexer/parser does use regular expressions for creating these "open" and "close" element tokens for the parser.
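A rough Python sketch of that division of labour: a regex only produces "open" and "close" tokens, and a stack handles the nesting that a regex alone cannot. This is not how any particular HTML parser is implemented. TOKEN, VOID and is_balanced are hypothetical names, the VOID set is a partial list, and the token regex assumes no '<' or '>' appears inside attribute values.

```python
import re

# Lexer: one regex that recognises an open or close tag and captures its name.
TOKEN = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)(?:\s[^<>]*)?/?>")

# A few elements that never take a closing tag (partial list, for illustration).
VOID = {"br", "hr", "img", "input", "link", "meta"}


def is_balanced(html):
    """Check that every open tag is closed in the right order."""
    stack = []
    for m in TOKEN.finditer(html):
        closing = m.group(1) == "/"
        name = m.group(2).lower()
        if name in VOID or m.group(0).endswith("/>"):
            continue  # nothing to push or pop for void / self-closed tags
        if not closing:
            stack.append(name)  # parser side: remember what is open
        elif not stack or stack.pop() != name:
            return False  # close tag with no matching open tag
    return not stack  # leftover entries mean unclosed elements
```

The regex never has to know how deep the nesting goes; it only has to recognise one token at a time. Real HTML is more forgiving than this check (optional end tags, for instance), so treat it as a toy for well-formed input.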
It really falls under the old joke "you have a problem and you decide to solve it with regex, now you have two problems". HTML is very gnarly, and regex is very gnarly. Doesn't mean you can't get shit done if you're aware of the pitfalls.