If your threat model is "someone cloned the database and can now perform unlimited attacks against the stored passwords", then yeah, word + digit protects just about nothing. Assuming a lexicon of 5,000 words, word+digit gives you about 50,000 variations to try. Say that L337 substitutions give you another 10x factor, so now you have 500,000 candidates for what the password might be. Now let's assume that instead of the stupid crap they did in this video, the folks storing your password did everything right and used bcrypt with a work factor of 12. A cracking rig from a couple of years ago can run something like 10,000 hashes per second under these conditions, so it might take a whole minute to discover your password. (Remember, this is if they did it right; most other password storage schemes would yield your password in a fraction of a second.)
Or, we could look at the two-words-separated-by-punctuation case. Same 5,000-word lexicon, maybe 10 different symbols likely to show up between the words. Call that ~250,000,000 possibilities for your password. At the same hash rate, that'll take up to a day to crack. A day is a long time to spend on one password, but maybe they don't have anything better to do. Maybe they hate you personally. Add another word and suddenly the hackers need years per password, which is obviously uneconomical.
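If you want to check the arithmetic yourself, here's a rough sketch in Python. Every constant is one of the ballpark guesses from above (lexicon size, symbol count, leet multiplier, hash rate), not a measurement, and the names are made up for illustration. The "three words" case assumes you just tack one more word on with no extra separator, which is my reading of "add another word".

```python
# Back-of-the-envelope numbers from the comment above; all are rough assumptions.
LEXICON = 5_000          # words the attacker's dictionary covers
DIGITS = 10              # one trailing digit, 0-9
LEET_FACTOR = 10         # guessed multiplier for L337 substitutions
SYMBOLS = 10             # separators likely to show up between words
HASHES_PER_SEC = 10_000  # assumed rate for one rig against bcrypt, work factor 12


def worst_case_seconds(candidates: int, rate: int = HASHES_PER_SEC) -> float:
    """Seconds needed to try every candidate against a single stored hash."""
    return candidates / rate


# Size of the search space for each password scheme.
schemes = {
    "word + digit (with leet)": LEXICON * DIGITS * LEET_FACTOR,
    "word + symbol + word": LEXICON * SYMBOLS * LEXICON,
    "three words, one symbol": LEXICON * SYMBOLS * LEXICON * LEXICON,
}

for name, candidates in schemes.items():
    secs = worst_case_seconds(candidates)
    print(f"{name}: {candidates:,} candidates, "
          f"{secs:,.0f} s = {secs / 3600:,.1f} h = {secs / 86400 / 365:,.2f} y")
```

With these guesses the worst case comes out to 50 seconds, roughly seven hours, and about four years respectively. On average the attacker hits the right guess about halfway through, so halve those if you're feeling pessimistic.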
These guidelines don't come out of nowhere, and there isn't really a tower of experts somewhere giggling at the unwashed idiots around them (well, there might be, but I wasn't invited). This is just one of many problems in computing that live around the intersection of math and psychology, where the "natural" thing to do is (unintuitively) quite dangerous.