
gjm11 14y ago

Be aware that adding to the length simply by taking more of the lyrics adds very little entropy. If an attacker is already trying "Oh say can you see", it doesn't take many extra bits to also try "Oh say can you see by the dawn's early light what so proudly we hailed at the twilight's last gleaming".

Similarly, extended passages of text -- even if they don't come from a restricted corpus like that of song lyrics -- have less entropy than you'd think. A smaller number of independent random words is likely to be a better tradeoff.


benmathes 14y ago

I can see your point in that the Kolmogorov complexity of two lines in a song isn't much larger than that of one line. Similarly, 30 digits of pi and 300 digits of pi have very little difference in Kolmogorov complexity.

What I don't know is if state-of-the-art password guessers are great at recognizing larger patterns in the entire canon of human knowledge. I.e. is there a "common phrases" attack that's analogous to a "dictionary attack"?

jfriedly 14y ago

Google released the world's largest corpus and did us a favor by analyzing it for n-grams. For example, they found that the phrase "serve as the initial" was over 100 times more common than the phrase "serve as the insurance". [1] For $150 you can buy the 24GB data set yourself, so it's a fair assumption that makers of password crackers could reliably guess common phrases first. [2]

[1] http://googleresearch.blogspot.com/2006/08/all-our-n-gram-ar...
[2] http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=...

singlow 14y ago

If these types of passwords become popular, brute force crackers will build dictionaries of well known phrases.

spydum 14y ago

That may be true, but we still end up better off. The compute time for the password cracker has gone up quite a bit, making it a more expensive endeavor (they've got to build dictionaries for both well-known phrases and passwords with fuzzing). It doesn't solve the problem, but it's a start in the right direction (away from fuzzing of dictionary words, which is clearly bad for human memory and good for password crackers' efficiency).

However, when using randomly chosen dictionary words to build phrases (not well known), the entropy shoots well above the level of being reasonable to crack in a lifetime.
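To make "randomly chosen dictionary words" concrete, here is a minimal sketch. It assumes a word list is already loaded; the `WORDS` list below is a hypothetical stand-in made of placeholder strings, and `make_passphrase` is an illustrative name, not something from the thread. The important part is that each word is drawn independently with a cryptographic random source rather than picked by a human.

```python
import secrets

# Hypothetical stand-in for a real dictionary: 2048 placeholder "words".
# In practice this would be loaded from an actual word list.
WORDS = [f"word{i}" for i in range(2048)]

def make_passphrase(wordlist, n_words=4):
    """Draw n_words independently and uniformly at random from wordlist."""
    # secrets.choice uses the OS CSPRNG, unlike random.choice.
    return " ".join(secrets.choice(wordlist) for _ in range(n_words))

print(make_passphrase(WORDS))
```

Because the picks are uniform and independent, knowing the word list and the method doesn't help an attacker beyond having to search the whole space.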

nait 14y ago

Granted, knowing that parts of a password come from known sources (pi, War and Peace, song lyrics, etc.) drastically reduces the number of possible solutions. But how would an attacker figure out the first part of such a password? What comes to mind are timing attacks (http://en.wikipedia.org/wiki/Timing_attack). What other possibilities did I miss?

EDIT: I get that having a long streak of my pass in a dictionary would reduce overall security but it's still unclear how a partial match in the dictionary would be detected.

ScottBurson 14y ago

But there's a long tail of song lyrics. If you pick something obscure, the odds of the attacker even having heard of it become very small (particularly if the attacker is from a different culture than your own). Pick something arty and incomprehensible, and the odds against someone else accidentally stringing those words together in some other context become astronomical.

For instance, I'd wager no cracker has ever heard the song containing the line "We barter images on the matrix". And that's one of the more intelligible lines from the song in question (from a 1978 album by the little-known prog-rock group Happy The Man). Pull it up on Google and you'll see what I mean.

If you don't know the song, of course, lines from it will be about as hard to remember as randomly chosen words. But if you do know it, you have a good mnemonic.

MichaelGagnon 14y ago

This gets into the whole "security through obscurity" thing. Ideally, you should use a password-generation system such that even if the attacker knows your password-generation system (e.g. lines from songs), it would still be infeasible to guess your actual password.

That's why the 4-random-words technique is good. According to XKCD, the 4-random-words technique generates about 17 trillion passwords, all equally likely.

But even with a long tail, song-lyric passwords rely on obscurity. I imagine there are far fewer than 17 trillion songs to choose from. And if the attacker knew some information about you (say from looking at your Facebook profile or your search history), I'm sure that could drastically narrow the search space.
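For anyone wondering where the 17 trillion comes from, here's the arithmetic as a quick sketch. It assumes a 2048-word list (11 bits per word), which is consistent with the figure quoted above; the constant names are just illustrative.

```python
import math

WORDLIST_SIZE = 2048  # assumption: 2**11 words, i.e. 11 bits per word
N_WORDS = 4

# Every word is an independent uniform pick, so the counts multiply.
total = WORDLIST_SIZE ** N_WORDS
bits = math.log2(total)

print(f"{total:,} equally likely passphrases ({bits:.0f} bits)")
```

A song-lyric scheme would need a comparably large set of equally likely candidates to match this, and lyric choices are neither that numerous nor equally likely.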

JonnieCache 14y ago

The answer is obviously to write your own song or poem and not tell anyone about it. A passpoem, perhaps in the style of Lewis Carroll.

LXicon 14y ago

there might not be 17 trillion songs, but you aren't limited to the first 4 words of the song. there might be 100-300 words per song and you can pick your starting word anywhere you like.
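A rough way to size this argument: assume the attacker knows the scheme (a fixed-length run of words starting anywhere in a song) and treats every starting position in every song as one equally likely candidate. Both are simplifying assumptions, and `songs_needed` is a hypothetical helper, but it gives a feel for how many songs it would take to match the 4-random-words search space.

```python
# Search space of 4 random words from an assumed 2048-word list (about 17 trillion).
TARGET = 2048 ** 4

def songs_needed(starts_per_song, target=TARGET):
    """Songs required so that songs * starting positions >= target."""
    # Integer ceiling division, avoids float rounding on large numbers.
    return -(-target // starts_per_song)

# 100 and 300 are the words-per-song figures from the comment above.
for starts in (100, 300):
    print(f"{starts} starts per song -> {songs_needed(starts):,} songs")
```

Even at 300 starting positions per song, that comes out to tens of billions of songs, so the extra starting positions help but don't close the gap on their own.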


alanh 14y ago

What happens when your obscure song makes the soundtrack for a hit movie next summer?

akronim 14y ago

given how prone people are to mis-hearing song lyrics, the corpus isn't the full text of all published song lyrics as you suggest.
