What a crossword AI reveals about humans' way with words (opens in new tab)

(wired.com)

70 pointsgsjbjt4y ago49 comments

49 comments

34 comments · 8 top-level

isolli4y ago· 10 in thread

"But, unlike more than 200 human solvers, it wasn’t perfect on all of the puzzles: It got waylaid on two of them and finished with errors."

I'm curious, what kind of error are we talking about? Words that don't exist, or another solution to a problem that may not have a unique solution?

tzs4y ago

I wonder how it does on puzzles where the answers need to be written in unusual ways? Some examples from the New York Times puzzles.

- There was one with a name that suggested an Alice in Wonderland connection, and it had an answer "THE LOOKING GLASS" (no spaces) running vertically down the full length of the center of the grid.

Every across answer that was entirely to the left of that was written normally. Every across answer entirely to the right was written backwards. Every across answer that crossed the center was a palindrome centered on the center.

- There was one where several answers were triple Spoonerisms of well known phrases.

For example, the answer "THE STUCK HOPS BEER" for the clue "Tagline in an ad for Elmer's Glue-Ale". Rotate the ST from STUCK, the H from HOPS, and the B from BEER and you get "THE BUCK STOPS HERE".

- I remember one that had a few isolated black squares, and a theme that suggested the puzzle had something to do with roundabouts.

Those black squares were roundabouts. Answers would hit the roundabout and continue after a 90 degree turn.

- I remember one where the theme was something like "What goes up must come down". That had several answers that, like the roundabout one above, would take a 90 degree turn from across but it was a left turn so after the turn they went up. Then there would be a down answer whose start was the reversed end of that across answer that would come down, also make a left tern, and continue across to the right.

uhoh-itsmaciek4y ago

Those are great. Another memorable NYT one had the theme A SHOT IN THE DARK, and several answers ended in SHOT (RIMSHOT, EARSHOT, etc.), but the grid only had space for the first part of the word, so the actual answers were RIM⬛, EAR⬛, etc. Another had a RISE FROM THE ASHES theme, and had several answers that ended in ASH that did not seem to fit the clue: the actual answer followed vertically where the ASH ending started. E.g., for the clue "Tell it like it is", the answer that fit straight across was "TALKS TRASH", but above the "A", going up, were I, G, H, and T, giving the actual answer "TALK STRAIGHT".

ptero4y ago

Non-existent words should be easy to check for. My guess is answers not (quite) matching the clues. Analogies and cultural references are probably hard for AI to get: is an "empire fighting knight" historical figure or Jedi? One level of such referencing is doable, 2-3 likely very hard.

tasogare4y ago

> Non-existent words should be easy to check for.

I'm interested in your magical solution because this is a hard problem.

1 more reply

banana_giraffe4y ago

A similar thing happens even with human solvers from time to time. It's possible to solve a puzzle with answers that match the clue but aren't what was intended. Of course, the more you let yourself deviate from the clue, the more possibilities exist.

My favorite example of using this is the 1996-11-05, the day of the presidential election, NY Times crossword with the clue "Lead story in tomorrow's newspaper (!), with 43-Across". The puzzle worked with both "CLINTON" or "BOBDOLE" in the crossword.

djrockstar14y ago

Also, as a novice to crosswords, how do puzzle makers ensure that they have a unique solution?

OscarCunningham4y ago

Related, this crossword designed to have two solutions: https://imgur.com/Atd5Dry

bloak4y ago

In a traditional cryptic crossword each individual word has at least two independent things in the clue referring to it, for example, in an unrealistically simple case, "it means X" and "it's an anagram of Y", so each individual answer is very likely to be unique. In addition, alternate letters of each answer are part of another word, so the whole thing is extremely likely to have a unique solution.

However, with a combination of a bad clue and a bit of bad luck it does sometimes happen that an experienced crossword solver has to take a guess before sending in their solution. I'm not an experienced crossword solver, but I've seen them at work, and I would guess that finding a good alternative solution happens less than one time in 50. That's just a guess, though. For a better estimate one should systematically compare the solutions produced by competent crossword solvers with the official solutions.

tsimionescu4y ago

In a crossword puzzle, each word shares letters with at least two other words, likely many more. So, without any effort from the puzzle desogner, it is very unlikely to actually have all of the words have ambiguous enough solutions that match the letter count etc. Even a crossword where half the clues are 'any word' will have a good chance of having a unique solution overall.

Of course, people don't like to play crosswords like sudoku, blindly matching letters, so there does have to be some skill to the clue design to have a relatively small amount of plausible answers, if not being entirely unique.

tclancy4y ago

They don’t necessarily. Some times this is done on purpose at harder difficulties, the clueing is intentionally vague to lead you down blind alleys.

weeblewobble4y ago· 6 in thread

A few years back I wrote a program that, given a blank or partially-filled in (NYT-style) crossword grid (with black squares already inserted), could fill out the rest of the grid with valid words/phrases both across and down. It had no relation to clues though, you had to write the clues yourself after the grid was filled out.

I wrote it because I wanted to make my dad (a huge nyt crossword fan) a custom crossword for his birthday. I put in a bunch of phrases related to him and our family and let the program fill in the rest. It was a huge hit, never really went back to it though. Anyone know if anything else like this exists?

pessimizer4y ago

Don't use that as a final product, use it as a crossword book generator. Selling the books would be a lot more lucrative (or selling the actual puzzles to someone who already prints crosswords.)

Especially if you can generate them from arbitrary lists of of clues:words.

xkeysc0re4y ago

There is Phil, the free crossword maker http://www.keiranking.com/apps/phil/ but I haven't had the best luck with its automated fill. Lately I've been using Crossfire http://beekeeperlabs.com/crossfire/ to make puzzles and its auto fill is quite versatile, even if some of the choices can be iffy (but easily fixed with some manual tweaking). You can even provide your own custom dictionary for it to use in the fill.

weeblewobble4y ago

The Phil auto-fill button doesn't do anything for me. Crossfire looks like it has this feature though, and it's probably a lot faster than mine

peab4y ago

I did the same thing, as a side project when covid first hit. It's a fun and difficult problem to solve efficiently. I remember seeing some videos on YouTube of algorithms that could do it as well.

weeblewobble4y ago

nice, what did you end up going with? I built a "scoring" function for candidates based on how many other words can intersect each letter (based on grid layout and squares already filled) plus a backtracking system. ended up working pretty well, although each grid took >1 minute. I used an NYT historical word list as the dictionary

1 more reply

pwdisswordfish04y ago

I’ve used QXW very successfully.

The trick is to have a good word list… I’m working on one for German, but it’s a bit of an undertaking and I guess pretty subjective.

bvm4y ago· 4 in thread

I would be interested to see how they perform on UK-style cryptic crosswords (https://en.wikipedia.org/wiki/Cryptic_crossword) where there's much more play/subversion/meta-approach to the "rules".

sweezyjeezy4y ago

I think it's pretty unlikely you'll be able to learn the UK rules to a human level using brute force ML - take a fairly easy clue like ‘Drunken men are more despicable' - you need to split this as (wordplay = drunken (anagram of) "men are") / (meaning = "more despicable") = "MEANER"

This is really hard for supervised learning - the reward is quite sparse (e.g. did you get it right / how many characters did you get right) but the task is reasonably complex (e.g. it would have to learn how to spot/execute arbitrary length anagrams on its own, already something that is nontrivial for ML). Sparse + complex usually means gradient descent will fail or converge to a more trivial minima e.g. only look for synonyms in the clue.

I reckon you would have to codify the different types of cryptic clues manually for this work well.

ximeng4y ago

https://www.crosswordgenius.com/ not sure how this is done but it looks pretty good.

https://unlikely.ai/cryptic-crossword-genius-unlikely-ai-art... has a little more detail.

https://crosswordgenius.com/clue/who-may-get-drunk-with-real... example solution with anagram.

2 more replies

jameshart4y ago

You might not need to codify the rules; you could create a tagged training set which includes additional information, like a ‘parts of speech’ breakdown of how a clue relates to an answer (anagramSignifier - anagramMaterial - association, etc.)

An unsupervised learner could probably do reasonably well at picking up on those patterns, even to the point of considering that a word it hasn’t seen used as an anagram signifier before might be doing that job in a particular clue.

On the other hand, machine translation learning has moved away from using tagged parts of speech as far as I’m aware, and it has nonetheless managed to develop sufficient internal modeling that it is as if it has learned parts-of-speech tagging; it’s possible that an ML on a cryptic clue corpus could develop those same hidden models.

1 more reply

r_c_a_d4y ago

Yes, and fewer crossing letters, so there are usually many more words that will "fit".

bvm4y ago· 4 in thread

> The crossword solver is a closed system—it can’t just Google the answers.

yet from the wiki:

> Probabilities for individual words or phrases in the puzzle are computed using relatively simple statistical techniques based on features such as previous appearances of the clue, number of Google hits for the fill. https://en.wikipedia.org/wiki/Dr.Fill

I don't think it's fair to describe it as a closed-system if it's Googling stuff, even for pruning.

karamanolev4y ago

Can't the number of Google hits for all words in the dictionary be enumerated and stored before entry into the competition? Wouldn't that keep it a closed system, even if it uses metrics like that?

wongarsu4y ago

Humans are allowed to prepare however they like, so anything from the real world should be fair game as long as you don't add anything during the solving process itself.

SamBam4y ago

It's not googling stuff live, it's using a pre-computed database.

EGreg4y ago

Sounds like how AlphaGo beat Stockfish or Rybka… massive precomputing vs alphabeta search in real time

1 more reply

wirthjason4y ago· 2 in thread

This makes me curious, how are crossword puzzles made? Clues -> words? Or working backwards from words to clues? Some blend of the two?

banana_giraffe4y ago

The crossword writer I've seen discuss it, generally start with theme words, then fill in the rest of the words, then work on clues.

Then an editor will refine the clues to fit whatever criteria they may have for publication.

ygra4y ago

There was a video next to the article for me that discussed exactly that ... https://youtu.be/aAqQnXHd7qk

dom1114y ago

https://archive.is/OuoeZ

wirthjason4y ago

For some reason only the first paragraph was shown. I had to view it in privacy mode to see the whole article. I wonder if the ad blocker on my iPhone triggered something.

eutectic4y ago

I wonder how well gpt-3 + search would perform as a crossword-solving agent. I guess the tokenization might be tricky to deal with.

j / k navigate · click thread line to collapse

49 comments

34 comments · 8 top-level

isolli4y ago· 10 in thread

"But, unlike more than 200 human solvers, it wasn’t perfect on all of the puzzles: It got waylaid on two of them and finished with errors."

I'm curious, what kind of error are we talking about? Words that don't exist, or another solution to a problem that may not have a unique solution?

tzs4y ago

I wonder how it does on puzzles where the answers need to be written in unusual ways? Some examples from the New York Times puzzles.

- There was one with a name that suggested an Alice in Wonderland connection, and it had an answer "THE LOOKING GLASS" (no spaces) running vertically down the full length of the center of the grid.

- There was one where several answers were triple Spoonerisms of well known phrases.

For example, the answer "THE STUCK HOPS BEER" for the clue "Tagline in an ad for Elmer's Glue-Ale". Rotate the ST from STUCK, the H from HOPS, and the B from BEER and you get "THE BUCK STOPS HERE".

- I remember one that had a few isolated black squares, and a theme that suggested the puzzle had something to do with roundabouts.

Those black squares were roundabouts. Answers would hit the roundabout and continue after a 90 degree turn.

uhoh-itsmaciek4y ago

ptero4y ago

tasogare4y ago

> Non-existent words should be easy to check for.

I'm interested in your magical solution because this is a hard problem.

1 more reply

banana_giraffe4y ago

djrockstar14y ago

Also, as a novice to crosswords, how do puzzle makers ensure that they have a unique solution?

OscarCunningham4y ago

Related, this crossword designed to have two solutions: https://imgur.com/Atd5Dry

bloak4y ago

tsimionescu4y ago

tclancy4y ago

They don’t necessarily. Some times this is done on purpose at harder difficulties, the clueing is intentionally vague to lead you down blind alleys.

weeblewobble4y ago· 6 in thread

pessimizer4y ago

Don't use that as a final product, use it as a crossword book generator. Selling the books would be a lot more lucrative (or selling the actual puzzles to someone who already prints crosswords.)

Especially if you can generate them from arbitrary lists of of clues:words.

xkeysc0re4y ago

weeblewobble4y ago

The Phil auto-fill button doesn't do anything for me. Crossfire looks like it has this feature though, and it's probably a lot faster than mine

peab4y ago

I did the same thing, as a side project when covid first hit. It's a fun and difficult problem to solve efficiently. I remember seeing some videos on YouTube of algorithms that could do it as well.

weeblewobble4y ago

1 more reply

pwdisswordfish04y ago

I’ve used QXW very successfully.

The trick is to have a good word list… I’m working on one for German, but it’s a bit of an undertaking and I guess pretty subjective.

bvm4y ago· 4 in thread

I would be interested to see how they perform on UK-style cryptic crosswords (https://en.wikipedia.org/wiki/Cryptic_crossword) where there's much more play/subversion/meta-approach to the "rules".

sweezyjeezy4y ago

I reckon you would have to codify the different types of cryptic clues manually for this work well.

ximeng4y ago

https://www.crosswordgenius.com/ not sure how this is done but it looks pretty good.

https://unlikely.ai/cryptic-crossword-genius-unlikely-ai-art... has a little more detail.

https://crosswordgenius.com/clue/who-may-get-drunk-with-real... example solution with anagram.

2 more replies

jameshart4y ago

1 more reply

r_c_a_d4y ago

Yes, and fewer crossing letters, so there are usually many more words that will "fit".

bvm4y ago· 4 in thread

> The crossword solver is a closed system—it can’t just Google the answers.

yet from the wiki:

I don't think it's fair to describe it as a closed-system if it's Googling stuff, even for pruning.

karamanolev4y ago

Can't the number of Google hits for all words in the dictionary be enumerated and stored before entry into the competition? Wouldn't that keep it a closed system, even if it uses metrics like that?

wongarsu4y ago

Humans are allowed to prepare however they like, so anything from the real world should be fair game as long as you don't add anything during the solving process itself.

SamBam4y ago

It's not googling stuff live, it's using a pre-computed database.

EGreg4y ago

Sounds like how AlphaGo beat Stockfish or Rybka… massive precomputing vs alphabeta search in real time

1 more reply

wirthjason4y ago· 2 in thread

This makes me curious, how are crossword puzzles made? Clues -> words? Or working backwards from words to clues? Some blend of the two?

banana_giraffe4y ago

The crossword writer I've seen discuss it, generally start with theme words, then fill in the rest of the words, then work on clues.

Then an editor will refine the clues to fit whatever criteria they may have for publication.

ygra4y ago

There was a video next to the article for me that discussed exactly that ... https://youtu.be/aAqQnXHd7qk

dom1114y ago

https://archive.is/OuoeZ

wirthjason4y ago

For some reason only the first paragraph was shown. I had to view it in privacy mode to see the whole article. I wonder if the ad blocker on my iPhone triggered something.

eutectic4y ago

I wonder how well gpt-3 + search would perform as a crossword-solving agent. I guess the tokenization might be tricky to deal with.

j / k navigate · click thread line to collapse