
kcorbitt · 1y ago · 0 points · 0 comments

To be honest, I don't expect the performance to generalize to other task types with this specific training regime. If we had a panel of, say, 30 logic puzzles and cross-trained against all of them simultaneously, it might, though.

I think there's a lot of benefit to discovering a training regime that allows small specialized models to do extremely well in one narrow task; if we can figure out how to make small models that beat SOTA on a specific task and are cheap to train and run, that's in some ways a more useful outcome than a very large model that is good at many tasks (but is more expensive to run for each of them).


3 comments · 2 top-level

ekidd · 1y ago · 1 in thread

Once the problem gets narrow enough, do you risk training a model that reinvents a straightforward classic algorithm at far higher cost?

bradhilton · 1y ago

Well, in this case there is a much more straightforward method with the same CP-SAT solver used to create the puzzles. This is more of a fun experiment to see if we can train LLMs to solve these kinds of logical deduction problems.
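For a sense of how small the classic approach can be, here is a minimal sketch of a brute-force solver for a toy deduction puzzle. It is not the CP-SAT setup mentioned above: the people, items, and clues are invented for illustration, and plain enumeration stands in for a real constraint solver.

```python
from itertools import permutations

# Hypothetical toy puzzle: each person holds exactly one distinct item.
PEOPLE = ("Alice", "Bob", "Carol")
ITEMS = ("knife", "rope", "wrench")

# Each clue is a predicate over a candidate assignment {person: item}.
# These clues are made up for this example.
CLUES = [
    lambda a: a["Alice"] != "knife",
    lambda a: a["Carol"] != "knife",
    lambda a: a["Alice"] != "rope",
]


def solve(people, items, clues):
    """Return every one-to-one assignment of items to people that satisfies all clues."""
    solutions = []
    for perm in permutations(items):
        assignment = dict(zip(people, perm))
        if all(clue(assignment) for clue in clues):
            solutions.append(assignment)
    return solutions


solutions = solve(PEOPLE, ITEMS, CLUES)
print(solutions)
```

Enumeration like this only scales to tiny puzzles, which is where a proper constraint solver earns its keep, but either way the cost is far below training a model for the same task.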

shinryuu · 1y ago

The question to me is whether you can call that deduction in that case. Isn't it just a type of pattern matching that fits this particular task?
