phreeza
4mo ago
But doesn't this miss exactly the gap OP seems to have, namely going from a next-token predictor (a language model in the classical sense) to an instruction-finetuned, RLHF-ed and "harnessed" tool?
js8
4mo ago
The book has a sequel:
https://www.manning.com/books/build-a-reasoning-model-from-s...
It will give you an answer to the extent anybody can.