OpenAI: support for Reinforcement Fine-tuning available to verified orgs
(twitter.com)
1 point
justanotheratom
1y ago
1 comment
justanotheratom
OP
1y ago
My question for anyone who knows: between SFT, DPO, and RFT,
- when should each be used?
- can we mix and match, e.g., first SFT, then DPO?