0 points · ImJasonH · 7mo ago · 0 comments

Is anybody working on making building specialized things easier and cheaper?

0 comments

-_- · 7mo ago

Yes! At https://RunRL.com we offer hosted RL fine-tuning, so all you need to provide is a dataset and reward function or environment.
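For readers new to the term, a reward function in RL fine-tuning is usually just a function that scores a model's completion. Here is a minimal sketch in plain Python. The name `reward_fn`, its arguments, and the scoring rule are all hypothetical and are not taken from RunRL's actual API:

```python
import re


def reward_fn(prompt: str, completion: str, expected: str) -> float:
    """Hypothetical reward: 1.0 if the last number in the completion
    equals the expected answer, otherwise 0.0."""
    # Collect every integer or decimal number that appears in the completion.
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion)
    if not numbers:
        return 0.0
    # Treat the final number as the model's answer.
    return 1.0 if numbers[-1] == expected.strip() else 0.0
```

A hosted service would presumably call something like this once per sampled completion and use the score as the training signal, though the exact interface will differ by provider.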

selim-now · 7mo ago

Yes! Check out https://distillabs.ai/. It follows a similar approach, except that the evaluation set is held out before the synthetic data generation, which I would argue makes it more robust. (I'm affiliated.)
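The point about holding out the evaluation set first can be sketched roughly as follows. This is an illustration of the general idea only, not distil labs' actual pipeline. `split_then_augment` and the `generate_synthetic` callback are hypothetical names:

```python
import random


def split_then_augment(examples, generate_synthetic, eval_fraction=0.2, seed=0):
    """Hold out an evaluation set first, then generate synthetic data
    only from the remaining examples."""
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)

    # Reserve the evaluation set before any synthetic generation happens.
    n_eval = max(1, int(len(shuffled) * eval_fraction))
    eval_set = shuffled[:n_eval]
    seed_set = shuffled[n_eval:]

    # Synthetic examples derive only from seed_set, so nothing based on
    # eval_set can leak into the training data.
    synthetic = [s for ex in seed_set for s in generate_synthetic(ex)]
    return seed_set + synthetic, eval_set
```

If the split happened after generation instead, synthetic variants of an evaluation example could end up in the training set and inflate the measured score.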
