3. Show HN: RULER – Easily apply RL to any agent (openpipe.ai) | 81 points by kcorbitt | 11mo ago | 11 comments
5. Show HN: ART – a new open-source RL framework for training agents (github.com) | 116 points by kcorbitt | 1y ago | 12 comments
6. ART·E: how we built an email research agent that beats o3 (openpipe.ai) | 3 points by kcorbitt | 1y ago | 2 comments
7. Using GRPO to Beat o1, o3-mini and R1 at “Temporal Clue” (openpipe.ai) | 199 points by kcorbitt | 1y ago | 55 comments
8. Analyzing OpenAI's Reinforcement Fine-Tuning: Less Data, Better Results (openpipe.ai) | 4 points by kcorbitt | 1y ago | 0 comments
9. Using reinforcement learning and $4.80 of GPU time to find the best HN post (openpipe.ai) | 217 points by kcorbitt | 1y ago | 95 comments
10. Show HN: Agent.exe, a cross-platform app to let 3.5 Sonnet control your machine (github.com) | 406 points by kcorbitt | 1y ago | 232 comments
12. OpenPipe Mixture of Agents: Outperform GPT-4 at 1/25th the Cost (openpipe.ai) | 13 points by kcorbitt | 2y ago | 2 comments