7. Systematically generating tests that would have caught Anthropic's top‑K bug (theorem.dev) | 2 points by ag8 5mo ago | 0 comments
9. Training Qwen to answer briefly yet intelligently using feedback control (runrl.com) | 4 points by ag8 6mo ago | 0 comments
10. Launch HN: RunRL (YC X25) – Reinforcement learning as a service (runrl.com) | 71 points by ag8 6mo ago | 22 comments