7. Systematically generating tests that would have caught Anthropic's top‑K bug (theorem.dev) | 2 points | ag8 | 7mo ago | 0 comments
9. Training Qwen to answer briefly yet intelligently using feedback control (runrl.com) | 4 points | ag8 | 7mo ago | 0 comments
10. Launch HN: RunRL (YC X25) – Reinforcement learning as a service (runrl.com) | 71 points | ag8 | 7mo ago | 22 comments