3. TournO: Tournament Optimization for Non-Verifiable RL (github.com) | 3 points | leonardtang | 3mo ago | 0 comments
4. j1-micro and j1-nano: Tiny (0.6B, 1.7B) and Mighty Reward Models (github.com) | 3 points | leonardtang | 1y ago | 0 comments
5. Verdict: A Library for Scaling Judge-Time Compute (twitter.com) | 3 points | leonardtang | 1y ago | 0 comments
8. Cascade: A fast, automated, multi-turn LLM jailbreaking method (twitter.com) | 2 points | leonardtang | 1y ago | 0 comments
13. Sphynx: Fuzz Testing Hallucination Detection Models (github.com) | 2 points | leonardtang | 1y ago | 0 comments
15. Thorn in a HaizeStack test for evaluating long-context adversarial robustness (github.com) | 19 points | leonardtang | 2y ago | 11 comments