1. Absolute Zero: Reinforced Self-Play Reasoning with Zero Data (arxiv.org) | 88 points | leodriesch | 1y ago | 19 comments
2. Does RL Incentivize Reasoning in LLMs Beyond the Base Model? (limit-of-rlvr.github.io) | 84 points | leodriesch | 1y ago | 38 comments
4. Grok, an AI Modeled After the Hitchhiker's Guide to the Galaxy (twitter.com) | 5 points | leodriesch | 2y ago | 2 comments
5. The Rome tools project is officially discontinued (twitter.com) | 4 points | leodriesch | 2y ago | 0 comments
7. GPT-4 Code Interpreter model is much better than GPT4 (twitter.com) | 1 point | leodriesch | 2y ago | 0 comments