1. Carbon: Autoregressive Genomic Foundation Model (huggingface.co) | 7 | kashifr | 1mo ago | 1
2. The ultimate guide to RL environments: building and scaling them in the LLM era (huggingface.co) | 7 | kashifr | 1mo ago | 0
4. Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries (huggingface.co) | 2 | kashifr | 3mo ago | 0
6. The Smol Training Playbook: The Secrets to Building World-Class LLMs (huggingface.co) | 265 | kashifr | 7mo ago | 19
7. Unlocking On-Policy Distillation for Any Model Family (huggingface.co) | 6 | kashifr | 7mo ago | 1
9. Smollm3: Smol, multilingual, long-context reasoner LLM (huggingface.co) | 388 | kashifr | 11mo ago | 79
11. AIMO (AI Math Olympiad) progress prize winning solution (huggingface.co) | 9 | kashifr | 1y ago | 0
12. MaPO: A reference-free alignment technique for diffusion models (mapo-t2i.github.io) | 2 | kashifr | 2y ago | 1
13. OpenHermesPreferences: Dataset of ~1M AI preferences from teknium/OpenHermes-2.5 (huggingface.co) | 7 | kashifr | 2y ago | 1