Kat-Dev-32B, Kat-Coder with Scalable Agentic RL

(kwaipilot.github.io)

1 point · robert-zaremba · 8mo ago · 1 comment


KAT-Dev-32B and KAT-Coder are optimized through several training stages: a mid-training stage, a supervised fine-tuning (SFT) and reinforcement fine-tuning (RFT) stage, and a large-scale agentic reinforcement learning (RL) stage.
