MiniMax M3 Benchmarks One Pager

(filecdn.minimax.chat)

3 points · kirtivr22d ago · 1 comment

1 comment · 1 top-level

Some highly relevant benchmarks, such as Humanity's Last Exam and long-context reasoning (MRCR at 128K-256K), are not included.

Overall, this seems to be a strong agent-oriented model. Which benchmarks most closely track a model's coding performance in the real world?
