Apriel-H1: Towards Efficient Enterprise Reasoning Models
(arxiv.org)
1 point
guiriduro
6mo ago
1 comment
guiriduro
OP
6mo ago
Apriel-H1-15b-Thinker-SFT uses incremental distillation from Apriel-Nemotron-15B-Thinker, selectively replacing less critical attention layers with linear Mamba blocks to reduce computational complexity while preserving reasoning quality.
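To make the idea concrete, here is a rough sketch of how "selectively replacing less critical attention layers" in stages might be planned. It is not the paper's procedure: the per-layer importance scores, the function name `plan_incremental_replacement`, and the `per_stage` batching are all assumptions made up for illustration, and the distillation step itself is only marked by a comment.

```python
from typing import Dict, List


def plan_incremental_replacement(
    importance: Dict[int, float],
    total_to_replace: int,
    per_stage: int,
) -> List[List[str]]:
    """Return the layer layout after each replacement stage.

    `importance` maps layer index (0..n-1) to a hypothetical importance
    score; lower means the attention layer is considered less critical.
    """
    layout = ["attention"] * len(importance)
    # Least important layers first; ties are broken by layer index.
    order = sorted(importance, key=lambda i: (importance[i], i))[:total_to_replace]

    stages: List[List[str]] = []
    for start in range(0, len(order), per_stage):
        for idx in order[start:start + per_stage]:
            layout[idx] = "mamba"
        # Assumption: a real pipeline would run a distillation pass against
        # the teacher model here before moving on to the next stage.
        stages.append(list(layout))
    return stages


if __name__ == "__main__":
    # Made-up scores for a 4-layer toy model.
    scores = {0: 0.9, 1: 0.1, 2: 0.5, 3: 0.2}
    for step, layers in enumerate(plan_incremental_replacement(scores, 2, 1), 1):
        print(step, layers)
```

Replacing a few layers at a time, rather than all at once, gives the student a chance to recover quality after each swap, which is presumably the point of the "incremental" part.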