2. Mamba Explained: The State Space Model Taking On Transformers (kolaayonrinde.com) | 270 | koayon | 2y ago | 93
3. The Frontier of Adaptive Computation in Machine Learning (github.com) | GitHub | 1 | koayon | 2y ago | 0
4. DeepSpeed's Bag of Tricks for Training Large Models (kolaayonrinde.com) | 1 | koayon | 2y ago | 0