1. Attention at Constant Cost per Token via Symmetry-Aware Taylor Approximation (arxiv.org) | 164 | fheinsen | 3mo ago | 96
2. A non-diagonal SSM RNN computed in parallel without requiring stabilization (github.com) | 9 | fheinsen | 7mo ago | 1