New DeepSeek paper: Natively Trainable Sparse Attention mechanism (twitter.com), submitted by redlock, 1y ago