1. Matrix Multiplications on GPUs Run Faster When Given Predictable Data (thonking.ai) | 4 | chillee | 2y ago | 0
3. Supporting Mixtral in GPT-fast through torch.compile (thonking.substack.com) | 1 | chillee | 2y ago | 0