1. AVX-512: First Impressions on Performance and Programmability (shihab-shahriar.github.io) | 125 | shihab | 5mo ago | 53
2. Mojo: MLIR-Based Performance-Portable HPC Science Kernels on GPUs (arxiv.org) | arXiv | 6 | shihab | 8mo ago | 0
3. Efficient and Portable Mixture-of-Experts Communication (perplexity.ai) | 1 | shihab | 1y ago | 0
4. Chris Lattner - Democratizing AI Compute: What about AI Compilers (TVM and XLA)? (modular.com) | 2 | shihab | 1y ago | 0
5. Democratizing AI Compute, Part 5: What about CUDA C++ alternatives? (modular.com) | 2 | shihab | 1y ago | 0