1. Accelerating LLM Inference with Parallel Draft Models (PARD) (amd.com) | 1 point | dhruvdh | 1y ago | 0 comments
2. Open-sourcing Three EXAONE 3.5 Models: 2.4B, 7.8B, 32B (lgresearch.ai) | 13 points | dhruvdh | 1y ago | 4 comments
3. The Tyranny of Possibilities in the Design of Task-Oriented LLM Systems (arxiv.org) [arXiv] | 1 point | dhruvdh | 2y ago | 3 comments
4. MoAI Platform – Scale PyTorch, TensorFlow, etc. to Thousands of GPU/NPUs (moreh.io) | 1 point | dhruvdh | 2y ago | 0 comments
5. Lamini LLM Finetuning on AMD ROCm: A Technical Recipe (lamini.ai) | 6 points | dhruvdh | 2y ago | 4 comments
6. ModuleFormer: Modularity Emerges from Mixture-of-Experts (arxiv.org) [arXiv] | 1 point | dhruvdh | 2y ago | 1 comment