1. Pushing memory bound CUDA kernels past the speed of light with data compression (fergusfinn.com) · 2 · somnial · 28d ago · 0
2. Speculative KV coding: ~4× losslessly compressed KV cache using a small model (fergusfinn.com) · 2 · somnial · 1mo ago · 0
4. LLM powered data structures: A lock-free binary search tree (fergusfinn.com) · 1 · somnial · 5mo ago · 0
5. Parallel Primitives for Multi-Agent Workflows (fergusfinn.com) · 1 · somnial · 5mo ago · 0