Better HN
Achieving 3X speedups on Google TPUs with diffusion-style speculative decoding