1. XGrammar: Efficient, Flexible and Portable Structured Generation for LLM | github.com | 12 | ruihangl | 1y ago | 1
2. High-Throughput Low-Latency LLM Serving with MLCEngine | blog.mlc.ai | 8 | ruihangl | 1y ago | 0
4. Run Llama2-70B in Web Browser with WebGPU Acceleration | webllm.mlc.ai | 9 | ruihangl | 2y ago | 6