1. High Performance Distributed Inference with Ray Serve LLM (anyscale.com) | 3 | robertnishihara | 7d ago | 0
2. Data Processing Is Becoming a GPU Workload (anyscale.com) | 2 | robertnishihara | 9d ago | 0
3. 67% Cost Savings with PD Disaggregation Using Ray and vLLM on AMD MI325X (anyscale.com) | 4 | robertnishihara | 9d ago | 0
4. Major upgrades to Ray Serve: 88% lower latency and 11.1x higher throughput (anyscale.com) | 2 | robertnishihara | 3mo ago | 1
5. SkyRL brings Tinker to your GPUs (2025) (novasky-ai.notion.site) | 24 | robertnishihara | 4mo ago | 5
6. vLLM large scale serving: DeepSeek 2.2k tok/s/h200 with wide-ep (blog.vllm.ai) | 147 | robertnishihara | 5mo ago | 54
7. Massively Parallel Agentic Simulations with Ray (anyscale.com) | 2 | robertnishihara | 9mo ago | 0
8. Deploy DeepSeek‑R1 with VLLM and Ray Serve on Kubernetes (anyscale.com) | 1 | robertnishihara | 10mo ago | 0
9. An Open Source Stack for AI Compute: Kubernetes and Ray and PyTorch and VLLM (anyscale.com) | 1 | robertnishihara | 10mo ago | 0
10. Native LLM APIs in Ray Data and Ray Serve (anyscale.com) | 2 | robertnishihara | 11mo ago | 0
12. AsyncFlow: An Asynchronous Streaming RL Framework for LLM Post-Training (arxiv.org) | 4 | robertnishihara | 11mo ago | 0
14. Large-Scale Deployment of Ray in Tencent's Weixin AI Infrastructure (anyscale.com) | 2 | robertnishihara | 11mo ago | 0
15. Uv and Ray: Pain-Free Python Dependencies in Clusters (anyscale.com) | 44 | robertnishihara | 0y ago | 10