robertnishihara on Hacker News

1

High Performance Distributed Inference with Ray Serve LLM

(anyscale.com)

3 points by robertnishihara 7d ago | 0 comments

2

Data Processing Is Becoming a GPU Workload

(anyscale.com)

2 points by robertnishihara 9d ago | 0 comments

3

67% Cost Savings with PD Disaggregation Using Ray and vLLM on AMD MI325X

(anyscale.com)

4 points by robertnishihara 9d ago | 0 comments

4

Major upgrades to Ray Serve: 88% lower latency and 11.1x higher throughput

(anyscale.com)

2 points by robertnishihara 3mo ago | 1 comment

5

SkyRL brings Tinker to your GPUs (2025)

(novasky-ai.notion.site)

24 points by robertnishihara 4mo ago | 5 comments

6

vLLM large scale serving: DeepSeek 2.2k tok/s/h200 with wide-ep

(blog.vllm.ai)

147 points by robertnishihara 5mo ago | 54 comments

7

Massively Parallel Agentic Simulations with Ray

(anyscale.com)

2 points by robertnishihara 9mo ago | 0 comments

8

Deploy DeepSeek‑R1 with VLLM and Ray Serve on Kubernetes

(anyscale.com)

1 point by robertnishihara 10mo ago | 0 comments

9

An Open Source Stack for AI Compute: Kubernetes and Ray and PyTorch and VLLM

(anyscale.com)

1 point by robertnishihara 10mo ago | 0 comments

10

Native LLM APIs in Ray Data and Ray Serve

(anyscale.com)

2 points by robertnishihara 11mo ago | 0 comments

11

Joins and Hash-Shuffle in Ray Data

(anyscale.com)

3 points by robertnishihara 11mo ago | 0 comments

12

AsyncFlow: An Asynchronous Streaming RL Framework for LLM Post-Training

(arxiv.org)

4 points by robertnishihara 11mo ago | 0 comments

13

Open Source RL Libraries for LLMs

(anyscale.com)

1 point by robertnishihara 11mo ago | 0 comments

14

Large-Scale Deployment of Ray in Tencent's Weixin AI Infrastructure

(anyscale.com)

2 points by robertnishihara 11mo ago | 0 comments

15

Uv and Ray: Pain-Free Python Dependencies in Clusters

(anyscale.com)

44 points by robertnishihara 0y ago | 10 comments
