t55 on Hacker News

1. RL Speedrun (github.com)
   2 points by t55 | 5d ago | 0 comments

2. Target Policy Optimization (arxiv.org)
   1 point by t55 | 2mo ago | 0 comments

3. Show HN: Kilroy – Knowledge base for teams using Claude Code (github.com)
   5 points by t55 | 2mo ago | 0 comments

4. Procedural Reasoning Datasets (github.com)
   1 point by t55 | 10mo ago | 0 comments

5. In Defence of Gary Marcus (reubenadams.substack.com)
   3 points by t55 | 11mo ago | 0 comments

6. Reasoning Gym – Procedural RL reasoning datasets (github.com)
   1 point by t55 | 11mo ago | 0 comments

7. ChatGPT Agent [video] (youtube.com)
   3 points by t55 | 11mo ago | 0 comments

8. ReasoningGym: Reasoning Environments for RL with Verifiable Rewards (arxiv.org)
   105 points by t55 | 1y ago | 28 comments

9. Show HN: Rehearsal.so, Duolingo for Public Speaking (rehearsal.so)
   3 points by t55 | 1y ago | 1 comment

10. End-to-End Vision Tokenizer Tuning (arxiv.org)
    3 points by t55 | 1y ago | 0 comments

11. YC Interview Mock Practice (rehearsal.so)
    2 points by t55 | 1y ago | 0 comments

12. D1: Scaling Reasoning in Diffusion LLMs via Reinforcement Learning (dllm-reasoning.github.io)
    4 points by t55 | 1y ago | 0 comments

13. Are LLMs more than autocomplete? AI Debate (rehearsal.so)
    1 point by t55 | 1y ago | 0 comments

14. Block Diffusion: Interpolating Autoregressive and Diffusion Language Models (m-arriola.com)
    72 points by t55 | 1y ago | 16 comments

15. How to stay in flow while using Cursor or Windsurf (rehearsal.so)
    2 points by t55 | 1y ago | 0 comments

