ag8 on Hacker News

1

Gourmand Syndrome

(en.wikipedia.org)

27 points | ag8 | 4mo ago | 9 comments

2

guys why does armenian completely break Claude

(twitter.com)

99 points | ag8 | 5mo ago | 65 comments

3

Sampling at negative temperature

(cavendishlabs.org)

203 points | ag8 | 5mo ago | 60 comments

4

Perfectly Replicating Coca Cola [video]

(youtube.com)

1 point | ag8 | 5mo ago | 1 comment

5

Po.ta.to

(po.ta.to)

4 points | ag8 | 7mo ago | 2 comments

6

Scaling pretraining affects RL sample efficiency

(runrl.com)

1 point | ag8 | 8mo ago | 0 comments

7

Systematically generating tests that would have caught Anthropic's top‑K bug

(theorem.dev)

2 points | ag8 | 8mo ago | 0 comments

8

Tinker

(2b4fdb18.connectionism.pages.dev)

4 points | ag8 | 8mo ago | 2 comments

9

Training Qwen to answer briefly yet intelligently using feedback control

(runrl.com)

4 points | ag8 | 9mo ago | 0 comments

10

Launch HN: RunRL (YC X25) – Reinforcement learning as a service

(runrl.com)

71 points | ag8 | 9mo ago | 22 comments

11

Generating the Funniest Joke with RL

(runrl.com)

1 point | ag8 | 1y ago | 0 comments

12

Gravity Chess

(gravity-chess.andrew.gr)

2 points | ag8 | 1y ago | 1 comment

13

Please Show Lots of Digits

(dynomight.net)

32 points | ag8 | 1y ago | 17 comments

14

Implementing an SHA Transformer by Hand

(andrew.gr)

2 points | ag8 | 2y ago | 0 comments

15

Rebus: A Robust Evaluation Benchmark of Understanding Symbols

(arxiv.org)

1 point | ag8 | 2y ago | 0 comments
