lieret | Better HN

lieret

23 karma · Joined July 24, 2025 · 15 submissions

Recent submissions

1

Show HN: New Benchmark from SWE-bench team is 0% solved

(programbench.com)

23 points · lieret · 4d ago · 3 comments

2

Show HN: All the LM solutions on SWE-bench are bloated compared to humans

(twitter.com)

1 point · lieret · 2mo ago · 0 comments

3

Show HN: New eval from SWE-bench team evaluates LMs based on goals, not tickets

(codeclash.ai)

5 points · lieret · 6mo ago · 1 comment

4

Show HN: Randomly switching between LMs at every step boosts SWE-bench score

(swebench.com)

5 points · lieret · 8mo ago · 1 comment

5

GPT-5 on SWE-bench: Cost and performance deep-dive

(mini-swe-agent.com)

4 points · lieret · 9mo ago · 3 comments

6

Show HN: New SWE-bench leaderboard compares LMs without fancy agent scaffolds

(swebench.com)

2 points · lieret · 9mo ago · 0 comments

7

Show HN: Mini-swe-agent achieves 65% on SWE-bench in 100 lines of Python

(github.com)

7 points · lieret · 9mo ago · 4 comments