shahules | Better HN

shahules

131 karma | Joined October 6, 2021 | 46 submissions

Recent submissions

1

Cloning Bench: Evaluating AI Agents on Visual Website Cloning

(github.com)

2 points | shahules | 2mo ago | 1 comment

2

PA Bench: Evaluating web agents on real-world personal assistant workflows

(vibrantlabs.com)

38 points | shahules | 4mo ago | 9 comments

3

PA Bench: Evaluating Frontier Models on Multi-Tab PA Tasks

(vibrantlabs.com)

7 points | shahules | 4mo ago | 1 comment

4

Show HN: Ragas – Open-source library for evaluating RAG pipelines

(github.com)

121 points | shahules | 2y ago | 26 comments

5

Show HN: Ragas – Open-source library for evals and testing RAG systems

(github.com)

15 points | shahules | 2y ago | 9 comments

6

Show HN: The rise of open source large language models

(explodinggradients.com)

5 points | shahules | 3y ago | 0 comments

7

Show HN: GPT4 vs. GPT3: What you should know

(explodinggradients.com)

2 points | shahules | 3y ago | 0 comments

8

Show HN: Open-source alternative to Adobe speech enhancer

(github.com)

3 points | shahules | 3y ago | 0 comments