Browser Agent Benchmark: Comparing LLM models for web automation

(browser-use.com)

13 points · MagMueller · 4mo ago · 5 comments

pixel_popping · 4mo ago · 2 in thread

The benchmark is missing the best model (Opus 4.5), though.

Yeah but then their own product might not score the highest.

Exactly why I'm pointing it out, which feels a bit corrupt, but understandable.

Since we're on this topic, can anyone suggest a good AI-based tool for exploratory (fuzzy?) web testing?
