We Benchmarked Claude Code, Codex, Semgrep, CodeQL, Trent on 28 CWE-Bench CVEs

(trent.ai)

6 points · geopsist · 28d ago · 2 comments

2 comments · 2 top-level

Looks interesting. LLM-based solutions fail when the metric is strict. For a security solution, a guess is not enough: we need a reliable and robust way to pinpoint a vulnerability and its evidence, so that we can fully assess it and mitigate it with an appropriate fix.

enothereska · 28d ago

I'm a co-founder at trent.ai, happy to answer any questions about this.
