Better HN
theshrike79
7d ago
How would you benchmark "agent harness communicates with user clearly"? It's 100% a feels measurement.
sanderjd
7d ago
I mean, in my experience some of this stuff is way closer to table stakes than that. It's more like "the tool call didn't get totally confused" than "did the communication with the user feel good".