Plus, after each run you get screen recordings along with console logs, network requests, HAR files, and Playwright traces, so you can inspect exactly what the agent did :)
https://github.com/wizenheimer/canary
P.S. I tried to post this as a Show HN, but it got flagged for some reason.