Tried gpt5.5 and so far so good. Zapier also shared an automation benchmark where 5.5 came out on top of the leaderboard: https://zapier.com/benchmarks
https://www.anthropic.com/engineering/april-23-postmortem