Tests (as usually written, in unit-test form) only tell you that the code isn't completely broken. They're not a good indicator of it working well; otherwise "vibecoded slop" wouldn't be a thing. And the tests themselves are usually vibecoded too, which doesn't help much in detecting issues off the happy path.
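To make that concrete, here's a minimal sketch. The `average` function and its test are made up for illustration, not taken from any real codebase. The happy-path test passes, yet the function still blows up on an empty list:

```python
def average(values):
    # Hypothetical example function: fine on typical input, broken on empty input.
    return sum(values) / len(values)


def test_average_happy_path():
    # The kind of test that usually gets written: one typical input.
    assert average([2, 4, 6]) == 4


def fails_off_happy_path():
    # Returns True if the untested edge case (an empty list) raises.
    try:
        average([])
    except ZeroDivisionError:
        return True
    return False


test_average_happy_path()       # passes, so the suite is "green"
print(fails_off_happy_path())   # the bug is still there
```

A green run here says nothing about the empty-list case, because nobody asked.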
>you verify that your AI CEO is giving you the right information or planning its business strategy effectively
The same could be said for human CEOs. A lot of them don't really have good success rates either.