If you are working with AI to define the purpose and goal of a change, which is to say planning how the changes to the code should result in some feature/bugfix/whatever, then the planning phase should ask you to define clear success conditions for the code it writes. These could be otel/datadog metrics, some kind of funnel metric, the cessation of errors in your APM, whatever. In any case, the outcome of the change is what I mean by validate/verify. Mediocre code can solve issues, and we can tolerate mediocre code in that sense. The guardrails kick back failing "mediocre" code and accept working mediocre code.
And this could just as easily have applied to every change we made by hand before AI; it was just a tedious process to layer these things into code when we were only fixing bugs and whatnot. In an AI-writes-all-the-code world, adding this kind of stuff as table stakes for a changeset is zero cost, effort-wise.
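
To make the "success conditions" idea concrete, here is a rough sketch of what a guardrail like that might look like. Everything in it is made up for illustration: the `SuccessCondition` shape, the `verify` helper, the metric names, and the thresholds are all assumptions, and in practice the observed values would come from whatever metrics backend you already use rather than a hard-coded dict.

```python
from dataclasses import dataclass
from typing import Callable, List, Mapping


@dataclass(frozen=True)
class SuccessCondition:
    """One measurable outcome agreed on during planning (hypothetical shape)."""
    metric: str                      # metric name, e.g. as exported to otel/datadog
    passes: Callable[[float], bool]  # predicate over the observed value
    description: str                 # human-readable statement of the condition


def verify(conditions: List[SuccessCondition],
           observed: Mapping[str, float]) -> List[str]:
    """Return the descriptions of failed conditions; an empty list means accept."""
    failures = []
    for cond in conditions:
        value = observed.get(cond.metric)
        # A missing metric counts as a failure: unmeasured is not verified.
        if value is None or not cond.passes(value):
            failures.append(cond.description)
    return failures


# Hypothetical conditions for a checkout bugfix; names and thresholds are invented.
CONDITIONS = [
    SuccessCondition("checkout.error_rate", lambda v: v < 0.01,
                     "error rate under 1%"),
    SuccessCondition("checkout.funnel_completion", lambda v: v >= 0.60,
                     "funnel completion at least 60%"),
]

if __name__ == "__main__":
    # Stand-in for values pulled from a metrics backend after the change ships.
    observed = {"checkout.error_rate": 0.002, "checkout.funnel_completion": 0.55}
    print(verify(CONDITIONS, observed))  # ['funnel completion at least 60%']
```

Note that the check says nothing about how pretty the code is. It only looks at the outcome, which is the whole point: working mediocre code passes, failing code of any quality gets kicked back.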