What I'd also add:
Because of the unspecified behaviour, you're always going to need someone technical who understands the output to verify it. Tests aren't enough.
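To make that concrete, here's a minimal sketch of how tests can pass while unspecified behaviour still differs. Everything in it is made up for illustration: the spec ("sort users by age"), the two hypothetical implementations, and the sample data. The spec never says what to do with ties, so both versions satisfy the only test the spec justifies.

```python
# Hypothetical spec: "sort users by age". Tie-breaking is left unspecified.

def sort_by_age_a(users):
    # Stable sort: users with equal ages keep their input order.
    return sorted(users, key=lambda u: u["age"])

def sort_by_age_b(users):
    # Ties broken by name, which the spec never asked for.
    return sorted(users, key=lambda u: (u["age"], u["name"]))

def passes_spec_test(sort_fn, users):
    # The only property the spec states: ages come out in non-decreasing order.
    ages = [u["age"] for u in sort_fn(users)]
    return ages == sorted(ages)

users = [
    {"name": "Zoe", "age": 30},
    {"name": "Adam", "age": 30},
    {"name": "Mia", "age": 25},
]

names_a = [u["name"] for u in sort_by_age_a(users)]
names_b = [u["name"] for u in sort_by_age_b(users)]

print(passes_spec_test(sort_by_age_a, users))  # True
print(passes_spec_test(sort_by_age_b, users))  # True
print(names_a)  # ['Mia', 'Zoe', 'Adam']
print(names_b)  # ['Mia', 'Adam', 'Zoe']
```

Both pass, but the output order differs. Whether that matters depends on what consumes the result, and that's a call someone who understands the output has to make.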
I'm not even sure if this is a net productivity benefit. I think it is? In some cases it's a clear win, but definitely not always. You're reducing the time spent coding, but now putting extra time into spec writing, review, and verification.