Does it even have to be able to do so? Just the ability to speed up exploration and validation based on what a human tells it to do is already enormously useful, depending on how much you can speed up those things, and how accurate it can be.
Too slow or too inaccurate and it'll have a strong slowdown factor. But once some threshold been reached, where it makes either of those things faster, I'd probably consider the whole thing "overall useful". Nut of course that isn't the full picture and ignoring all the tradeoffs is kind of cheating, there are more things to consider too as you mention.
I'm guessing we aren't quite over the threshold because it is still very young all things considered, although the ecosystem is already pretty big. I feel like generally things tend to grow beyond their usefulness initially, and we're at that stage right now, and people are shooting it all kind of directions to see what works or not.