We could have them tomorrow if we could magically ban human drivers overnight. Humans are the only thing making it so hard. And the problem is that until we have 100% artificial general intelligence, that can introspect and understand the human psyche the way another human can, there will always be an intractable tail of cases where AI will fail.
So it's like the IPv6 problem. If we could all coordinate at once, it would be easy-peasy. In reality, it's virtually impossible.
Edit: a commenter pointed out that pedestrians are also a major problem; even in a far-fetched imaginary scenario I don't know how you would remove those from the equation.