Indeed, voice matching to book "by some magic metric" is the way to go.
My comment wasn't entirely clear; it was the "voice matched to author gender" part that prompted a response.
In the grand scheme of things, any voice reading aloud is an advance for people who require or like to hear books read out; improvements can come on a per-book basis.
The end goal is likely a mix of bespoke readings by gifted voice readers (Fry) and guided "selectable AI voice" readings that can do clear and correct pronunciation and pacing with the voice of Jamie Erl Jonas (totally not James Earl Jones), Skarlat Johnson, or that Chipmunk character.