If you're generating speech once and replaying it many times (e.g. making podcasts), the cost difference is negligible and you might as well go with Eleven Labs, since it's more customizable and possibly slightly higher quality. If you're doing interactive speech with customers, though, $9/hr is incredibly expensive (higher than the U.S. federal minimum wage!), and OpenAI's TTS is a very close second on quality and much more reasonably priced. If you're trying to integrate speech into an AI product, Eleven's hourly cost is pretty much infeasible, since at minimum you'd have to charge your customers more than it would cost to hire a human being to do the task.
Azure's "Neural" line of TTS is the best of the big cloud offerings, but it's pretty mediocre compared to either OpenAI or Eleven Labs IMO. And it's actually more expensive than OpenAI: it's $0.80 per 50,000 characters (~1hr of audio), unless you're willing to commit to over $1k of monthly spend, at which point it drops to $0.64 per 50k characters, which is barely cheaper than OpenAI.
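For anyone who wants to redo the per-hour math with their own numbers, here's a rough sketch. It only uses the figures quoted above, and it leans on the assumption that about 50,000 characters comes out to an hour of speech; the helper name `cost_per_hour` is just something I made up for illustration, and real speaking rates will vary by voice and content.

```python
# Assumption from the comment above: ~50,000 characters is roughly 1 hour of audio.
CHARS_PER_HOUR = 50_000


def cost_per_hour(price_usd: float, per_chars: int,
                  chars_per_hour: int = CHARS_PER_HOUR) -> float:
    """Convert a 'price per N characters' quote into an approximate USD cost per hour of audio."""
    return price_usd / per_chars * chars_per_hour


# Prices as quoted in this comment (not pulled from any pricing API).
hourly = {
    "Azure Neural (pay-as-you-go)": cost_per_hour(0.80, 50_000),
    "Azure Neural (>$1k/mo commit)": cost_per_hour(0.64, 50_000),
    "Eleven Labs": 9.00,  # already quoted per hour
}

baseline = hourly["Azure Neural (pay-as-you-go)"]
for name, usd in hourly.items():
    # Show each option's hourly cost and how it compares to Azure pay-as-you-go.
    print(f"{name:<30} ${usd:5.2f}/hr  ({usd / baseline:.1f}x Azure pay-as-you-go)")
```

If your provider quotes per million characters instead, pass that in directly, e.g. `cost_per_hour(16.00, 1_000_000)` works out to the same roughly $0.80/hr.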
OpenAI's TTS is IMO the best option for anything interactive, since it's much higher quality than Azure's Neural TTS and much cheaper than Eleven Labs, with very little difference in quality.