Using read replicas isn't always an option depending on your write throughput and consistency requirements.
The database can only do so well (and will spend a lot of CPU cycles working on your crazy query plan, because getting it wrong is more expensive, so now you effectively limit capacity regardless of how good your storage engine is).
Joins are great, tons of research went into making joins work, and lots of different join algorithms and optimizations based on data sizes, indexes, etc. But you really have to be careful, verses just denormalizing data across multiple tables/collections. Most applications are read-heavy, anyway... I generally plan for things to be successful, in which case joins don't usually work in the hot path.