Use case example for Uber:
1. In 2011, a driver joined. They made a bunch of trips.
2. In 2012, Uber added more detail about the trip. That information was not collected for the 2011 trips.
3. And so on, each year there are 'just a few changes'.
Given the above:
In 2016, Uber wants to run a query to reward all drivers based on some piece of information that was only present from 2014 on.
At this point the historical trip information from 2011 is in a significantly different format than in 2016.
In an RDB, at least the old columns are there - or, if the DB was migrated to a new schema (a pain), the issue of the missing fields was addressed.
But dealing with data in old formats was an Uber pain. And the lack of visibility into just knowing the schema used to generate that JSON object is a PITA.
God forbid if you had new code that never even knew about the old 2011 format.
Lastly, what happens if a bug slips through and some JSON field is missing, has odd spelling (wrong capitalization), etc.?
I would love to hear about how old data is handled in Schemaless.
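For what it's worth, here is a minimal sketch of one way to cope with the cases above (fields that older records never had, miscapitalized keys, fields missing because of a bug). It is not how Schemaless does it; every field name here (`trip_id`, `driver_id`, `year`, `surge_multiplier`) is made up for illustration, since the real trip format isn't described in this thread. The idea is to push every stored JSON object through a single normalizing function before any query logic sees it.

```python
from typing import Any

# Hypothetical field sets; the real trip schema is not given in this thread.
REQUIRED_FIELDS = {"trip_id", "driver_id", "year"}
# Assumed to be a field that was only collected from 2014 on.
OPTIONAL_DEFAULTS = {"surge_multiplier": None}


def normalize_trip(raw: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of raw in the current format, or raise ValueError."""
    # Fold key capitalization so "Driver_ID" and "driver_id" are treated alike.
    record = {key.lower(): value for key, value in raw.items()}
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        # Fail loudly instead of letting a malformed record slip through.
        raise ValueError(f"missing required fields: {sorted(missing)}")
    # Mark fields that older records never collected with an explicit default.
    for field, default in OPTIONAL_DEFAULTS.items():
        record.setdefault(field, default)
    return record


def eligible_for_reward(record: dict[str, Any]) -> bool:
    """Only trips that actually carry the newer field can count."""
    return record["surge_multiplier"] is not None
```

With this, a 2011 record such as `{"Trip_ID": 1, "Driver_ID": 7, "year": 2011}` comes back with lowercase keys and `surge_multiplier` set to `None`, so the reward query can skip it explicitly rather than crash on it. New code still has to know which fields are optional, which is exactly the visibility problem described above.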
My experience with MongoDB was less than pleasant.
I guess someone at Uber must really like MySQL, as good a reason as any other I suppose. I'd love to hear what other reasons led to MySQL being the choice here, as I've usually gone the other way (MySQL to pgsql) for the many great features and the performance pgsql has.
schema-less
...at first I read it as she-males. I have no idea why this word caught on instead of "aschematic", which is much easier to parse.
The implementation (using MySQL) seems very close to Vitess (http://vitess.io/overview/), which manages MySQL as a series of "tablets" but exposes most MySQL features directly in the query language.
The advantage of MySQL in this situation is probably the support for multimaster replication.