The most common reason for Spark use today seems to be ETL + data lakes (i.e., ETL in and out of cloud object stores).
It seems the actual analysis happens in fast databases that ingest data from the object stores.
Can anyone here comment on this paradigm?