while the "big data" (datasets) collected, and thus owned, by big-tech, big-ads, big-brother, etc. may be instrumental in building at-scale solutions for real-world usage (for profit, knowledge, control, or whatever actionable goal),
fundamental research itself, as done in universities, can move forward without these datasets: using what's publicly available is enough.
Did I read this right? It would add much-needed nuance to the common perception that big data is necessary to train innovative models, and that a few champions of data collection might hold some sort of monopoly on the oil (data, the 'fuel' of ML).