> They're not interested in the blobs, but in the people accessing them
But if they don't know what the blobs are, how does this help them? What can they tie it to?
The latitude and longitude coordinates of my current location have an sha1 hash of 0950e97d3a2e4839e39ad27deb2e852d498100ae. Is this useful information?
In isolation, no. But given enough other actors that use the same Firebase and can be tracked by AdSense, they can infer connections useful for targeting you.
Don't you need a proprietary Firebase SDK in your app to use it? Do you know what data are included in the requests? I would argue that anything as simple as IP + UA/OS identifier can be of interest to Google.
It depends what browser you're using and what parameters you're using on the XMLHttpRequest. They definitely get the IP, user agent (so what OS and browser), and potentially more.
At a minimum they could build cohorts of people who use your app and use that as a bit of information for ad targeting.
The SHA-1 thing is a complete non-sequitur, but since you asked, small amounts of data run through unsalted SHA-1 can be brute-forced very easily if someone cared to find out where you are.
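To make the brute-force point concrete, here is a minimal sketch. It assumes the hashed input is a plain `lat,lon` string rounded to two decimal places; that format is my assumption and is not stated anywhere above, and `format_coords` and `brute_force` are hypothetical helper names. The point is only that the search space for coordinates is tiny compared to what a hash function is meant to protect.

```python
import hashlib


def sha1_hex(text: str) -> str:
    """Unsalted SHA-1 of a UTF-8 string, as a hex digest."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def format_coords(lat_e2: int, lon_e2: int) -> str:
    # Assumed (hypothetical) input format: "lat,lon" with two decimals.
    # Coordinates are passed in hundredths of a degree to avoid float drift.
    return f"{lat_e2 / 100:.2f},{lon_e2 / 100:.2f}"


def brute_force(target_hash: str, lat_range: tuple, lon_range: tuple):
    """Try every grid point in the given ranges (hundredths of a degree)."""
    for lat in range(*lat_range):
        for lon in range(*lon_range):
            candidate = format_coords(lat, lon)
            if sha1_hex(candidate) == target_hash:
                return candidate
    return None


# Demo: hash a made-up location, then recover it from the hash alone.
secret = format_coords(5237, 490)
target = sha1_hex(secret)
found = brute_force(target, (5200, 5300), (400, 500))
print(found)
```

Even the whole globe at this resolution is only about 18,000 x 36,000 candidates, which is well within reach of a laptop. A real attacker would also have to guess the exact string format, but there are not many plausible ones.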
You don't even need the SHA-1 or anything within the blobs. Just an IP address & user-agent pair is enough to identify a user with reasonable accuracy, and that accuracy only goes up the more data you add. It'll never be 100% accurate, but for ad targeting it doesn't need to be: a "hunch" is more than enough, since getting it wrong leaves you no worse off than you were before.
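As a rough illustration of how little is needed, the sketch below keys requests on nothing but the IP and user-agent string and groups the sites seen under each key. Everything here is hypothetical: the `fingerprint` and `link_requests` names, the sample log, and the idea that a tracker would do it this simply. Real systems would presumably combine many more signals.

```python
import hashlib
from collections import defaultdict


def fingerprint(ip: str, user_agent: str) -> str:
    """Derive a short pseudo-identifier from the IP + user-agent pair."""
    raw = f"{ip}|{user_agent}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()[:16]


def link_requests(requests):
    """Group the sites seen per fingerprint.

    `requests` is an iterable of (ip, user_agent, site) tuples.
    """
    sites_by_fp = defaultdict(set)
    for ip, user_agent, site in requests:
        sites_by_fp[fingerprint(ip, user_agent)].add(site)
    return dict(sites_by_fp)


# Made-up log entries (documentation IP range, example domains).
FIREFOX = "Mozilla/5.0 (X11; Linux x86_64) Firefox/115.0"
IPHONE = "Mozilla/5.0 (iPhone; CPU iPhone OS 16_0 like Mac OS X) Safari/604.1"
sample = [
    ("203.0.113.7", FIREFOX, "app-a.example"),
    ("203.0.113.7", FIREFOX, "shop-b.example"),
    ("203.0.113.7", IPHONE, "app-a.example"),
]

linked = link_requests(sample)
for fp, sites in linked.items():
    print(fp, sorted(sites))
```

The two Firefox requests collapse into one pseudo-identity spanning both sites, while the phone behind the same IP gets its own. It will misfire on shared IPs and common user agents, which is exactly the "hunch" level of accuracy described above.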
You’re not thinking on a big enough scale. No big tech company cares about “your” data. They care about everyone’s data in aggregate: as much as they can get, from every location, at every granularity. Even thinking of a tech company as collecting all of “your” data to create an ad profile “for you” is rather inaccurate. They’re collecting everyone’s data to create an ad profile for everybody, tailored to what makes the most money in aggregate across all ad slots.
When you think of it like this, you’ll stop asking questions like “what would they do with this piece of data?”, because the answer is always that it is a drop in a giant ocean of machine learning data.
If the requests are client side, they know that the user has accessed your domain. They can analyze the frequency and timestamps of these requests and add that information to the ad profile they have built for that user.
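A minimal sketch of that kind of frequency and timestamp analysis, assuming the only input is a list of Unix timestamps for one user's requests to your domain. The `activity_profile` name and the sample timestamps are made up for illustration, and this is not a claim about what any ad network actually computes.

```python
from collections import Counter
from datetime import datetime, timezone


def activity_profile(timestamps):
    """Summarise request timing: count, hour-of-day histogram, mean gap."""
    ordered = sorted(timestamps)
    by_hour = Counter(
        datetime.fromtimestamp(ts, tz=timezone.utc).hour for ts in ordered
    )
    # Gaps between consecutive requests, in seconds.
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    mean_gap = sum(gaps) / len(gaps) if gaps else None
    return {
        "requests": len(ordered),
        "by_hour_utc": dict(by_hour),
        "mean_gap_seconds": mean_gap,
    }


# Made-up request times: three in one evening, one a day later.
sample_timestamps = [1700000000, 1700000600, 1700003600, 1700086400]
profile = activity_profile(sample_timestamps)
print(profile)
```

Even this much says when the person tends to use the app and how regularly, which is the kind of signal that can be folded into an existing profile.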