Few people seem to try to reconcile this, since neither side cares about the other.
I personally think that discussing fingerprinting as raw tech, without mentioning the size of the company collecting the data or its purpose, is meaningless, and only leads to a few tech-savvy users having less data collected on them.
Most people want to use Javascript, use the default setting and not be afraid of clicking on links. I can't really see a good solution without a coordination of regulation and tech standards, so I'm hopeful at least for decent solutions.
Mouse movement data is a fairly potent fingerprinting vector. Bucketing the average mouse speed and acceleration rates could provide useful information. This may imply specific OS speed settings, or physical mouse DPI. A machine learning system would likely be able to distinguish a traditional mouse vs. trackpoint vs. touchpad vs. trackball, etc.
Also, it is not just bots that have non-human-like mouse movement. Many assistive technologies would have no mouse movement, or would auto-snap the mouse to the relevant spot. That is actually quite powerful for fingerprinting, since assistive technology users are a pretty small subset of internet users, so only a relatively small amount of additional data is needed to uniquely fingerprint that user/machine.
Edit: The required FaaS implementation is trivial too. I could launch an endpoint that performs exactly this function in 30-60 minutes.
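To make the speed/acceleration bucketing idea concrete, here is a minimal sketch (function and parameter names are hypothetical, not any vendor's API) that derives coarse buckets from pointer samples; coarse buckets tolerate run-to-run jitter while still separating device classes:

```python
import math

def mouse_motion_buckets(samples, bucket=200.0):
    """Bucket average pointer speed and acceleration from (t, x, y) samples.

    `samples` is a list of (time_seconds, x_px, y_px) tuples; `bucket` is the
    bucket width in px/s. Returns None when there is no movement at all --
    which is itself a signal (bots, some assistive technologies).
    """
    speeds = []
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt > 0:
            speeds.append(math.hypot(x1 - x0, y1 - y0) / dt)
    if not speeds:
        return None
    avg_speed = sum(speeds) / len(speeds)
    accels = [s1 - s0 for s0, s1 in zip(speeds, speeds[1:])]
    avg_accel = sum(accels) / len(accels) if accels else 0.0
    # Coarse rounding keeps the bucket stable across visits for the same device.
    return (round(avg_speed / bucket), round(avg_accel / bucket))
```

A real tracker would combine these buckets with other traits; on their own they only narrow the anonymity set.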
Totally agree that this is perfectly within the government's purview, and they should be doing something about it. But, as with anything else in the US, until a Fortune 100, some few 1%-ers, or the deep state MIC wants it, we're not going to be getting it.
I thought having an ad campaign that targeted subgroups very specifically and boldly might be enough to drum up public interest. Something like: “Hello $name from $city. How did $recent_embarrassing_purchase work out? I hope you enjoy your birthday in $birth_month.” And then a link to the proposed policy.
Unfortunately, marketers have neither scruples nor the ability to control themselves and have captured an asymmetric advantage. Technologists do what they do, preoccupied with whether or not they could, not stopping to think if they should. It seems like legislation may be the only remaining option.
[1] https://signal.org/blog/the-instagram-ads-you-will-never-see...
Techie people are convinced non-techie people don't know they're being tracked. They do! Ask your smart non-techie friends what they think about online privacy. I guarantee you they'll say something like "yeah, I know it's probably tracking me, but whachya gonna do".
Thanks to this disconnect, we have so many privacy campaigns with a message like "Did you know you can be uniquely identified on the web?", but so few (none?) that actually proceed to explain why that's bad, and what someone could do with that information. That's the missing piece. Give average people an actual reason to dislike or fear tracking, not just the mere curio that it exists.
edit: It is still impressive. Even with the firefox settings on, the website was able to identify me. I am not entirely certain how I want to approach this.
None of these should be available to websites by default. The first two come from simpler times when people were not as concerned with privacy implications. The third has been and continues to be pushed by advertising companies (Google, Apple, Microsoft).
So quick update since I am mildly obsessive.
I was sure it was either the GPU, CPU, or addons that were giving me away (I do have a mildly unique setup).
I ran a few tests in a VM, and the moment I dropped GPU passthrough (leaving CPU passthrough), I was no longer tracked across sessions (based on that website, anyway).
In other words, cat and mouse game continues.
I know what to think about this… I fucking hate it.
FTFY: People already know; nothing will change.
Many of the things that are happening (at least in the US) are deeply, deeply unpopular, but are not changing, and show no signs that they are even susceptible to change. Fortune-sized companies, the 1%, and the deep state are calling the shots, despite how much can be seen in real time, through things like Twitter and TikTok. I've actually had to pull back from Twitter because of all the things that are obviously beyond the pale, yet will never change. (Snowden, Assange, et al.)
This has been tried by a guy who placed Facebook ads like these. FB blocked his account in a few hours.
So: good in theory, won't work in practice.
People are such dumb fucking cattle that they'll lash out at you rather than at the data brokers or the software vendors who ratted them out, though.
Not only that, but they might have a legal case against you. I've been slowly working through Seek and Hide: The Tangled History of the Right to Privacy, and my main takeaways have been:
(1) The constitutional right to free speech and a free press is not as broad as most people probably think.
(2) Truth is not necessarily an air-tight defense in a case of libel, as courts at various times and places have decided against publishers for true but embarrassing things intended to humiliate or harm.
I have five different browsers on my smartphone and three on the PC all sans JS and none of them are Chrome. Also, normal operation is to automatically delete all cookies at session's end.
My smartphone and PCs are de-googleized and firewalled and I never see ads in my browsers nor in apps. The apps are mainly from F-Droid and sans ads and the few Playstore ones I use are via Aurora Store and are firewalled from the internet when in use. Honestly, I cannot remember when I last saw an app display an ad, it has to be years back.
In the past I used to go to more extensive measures to stop the spying but I found it was unnecessary as the spy leakage was essentially negligible with much less stringent efforts.
It's pretty easy to render one's online personal data essentially worthless if one wants to. On the other hand, if you insist on using JS, Gmail, Google search, Facebook etc., then you're fair game and you only have yourself to blame if your personal data is stolen.
They describe their approach[1]. They use HTTP headers and conditional request triggered by CSS conditional media queries to gather data. Something like @media(...) {background: url(/tracking/$clientid)}. But in principle, they could also try and fingerprint the TCP/IP stack or the TLS implementation. I'm not sure it would get them more data than OS+Browser, though.
[0] https://noscriptfingerprint.com/
[1] https://fingerprint.com/blog/disabling-javascript-wont-stop-...
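A rough sketch of the server side of that CSS technique (the probe set and names are made up, not fingerprint.com's actual implementation): generate one @media rule per trait, each pointing at a unique logging URL, and pair each rule with a hidden element on the page; the server then reads one bit per trait out of its access logs, no Javascript required:

```python
def probe_css(client_id, base="/t"):
    """Generate no-JS probe CSS: each @media rule fetches its background URL
    only when the condition matches, leaking one bit per rule via server logs.

    The page is assumed to contain one hidden <div class="probe-NAME"> per
    rule, so the rules don't override each other.
    """
    probes = {
        "dark-mode": "(prefers-color-scheme: dark)",
        "coarse-pointer": "(pointer: coarse)",
        "hidpi": "(min-resolution: 2dppx)",
        "narrow": "(max-width: 500px)",
    }
    rules = []
    for name, condition in sorted(probes.items()):
        rules.append(
            "@media %s { .probe-%s { background-image: url(%s/%s/%s); } }"
            % (condition, name, base, client_id, name)
        )
    return "\n".join(rules)
```

Real deployments probe far more media features than this, but the mechanism is the same.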
I didn't detail every protection I've put in place or the post would have been too long. However, I'd suggest that spreading my browsing over at least eight browsers (and I actually use more than two machines and do so at different locations and with different ISPs) effectively reduces my profile across the net.
I also use randomized browser user agents and clean links, occasionally I'll even cut-and-paste links between multiple browsers in a single session. I often do this on HN not to hide from HN but for convenience when multitasking. (Having worked in surveillance professionally, this modus operandi just comes naturally, it's now second nature for me to work this way.)
Working with multiple browsers and multiple machines also solves the problem when on rare occasions I have to use JS. That said, I never watch YouTube with a JS-enabled browser, instead I'll use NewPipe or similar. There are other measures I could list but you get the idea. Oh, and I never use the internet on a smartphone with a SIM enabled, instead the SIM resides in a separate portable router and my 'real' phone is a dumb feature phone, it's only capable of making phone calls.
I really don't care if some stuff leaks but I've satisfied myself it's pretty trivial, as frankly, I've not had one indication over the past 20 or so years that I've been targeted as a result of fingerprinting. It's not necessary to make things completely watertight, I'm not trying to hide from the NSA or GCHQ, etc. (and it'd be unsuccessful and a complete waste of time to bother trying).
Moreover, even if something were to leak, I'm simply not a revenue-making target—that means I never respond to any targeted marketing because I simply never receive any.
I also notice that the no-JS hash changes when I move the window to a different monitor.
For example: I have set up the systems of family members for whom I am some sort of digital janitor with a nice collection of Firefox plugins to get rid of the worst offenders.
If you continue to willingly use socials like FB, TikTok, et al., your complaints about stolen personal data fall on deaf ears. Show me that you don't have those apps installed and do not visit their websites, and then we can talk seriously about deserving not to have your data stolen.
Right, it probably is. But the issue of stolen personal data has been around for so long that nontechnical people have had years to develop political lobbying and to swing elections to put a stop to it.
The fact is that most people don't give a damn about such matters, if most did then the problems would be behind us by now.
Thus, unfortunately, with the internet it's every man and woman for him or herself. QED!
At this point I'm 100% OK with us being the only ones able to protect ourselves. We warned them and they didn't care. Allow them to remain uncaring. We don't have to help everyone. People must want to be helped.
Nobody said that. "My defenses work" != "my defenses should be necessary".
Examples include the back button, uploading photos on some websites uploads random data instead of the photo, etc.
Surely there could be valid reasons for doing so?
I imagine for example that:
1. It ensures the selected file is a valid image before uploading it
2. It strips meta data like GPS position from the image before uploading it
3. It could reduce the size of the image, by either scaling it down, or compressing it more, or both, before uploading it
Or it might not be strictly necessary, but Instagram does it anyway.
No, this is how most pre-upload image editors work. Why upload a 5MB avatar photo when you're going to have the user crop and scale it down to a few hundred KB on the client side first?
Using canvas for this is much more friendly to their bandwidth, no nefarious intent needed.
(I'm guessing it was too much implementation work to separate out this feature: to preserve normal, expected UI behavior client-side, while presenting a fake pagezoom value to scripts. That would degrade only a handful of (poorly-designed, script-layout) websites, rather than the whole accessible browser experience).
I really like the idea behind this feature, but it seems the Web API might have become too complex to counteract bad actors like this. It's particularly scary that it can correlate your activity in private mode with your identity in normal mode.
_______________________
0. https://www.bleepingcomputer.com/news/security/researchers-u...
> For personal reasons, I do not browse the web from my computer. (I also have not net connection much of the time.) To look at page I send mail to a demon which runs wget and mails the page back to me. It is very efficient use of my time, but it is slow in real time.
It was in the early 2000s, and smartphones weren't a thing. It was also a time when companies were paranoid about letting employees access the internet, but at the same time had abysmal security. By that I mean viruses ran free on shared folders, undetected because the antivirus software was years outdated. Very different times...
"Mail for you, sir!"
EDIT: Note that you can do BOTH - but one without the other is just a game of whack-a-mole.
Granted, the enforcement should be stepped up.
Example please
The burden of proof is on the claimant, and with proper information control you can't ever meet that burden of proof. It becomes an ant versus a gorilla instead of David vs. Goliath.
Tell me, how do you differentiate a simple random alphanumeric string from another random string that may have been generated as a fingerprint?
Mathematically do you think there's any way to actually prove one way or the other? If not, how would that bias the system if the person is adversarial and lies.
The only way to prevent this is to make sure the information is nonsensical.
Preventing collection would identify you in a way that they can prevent access. Even though websites are public, you see this happening with any captcha service.
Might be my European outlook, but consumer law has been stupidly effective at curbing abuses from companies here and was much more effective than playing the technology race USA is trying to fight. There's always a next side-step, the next abuse a company can invent - and you keep trying to push the responsibility of avoiding it to users (by adding more and more onerous technology) instead of punishing the abusers.
FTFY?
I use fingerprinting actively in enterprise apps as a form of silent 3FA. It's a useful backstop. If I have a user who forgot their password but retrieves it via email, I'll usually let them pass if their fingerprint matches one of their priors; otherwise my software shoots off an email to their immediate superior to make that manager validate that the machine the employee is using is one they can vouch for.
I've always viewed browser fingerprinting as something that can be leveraged as a security feature. It's far more useful for that than for some sort of distributed tracking. I'd never want to live in a world (ahem ... China) where submitting to such fingerprinting actively was mandatory, or politically punishable if you didn't. No society should be run like an employer/employee organization with that sort of lack of trust. No sane free person would allow their own browser to transmit a fingerprint. But for employer/employee systems management? It's a great tool in the box.
[edit] also, the less trivial it is, the better for corporate security.
That said, fingerprinting is only useful as a third security measure because most people don't understand its mechanics. The mechanics of avoiding being tracked are pretty basic. If our country required browsers or computers to transmit their fingerprint, people would find ways around it and it would stop being useful as a security metric.
Put another way, the moment this becomes a feature of an oppressive regime, it's one of the easiest things to work around. The obscurity is what makes it remain somewhat useful.
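The "silent third factor" flow described above can be sketched like this (all names and the escalation hook are illustrative, not the actual enterprise implementation being described):

```python
import hashlib

def check_third_factor(user, traits, known_prints):
    """Compare the current browser fingerprint against previously seen ones.

    `traits` is a dict of stable browser attributes (UA, renderer, fonts...);
    `known_prints` maps each user to the set of fingerprint hashes seen before.
    Returns ("allow", fp) on a match, ("escalate", fp) otherwise -- the caller
    would then, e.g., email the user's superior to vouch for the machine.
    """
    canonical = "|".join("%s=%s" % (k, traits[k]) for k in sorted(traits))
    fp = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    if fp in known_prints.get(user, set()):
        return ("allow", fp)
    return ("escalate", fp)
```

On "escalate" the login still proceeds through the normal two factors; the fingerprint only decides whether a human gets looped in.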
(2) The server gets sent your password every time you log in. You shouldn't rely on a server operator not knowing your password.
(3) You can tune how sensitive the system is in response to changes in the fingerprint. Even if there's a failure to match, that just means authentication will be extra strict.
* https://addons.mozilla.org/de/firefox/addon/canvasblocker/
which prevents fingerprinting via canvas elements and additionally warns you if a site does it. There are more sites doing it than you would assume. Some stupid blogs, even.
* https://addons.mozilla.org/en-US/firefox/addon/multi-account...
This splits your tabs into different categories, each with their own cookie storage.
The fingerprinting website in the article didn't manage to correlate me visiting the website concurrently from two distinct container tabs.
But that's merely because of the canvasblocker (or something else you have), because just separate containers doesn't cut it?
For instance, it claims iOS is 4.63% of users and Safari is 3.42%, when all other more complete statistical sources put those numbers at closer to 20%-30%.
If it changed with every call they'd just block you as a bot.
The more you do to prevent fingerprinting the more you hobble the web as a platform. A lot of restrictions that got placed on the canvas tag to help prevent fingerprinting for instance really limited its functionality.
In my opinion, a workable solution would be to make more of these things opt-in, so that the end user chooses whether to grant the page high-accuracy data.
There is a price, of course. Lying about screen resolution might mess up how the website looks. Lying about which fonts are installed might make the site a bit uglier.
> The IPv6 Privacy Extension is defined in RFC 4941. It is a format defining temporary addresses that change in regular time intervals; successive addresses appear unrelated to each other for outsiders and are a means of protection against address correlation. Their regular change is independent from the network prefix; this way, they protect against tracking of movement as well as against temporal correlation.
The false positive and negative rates are reasonable, and false positives (new customer seen as returning) could be further reduced by browser feature testing.
For example
chromium-browser --user-data-dir=/tmp/profile_A
chromium-browser --user-data-dir=/tmp/profile_A --incognito
chromium-browser --user-data-dir=/tmp/profile_B
chromium-browser --user-data-dir=/tmp/profile_B --incognito
For each command and its incognito counterpart, it can detect them as separate profiles.
For ultimate privacy, one needs to launch the browser with a new profile every time.
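A small helper along those lines (a sketch assuming a chromium-style binary, not a hardened tool): build the launch command with a freshly created throwaway --user-data-dir, so cookies and local storage never carry over between sessions. Hardware-level fingerprints can of course still link the sessions.

```python
import tempfile

def fresh_profile_cmd(url, binary="chromium-browser"):
    """Return a launch command using a brand-new throwaway profile directory."""
    profile_dir = tempfile.mkdtemp(prefix="throwaway-profile-")
    return [binary, "--user-data-dir=%s" % profile_dir, url]
```

Each call yields a different profile directory, so consecutive launches (passed to e.g. subprocess.Popen) start with empty cookie jars.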
At least they can use this to prevent reCaptcha - and make passwords disappear!
Public knowledge is far behind the actual capabilities in practice.
The worry would be that the hash is unique to me (i.e. a fingerprint), but I don't see the evidence that it is.
So if I fingerprint you on a site which is using my commercial fingerprint service, then I can sell your hash to other places and tell them all about your browsing habits. The more places run my fingerprinting service, the more data I can collect on you.
The worst part of this? Trying to hide from fingerprinting makes your fingerprint more unique
However, I doubt that's a problem in practice. I'd assume these fingerprinters know what they're doing. It certainly seems so.
How could one run an experiment collecting lots of these fingerprints and determine the false positive rate?
It matters more how unique your fingerprint is than how consistent or reproducible it is. Just testing if you get the same fingerprint back on your second visit doesn't tell you much if you don't know how many people "share" your fingerprint.
As a silly example, if you gave all users the same fingerprint, it would be very consistent but also useless as a tracking method.
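That trade-off can be quantified: what matters is the size of each fingerprint's anonymity set, i.e. how many visitors share it. A toy sketch over a list of observed fingerprints:

```python
from collections import Counter

def fraction_uniquely_identified(fingerprints):
    """Fraction of visitors whose fingerprint is shared by nobody else.

    If everyone had the same fingerprint this would be 0.0 (perfectly
    consistent but useless for tracking); if every fingerprint were
    distinct it would be 1.0.
    """
    counts = Counter(fingerprints)
    alone = sum(n for n in counts.values() if n == 1)
    return alone / len(fingerprints)
```

A fingerprinting vendor wants this number high *and* the fingerprints stable across visits; consistency alone tells you nothing.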
You'd have to mask every informational API with a suitably corrupted but still plausible alternative.
https://coveryourtracks.eff.org/
I use a lot of browser extensions. Unfortunately, this makes my browser easily identifiable.
Is it possible you had set the zoom level previously, which the browser remembers between sessions, and turning off the tracking reset the zoom to 100%? Do you have any extensions like Greasemonkey or Stylus for per-site customisation?
The fingerprinting discussion is relatively new. The first research paper’s author is only 35 or so. (Its title is Cookie Monster.) The discussion is also a little amusing on a site like Hacker News. A perfect example of someone who’s easy to fingerprint is someone who built their own computer (likely to be found on HN). On the opposite end of the spectrum, Safari iPhone users with the same model are impossible to distinguish.
There’s a paper out there where the researchers worked with a public entity’s website to get more accurate fingerprinting data. There are very few unique fingerprints in reality and therefore no reason for any company to track them. This tech probably won’t ever identify users uniquely.
There are actually some positive aspects of fingerprinting. Tor leaves a very obvious fingerprint, and it’s easy for banks to detect its use by criminals.
Given how companies have completely and utterly ignored the idea of consent banners, I am deeply disinclined to believe that most companies would ever actually be satisfied with a user controlled choice in the matter. Where we actually are is that companies will relentlessly attempt to stalk everyone all the time no matter what, and in face of that the only sane conclusion is that in practice tracking is bad and we should shut it all down.
This isn't a situation where one rotten apple spoils the whole bunch, it's more like one good apple inadvertently was dropped into a toxic cesspool of rotten apples.
This makes it exceedingly hard to hide from such a filter, because in communicating with these sites, you are bound to reveal at least some information about yourself. And then the "likelihood-machine" does the rest by connecting the dots, even if you gave them "fewer dots."
It's also quite interesting - or perhaps chilling - to see how fingerprinting through NLP and other language tracking algorithms can also track just about any forum post you do, even if you're using a pseudonym.
There are three options:
1. Prevent/Stop it: This ship sailed long ago. Not to be grim about it, but Pandora's box got opened.
2. Fight it: Tool up, change your print, your behavior, your place. Build focused VMs that you use per topic. Simply do a WHOLE lot less. In the grand scheme, it's a lot of work for low return. Note: there are exceptions.
3. Increase noise: The whole point of most data collection is to sell more to you. Because most people are sheep, a fairly simple model can be surprisingly accurate (over-targeting is an issue). Don't be a sheep: diversify, make more noise in the system, search outside your comfort zone, and change it up often.
I use a text based browser, with no js, no cookies, no css, no external requests past the first html page download, no user agent, no etag, I connect through Tor and I've modified the browser to randomize http headers. And of course, it sometimes happens that I want to see something that is refused to me with that configuration (like, seeing anything behind the big internet killer, aka Cloudflare - thanks archive.org for existing), so I have also a classic browser for the occasional lowering of barrier.
At first, I thought fingerprint.com did identify it, giving me the hZ4W5oQ7pJVIHbW2fBXA id. Then I realized it was giving the same id when using curl, with and without Tor. Then I realized, by googling and ddging that id, that it's the one reported for search engines as well. So it's not unique, and it's basically a "dunno" reply.
The zoom settings in the display/brightness section of the iphone seem quite relevant for fingerprint.com algorithm.
Toggling between standard/bigger text toggles the fingerprint value.
This could be because the visible area in the screen size changes, as well as some value of the CSS-fingerprint.
- Firefox, Enhanced Tracking Protection ON
- Multi-Account Containers + Temporary Containers addon
- Privacy Settings addon, most settings private, but referrers enabled
- uBO with lots enabled, Decentraleyes addon
And it probably understates the problem these days, missing some of the more recent techniques.
But at the time, it was considered to be a big do not touch -- just don't do this. Not so much for ethical reasons, but for optics in the industry. (I wasn't proposing doing it, was just curious)
In the meantime, though, this seems to have just become standard practice, but way more sophisticated with way higher accuracy, as this article touches on.
What was not acceptable a decade ago is now "ok." Not just by sketchy ad startups, but by major players.
But this whole mess ties back to one of the things that worries me the most about the propagation of LLM type ML out into the general industry. It's only a matter of time before ad targeting takes on an extra dimension of creepiness through this (and I'm sure it's already happening in some aspects, inside Google & Meta.)
In the past, in ad tech & search, etc., people could say things like: "Yes, it's highly targeted. Yes, we've correlated an absolutely huge quantity of data to fingerprint you exactly, and retarget you. But it's anonymized. No humans saw your personal data. It's just statistics." I'm not saying whether this argument has merit, just repeating it.
But now, here we are, where "just statistics" is a far more intricate learning model. One which is capable not just of correlating your purchases and browsing activity, but of "understanding" you, and which -- while not an AGI -- is pretty damn smart.
At what point does "a computer scanned your browsing for patterns and recommended this TV set" become ethically the same as "a human read your logs, and would like to talk to you about television sets..."?
Having worked in ad-tech before (and having worked at Google, in ads and other things as well), I do not trust the people in that industry to make the right decisions here.
Perhaps you could call this something like 'cross-device fingerprint unification', idk.
The demo delivered an ad-unit on mobile after viewing an ad-unit on TV.
(0) https://www.bleepingcomputer.com/news/security/researchers-u...
Fingerprinting services try to figure out browser settings. Since very few people have this feature enabled, you might be easier to fingerprint by enabling it. A metric that has historically been used for fingerprinting is the "do not track" feature, which is a bit of irony.
Say I follow AS Monaco football, then look for Lego Castle figurines and finally visit a forum on Alaskan Malamute dogs. The combination of these three websites is pretty close to unique in the world imho.
Surely most people can be uniquely identified after visiting a couple more, unless we change browser and ip-address and GPU and set resistFingerprinting=true and ... and clear cookies after every website we visit.
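The "couple more sites" intuition matches the usual back-of-the-envelope entropy math: if a fraction f of users shares a trait, observing it yields -log2(f) bits of identifying information, and roughly 33 bits single out one person among ~8 billion. With made-up audience fractions for the three niche sites above:

```python
import math

def surprisal_bits(fraction):
    """Identifying information (in bits) from a trait shared by `fraction` of users."""
    return -math.log2(fraction)

# Hypothetical audience fractions -- illustrative numbers, not measurements.
site_fractions = {
    "as-monaco-forum": 1e-4,
    "lego-castle-shop": 1e-3,
    "malamute-forum": 1e-4,
}
total_bits = sum(surprisal_bits(f) for f in site_fractions.values())
# Under these assumptions, total_bits already exceeds the ~33 bits
# needed to be globally unique (assuming the traits are independent).
```

The independence assumption is the weak point: football fans and dog-forum posters overlap, so real-world bits add up more slowly, but the gap to 33 closes fast regardless.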
There is a bug in Chrome, which I reported, but they told me they will not fix it: https://bugs.chromium.org/p/chromium/issues/detail?id=120485...
And https://www.amiunique.org/ says I’m unique in Brave compared to “nearly” in Safari haha
privacy.trackingprotection.fingerprinting.enabled
This would make sense, since messing with values for the root frame could cause unwanted side effects, but you're not likely to care if some iframe gets your screen resolution or CPU count wrong. Adding the extensions `Canvasblocker` and `Temporary Containers` did solve the issue, though.
I only use Chrome to test some things, or to create a completely isolated browser session disconnected from my use of Firefox.
https://niespodd.github.io/webrtc-local-ip-leak/ still(?) leaks the local IP in mobile Safari. On browserleaks the local IP check fails, giving a false feeling of safety.
You can experiment there: https://coveryourtracks.eff.org
Tracking should be limited with legal means.
EDIT: Or block the extraction
- Since you charge per person, what about people that use multiple machines and browsers (with presumably different fingerprints)?
- On the other hand, unless two people share the same workstation and computer account, how do you expect to use fingerprints to detect license abuse?
https://www.eff.org/deeplinks/2018/06/gdpr-and-browser-finge...
<body onload="javascript.disable()">
Those same two companies effectively control the browser market. If there's political will in Europe, they can be forced to implement working privacy controls.
GDPR. isn't. about. browsers.
> But there would not have been money to make for those that provide "compliant" banners.
Are you serious? Do you think WordPress addon makers lobbied GDPR through the European parliament?
EDIT: nope, not as implemented in Chrome https://www.jefftk.com/test/webmidi
Edit: I think MDN confirms this, with the asterisk next to Firefox: https://developer.mozilla.org/en-US/docs/Web/API/Web_MIDI_AP...
Edit 2: oh, the tweet shows two prompts, one of them to install the extension, so I suppose that is actually the prompt you're referring to.
For me, the cookie consent modals are the submarines. Why would I outsource the responsibility not to track me to the people with the incentive to track me? IDCAC, Cookie Autodelete, and strict tracking protection feels like the better alternative for me.
(From today onwards, I'll add resistFingerprinting=true to that list as well.)
Nah. I make an HTTP request and I get a response. That's how the web works. Perhaps people can have different opinions on "how the web works".
Web fingerprinting relies on a heap of assumptions. For example, that someone uses a web browser to make HTTP requests, that the web browser sends certain HTTP headers in a certain order, that the web browser runs Javascript, that it processes cookies, recognises HSTS response headers, and so on and so on.
If all the assumptions are true, maybe web fingerprinting is effective. But if the assumptions fail, maybe web fingerprinting does not work so well.
I have only ever read blog posts about web fingerprinting that take all the assumptions as true.
The majority of traffic on the internet is said to be "bots". Not web browsers running Javascript, processing cookies, and so on.
It seems to me that someone should discuss what happens when the assumptions fail.
Do advertisers care much about computer users who do not use graphical browsers? As such a user, IME, the answer is no.
(Interesting to see how defensive replies get. It's obvious the "tech" crowd intent to spy on web users is heavily reliant on certain assumptions to remain true forever. It shows that there is necessary pressure to keep web users using a "preferred" web browser and web ""features" that will subject them to "web fingerprinting". Perhaps the assumptions will always be true, conditions will never change, in the same way that interest rates could never change.)
Even the simplest bots nowadays can run Javascript and process cookies. What's much harder for a bot (or some other actor doing shady things across many websites) to uniquely fake are things like the graphics card (WebGL Vendor & Renderer), audio, and other hardware, which get queried during fingerprinting.
Full fingerprinting is relatively expensive, so it originally was used by fintechs to combat fraudulent/automated signups, but with the third-party cookie situation it might be already economical to track regular users for ads/retargeting.
GPT-3 was trained on a filtered version of CommonCrawl.
IMO, this is text-only web use. No (fingerprint-friendly) graphical web browser needed. Others may have different opinions. Perhaps I am biased as I use the web this way seven days a week.
Almost nobody does this, so obviously not. You're probably in a group that makes up less than 0.0001% of web users. And that might even be generous.