I can't even say how many users this site has now. It could be the same user coming back over and over. Or many users. How would I know.
Yes, I could track a ton of stats about every pageview like user agent, screen resolution etc and then try to stitch it back together. Trying to figure out how many different users there are. But this type of "stitching together" would probably also count as PII.
I cannot test new features and see if it makes users happy so they come back more often.
I cannot see if the site has issues on some hardware, software, language. Because I wouldn't see if users affected come back less often.
I can't test if an introduction text at the beginning helps users discover important features. Because I can't make the connection between showing the text early on in the user journey and usage of features later on. Because I can't see a user journey.
This is a site I run for the enjoyment of me and the other users. Probably a few thousand a month. And I can say the site was much easier to develop with a normal cookie approach to tracking. I have gone the cookieless way for about a year now. And I can say with certainty that it would be a better site now if I kept the cookie approach. When the developer flies blind, that's bad for everybody.
I think for a commercial site, where a degradation of 10% in user experience can tank the business, there is no way around cookie tracking to figure out what works for users and what doesn't.
This is another reason why European internet companies do not stand a snowball's chance in hell of competing with their US competitors.
European companies need to bug all users and beg for cookies. While US companies only need to do that with their European users.
It's nice to be able to A/B test your blog or product. It's cool and efficient, but it also hurts and to me that hurt outweighs your company's marginal benefits although it is a nuanced and difficult discussion to be fair. I'm being simplistic here to make the point clear. I hope.
How is simple analytics or A/B testing that's NOT internet-wide tracking (that is, only for the website you're on) or sold (which would be outright illegal without explicit consent) hurting you? Genuine question, because I don't see it. Internet-wide tracking across many sites: sure. But that's a very different thing – it's the difference between "I'm home Darling, I saw Sander at the mall today" vs. "Hello everyone, here is everything Sander did this week".
You can set your browser to not store cookies at all.
Or to discard cookies when you close it.
Or you can delete cookies whenever you feel like.
You can talk to your users in person and ask them, or poll them via email. Do usability tests etc. I guess it just costs more.
Other industries have to do this, they can't just default-spy on their customers.
Marketing survey emails are infinitely more annoying than cookies.
doesn't scale.
> poll them via email
doesn't work.
> default-spy on their customers
analytics is not spying.
Every company claims that they are doing this stuff to make things better for users, and it never is.
You mean back when your average website was hot garbage? The yearning for the days of static HTML pages is childish atavism
We do use cookies for part of the analysis, but they're not unique user-tracking cookies. Instead there's one that simply tracks a self-reported datestamp on when the user last visited, which looks something like:
WMF-Last-Access=2023-01-01,Expires=Wed, 01 Feb 2023 12:00:00 GMT
(we send a Set-Cookie for this with the current datestamp only to the 1-day accuracy, which expires ~32 days after it's set (but rounded to 12-hour accuracy), and is replaced constantly).
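For anyone who wants to see the shape of this, here is a minimal sketch of a coarse last-access cookie. The cookie name, the 1-day datestamp, and the roughly 32-day expiry with 12-hour rounding come from the description above. The function name, the choice to round the expiry down, the `Path` attribute, and the standard semicolon separators are my assumptions, not the actual production code.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def last_access_cookie(now: datetime, name: str = "WMF-Last-Access") -> str:
    """Build a Set-Cookie value that carries only a day-accurate datestamp."""
    now = now.astimezone(timezone.utc)
    day = now.strftime("%Y-%m-%d")  # 1-day accuracy, no per-user identifier
    # Expire about 32 days out, rounded down to a 12-hour boundary
    # (assumption: the source only says "rounded to 12-hour accuracy").
    expires = now + timedelta(days=32)
    expires = expires.replace(hour=(expires.hour // 12) * 12,
                              minute=0, second=0, microsecond=0)
    return f"{name}={day}; Expires={format_datetime(expires, usegmt=True)}; Path=/"


print(last_access_cookie(datetime.now(timezone.utc)))
```

Because every visitor on a given day gets the same value, the cookie can tell the server "this browser was last here on day X" without telling it which browser it is.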
There's another more-recent one we use that's explicitly about differential privacy, which sends back info on the hashes of the 10 most recent URLs you've visited on the site, IIRC. None of them are unique tracker hashes for a given user, though.
That seems highly unlikely to never be unique for some users.
> "Because I can't see a user journey."
Today's worst offender was trying to follow an invite and register an account within a company account, it took me two goes through the signup form, two attempts to edit my profile, trying to login to three different subdomains, one useless search of their documentation site, three error messages, two rounds of asking my coworker to check their admin side, before I saw the right thing to do. "You" (big silicon valley company) don't need tracking cookies to "see a user journey" or to tell you that "I really love your site because I keep coming back to it and looking around", you need to grab someone in a hallway and push them through the workflow and watch them fall on their face over and over.
> "I think for a commercial site, where a degradation of 10% in user experience can tank the business"
Imagine how amazing websites would be if that were at all true. Have you seen the user experience of Amazon? or eBay? or Facebook? or 'new' GMail or new Reddit or non-websites like Teams or WebEx?
Hang on, it's not bad for me. If I visit your site, it's not because I want to participate in some kind of A/B testing (whose results you'll never tell me about). And if your site only works if I happen to have hardware X installed, you don't need analytics to tell you that your site is broken.
> European companies need to bug all users and beg for cookies.
That's nonsense. If cookies are needed for the correct operation of the site, then there's no need to beg or bug.
So your banner should be saying: "Can I please set tracking cookies that make no difference at all to the correct operation of the site? [Accept] [Reject]". Then count the number that say Accept, note that the number is approximately zero, and then scrap the banner.
Nobody wants to be part of the A/B testing; everyone wants the polished product that's the result of A/B testing.
> And if your site only works if I happen to have hardware X installed, you don't need analytics to tell you that your site is broken.
What? Yes you do, or something similar.
> If cookies are needed for the correct operation of the site, then there's no need to beg or bug.
Strictly needed. Suppose I have a dark theme slider, or a language selector, for users who don't want me to follow their OS's settings (or can't or don't know how to change their OS's settings). That's a nonessential cookie which requires the banner.
And if you disagree, your opinion is not worth a lawsuit; the only way to be sure that your use of cookies is limited to those strictly necessary is to hire a very expensive European consultant.
If you didn't pay, I'm not sure why you feel entitled to anything.
Why don't you just hash the IP-address and count unique users that way?
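A rough sketch of what that could look like, assuming a salt that is rotated and thrown away every day (the function name and the salt handling are hypothetical). Two caveats I would attach: a hashed IP may still count as personal data, since the IPv4 space is small enough to brute-force without a secret salt, and several users behind one NAT collapse into a single "visitor".

```python
import hashlib
import secrets


def count_unique_visitors(ip_addresses, salt: bytes) -> int:
    """Count distinct visitors by salted hash instead of storing raw IPs."""
    seen = set()
    for ip in ip_addresses:
        digest = hashlib.sha256(salt + ip.encode("utf-8")).hexdigest()
        seen.add(digest)
    return len(seen)


# Hypothetical policy: generate a fresh salt each day and never persist it,
# so hashes from different days cannot be linked to each other.
daily_salt = secrets.token_bytes(16)
hits = ["203.0.113.7", "203.0.113.7", "198.51.100.23"]
print(count_unique_visitors(hits, daily_salt))  # 2
```

With a rotating salt this gives daily uniques only; it deliberately cannot answer "did the same person come back next week".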
You can store information client-side, without sending it over the network, but randomly send digests back to your server.
For instance you can store a counter of the times the user went to visit the website, and randomly with a 1% probability send that counter to your server. (It's better to make it random, because if you send every +=1 you would end up being able to track users).
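A minimal sketch of that sampled counter, written in Python for brevity (in a browser the counter would presumably live in something like local storage). The class name and the injectable random generator are my own choices for illustration.

```python
import random


class VisitCounter:
    """Keeps a visit count locally and only occasionally reports it."""

    def __init__(self, report_probability: float = 0.01, rng=None):
        self.visits = 0
        self.report_probability = report_probability
        self.rng = rng or random.Random()

    def record_visit(self):
        """Increment locally; return the counter on a sampled visit, else None."""
        self.visits += 1
        if self.rng.random() < self.report_probability:
            return self.visits  # this value would be sent to the server
        return None


counter = VisitCounter(rng=random.Random(42))
reports = [r for r in (counter.record_visit() for _ in range(1000)) if r is not None]
print(len(reports), "reports out of", counter.visits, "visits")
```

The server sees a sparse sample of counter values with no identifier attached, which is enough to estimate the distribution of return visits but not to follow one user from visit to visit.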
At my work, I do a lot of statistics of user usage, but I always work to do my best not to leak PII. I'm not a security or privacy researcher, so my work is probably not great but still, the way I do it I believe is largely private:
- No unique ID sent, but a daily digest (some people send every single event to their statistics server, and thus need a unique ID to know how many times one person did one action. With a digest that already counts the actions, there is no need)
- bucketized persistent data: for example the available storage size of the device the app is running on. Sending precise value would make it easier to track digests from one day to the other and track users
- For booleans, add some white noise (because 20 booleans is enough to identify someone)
- For open-ended information (for instance the list of countries contacted by your SMS app), booleanize it (one boolean per country, cf previous line), and maybe keep a counter to know how many you didn't take into account to know whether you're still missing a lot.
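Two of the items in the list above can be sketched in a few lines. The bucket boundaries are made up, and for the noisy booleans I am using classic randomized response, which is one common way to add noise and not necessarily what the author does.

```python
import random


def bucketize_storage(free_bytes: int) -> str:
    """Report a coarse bucket instead of the exact free-storage value."""
    gib = free_bytes / 2**30
    for limit, label in ((1, "<1GiB"), (8, "1-8GiB"), (64, "8-64GiB")):
        if gib < limit:
            return label
    return ">=64GiB"


def noisy_bool(value: bool, rng: random.Random, p_truth: float = 0.75) -> bool:
    """Randomized response: tell the truth with p_truth, else flip a fair coin."""
    if rng.random() < p_truth:
        return value
    return rng.random() < 0.5


def estimate_true_rate(reported_rate: float, p_truth: float = 0.75) -> float:
    """Invert the noise: reported = p_truth * true + (1 - p_truth) * 0.5."""
    return (reported_rate - (1 - p_truth) * 0.5) / p_truth


rng = random.Random(0)
truths = [rng.random() < 0.3 for _ in range(10_000)]
reports = [noisy_bool(t, rng) for t in truths]
print(estimate_true_rate(sum(reports) / len(reports)))
```

Any single reported boolean is deniable, yet the aggregate rate can still be recovered across many users, which is the property you want for statistics.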
Yes, overall doing it with no PII requires much more work, but then Big Tech (and smaller techs like Clearview) clearly showed that any PII can and WILL be used against their users. The best way to never leak users' data remains to never have it in the first place.
My point is: You have to draw the line somewhere (and I think the GDPR line is very reasonable). If you are a business relying on having a perfect website, you can use other means like UX labs.
Users coming back more often does not imply that they are happy. There’s no substitute for actual user studies.
> I cannot test new features and see if it makes users happy so they come back more often.
> I cannot see if the site has issues on some hardware, software, language. Because I wouldn't see if users affected come back less often.
> I can't test if an introduction text at the beginning helps users discover important features. Because I can't make the connection between showing the text early on in the user journey and usage of features later on. Because I can't see a user journey.
Why don't you just ask your users about these things?
[0]: https://ico.org.uk/for-organisations/guide-to-data-protectio...
Block third party cookies by default, delete other cookies on the last tab or window closed and prompt user to save cookies on a form submit ("do not delete cookies for this domain when leaving" type of prompt, for pages with logins, settings, etc).
Also remove features that make easy fingerprinting possible, the site doesn't need to know every font I have installed, just have a "standard set" included with the browser, and use web fonts or whatever for other fonts.
(for amusement, google P3P and see what comes up first.)
There's also Privacy Badger, from the EFF. This turned up that the "Who wants to be tracked" site is using "plausible.io" to track visitors.
Is this (reasonably) possible? If you ever do a graphics project you'll find that it is pretty difficult to get a pixel perfect render. Hence canvas fingerprinting[0]. (same happens for text rendering)
The problem can come down to the silicon lottery as well as the browser[1]. If you render the same code on two different machines, with the same compiler and libraries, you won't get a pixel-perfect match. And GPUs don't match CPUs, though they can if you edit the FMA instructions. Usually the best way to ensure images are exactly the same is to compare between Macs, because the hardware is very similar and similarly binned. So Macs tend to have less distinctive fingerprints in general (to the best of my knowledge). Current canvas blockers tend to just return a value, but that can obviously be a fingerprint itself.
So my question is if this is reasonably solved? I don't see it by being just the browser unless they can specifically just block a lot of that tracking. Which may require a big actor like Google or Apple to make a stand.
(Note: not an expert, but have done a decent amount of visualization work)
[0] https://privacycheck.sec.lrz.de/active/fp_c/fp_canvas.html
[1] https://stackoverflow.com/questions/47696956/display-pixel-p...
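To make the floating-point side of the rendering comment above concrete, here is a small sketch showing two effects that can make "the same" arithmetic come out differently: summation order, and fused multiply-add (emulated here with exact fractions, since a fused operation rounds only once). This illustrates one contributing cause only; it does not reproduce canvas fingerprinting, and the specific numbers are chosen by me to expose the effect.

```python
from fractions import Fraction

# 1. Summation order changes the rounded result.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)
print(left_to_right == right_to_left)  # False

# 2. a*b + c with two roundings (separate multiply and add) versus one
#    rounding (what a fused multiply-add does, emulated via exact Fractions).
a, b, c = 1 + 2**-30, 1 - 2**-30, -1.0
unfused = a * b + c
fused = float(Fraction(a) * Fraction(b) + Fraction(c))
print(unfused, fused)
```

A renderer that performs millions of such operations per frame only needs a handful of last-bit differences to flip a pixel, which is the kind of variation a fingerprinting script can hash.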
This is making it much less acceptable to allow first-party cookies and scripts.
Websites want you to explicitly reject consent for using your data. There are many untrustworthy places on the internet, so trustworthy places want to explicitly ask because they don't want the user to reject consent to them just because they are being lumped with the other sites.
Now if you had a browser setting that gave consent by default and allowed users to deny consent that is something that would reduce the need of sites constantly asking you for consent. But any attempt to automatically reject consent is just going to result in sites asking for consent via another channel. See what happened to the do not track header when you try and lump all sites together.
Did you forget what happened the last time 'the browser was handling it'? Let me remind you: Internet Explorer, Do Not Track.
Server owners can track you with the data they collect, client side can have little control of that. I wonder if there’s a better way possible in the current iteration of the web platform, or if a substantial overhaul is necessary for privacy respecting services.
Perhaps DNT would’ve been more effective if it was written in law? To gather user information for marketing purposes, you must respect this header. But then, how’d you enforce that… other problems: if browsers always set DNT to true by default, then the whole effort is pointless, because nobody will opt-in. This is ideal, but marketers will definitely not like that idea. Thinking out loud here.
EDIT: found an interesting HN thread from 2017 https://news.ycombinator.com/item?id=14377877
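For what "respecting this header" would mean on the server side, a minimal sketch: check the request headers before doing any tracking. `DNT: 1` is the Do Not Track signal, and `Sec-GPC: 1` is the header used by Global Privacy Control. The function name and the policy of treating either one as a hard opt-out are my assumptions.

```python
def tracking_allowed(headers: dict) -> bool:
    """Return False when the request carries an opt-out signal.

    Hypothetical policy: either DNT: 1 or Sec-GPC: 1 disables tracking.
    """
    normalized = {key.lower(): str(value).strip() for key, value in headers.items()}
    if normalized.get("dnt") == "1":
        return False
    if normalized.get("sec-gpc") == "1":
        return False
    return True


print(tracking_allowed({"DNT": "1"}))                 # False
print(tracking_allowed({"User-Agent": "example"}))    # True
```

The check itself is trivial; as the comment says, the hard parts are enforcement and what the browser default should be.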
2. Your proposal gives a pretty crummy experience in cases where users do expect the site to store data on the client longer. For logins you're popping up a confusing banner, and for client-side only storage (shopping carts, preferences, work in progress) you're silently discarding people's work.
An alternative would be to have a cookie jar per domain.
I never turn private mode off. I wish firefox and chrome also worked this way.
Browsers can (and do in the case of Firefox and other privacy respecting browsers) try to make it harder to track you, but it's not something they can just unilaterally turn on or off.
Consent dialogs are about what sites do with the information they get about you, not just about what information they get.
We had a Do-Not-Track header once. It did not play out very well.
It may have had good intentions, but entities such as Microsoft would just set their browsers to accept all cookies by default anyway, only a marginal group of people would know how to turn that off, and even they couldn't be sure that a proprietary browser doesn't still accept and send cookies without telling the user.
So as usual the good intentions have turned into a cat-and-mouse game in the technical, grassroots realm. There are browser extensions that will just kill these consent dialogs automatically, and websites try (luckily, not very hard so far) to work around the kill scripts. Everybody suffers.
No such regulation has ever existed in the EU.
Yep, I'm just not sure if it was general incompetence or just plain lobbying by third parties to do so.
Some popular browsers are supposedly "open source", yet it appears no browser user longing for saner times has ever tried reversing this change and recompiling the browser for their own use. No third party cookies by default.
The most fascinating thing IMO about so-called "modern" browsers is that even when their vendors publish source code, "99.9%" of people will not even attempt to make changes, even something as simple as changing a default from "on" to "off". It's like the software is stamped with "Read Only" or "Do Not Touch" and "99.9%" of people dutifully obey. The "0.1%" appear to be very conservative with the changes they make.
For example, if it was possible to disable auto-loading of resources, I might actually use these "modern" graphical browsers for tasks other than commercially-oriented transactions. Cookies are only one problem with these browsers.
One project I was responsible for was the development of a distributed file system for AIX. The goal was a distributed file system that addressed some of the weaknesses found in other distributed file systems at the time. Our chief competitor was Sun's NFS distributed file system. NFS was a really nice design. It was well integrated into the operating system and quite reliable because it utilized a (mostly) stateless server. This had a number of performance and security implications along with some file system semantics over NFS that didn't match local file system semantics. We wanted to introduce state for the server to address these issues and thought of a number of complex protocols to manage it in the presence of unreliable clients. That's when I thought up the idea of making the clients keep their own state to be restored when they reconnected to the server. I protected this state from manipulation by the client by encrypting it. I didn't call them cookies, I called them tokens.
This design was patented by IBM and I was one of the two inventors on the patent. This patent was owned by IBM and years later they gave a special award for this patent because it decided that it was one of IBM's most important patents. (They wouldn't have done this unless the patent had held up to scrutiny or legal challenges). Unfortunately, by that time I had already left IBM to start my own company--I was at the top of my game and had confidence that I could create a software product of some kind that would be successful--so I missed out on the financial award for the patent. By then, I was at my new company and already in competition with IBM.
By now, the patent should be long expired. Interestingly, IBM ended up buying my company around seven years after I and a partner started it.
I was very aware of the academic literature and industrial practice during this time so I do believe that my invention does reflect original work that ended up with a very significant impact.
From a more personal perspective, the invention didn't financially benefit me. The work that I did at my own company was more creative, inventive, technically impactful, and financially important to me. For example, Austin Ventures has indicated that my company was the start of Austin becoming an important high-tech location, but none of that was related to the cookie.
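The token idea in the story above (the client holds the state, the server protects it from tampering) can be sketched briefly. The original design used encryption; since Python's standard library has no cipher, this sketch swaps in an HMAC signature, which protects against manipulation but does not hide the contents. All names here are hypothetical and this is in no way the patented implementation.

```python
import base64
import hashlib
import hmac
import json


def issue_token(state: dict, key: bytes) -> str:
    """Serialize server state and attach a tag only the server can compute."""
    payload = base64.urlsafe_b64encode(json.dumps(state, sort_keys=True).encode())
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return f"{payload.decode()}.{tag}"


def restore_state(token: str, key: bytes):
    """Return the state if the tag verifies, else None (tampered or foreign)."""
    payload, _, tag = token.rpartition(".")
    expected = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(tag.encode(), expected.encode()):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))


server_key = b"demo-key-not-for-production"
token = issue_token({"open_files": 3}, server_key)
print(restore_state(token, server_key))        # {'open_files': 3}
print(restore_state("x" + token, server_key))  # None
```

The server stays stateless between connections: everything it needs comes back from the client, and the tag tells it whether to trust what came back.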
In the end, it's definitely been used in excessive and somewhat intrusive ways. I also wish that browsers had better controls over second and third party cookies and tracking (similar for nested iframe) in order to bubble some of it closer to the surface. In the end, Pi-hole, uBlock Origin and Privacy Badger go a long way to limiting this.
What you are objecting to is specifically the part of that law which mandates consent for gathering personal data, and which resulted in these cookie banners. That's annoying (mostly because a lot of media companies would rather continue hoovering up data instead of critically assessing the need to do so), but it doesn't invalidate the better parts of the GDPR. Calling it a complete failure is unnecessarily hyperbolic.
What is unfortunate is that the law puts the onus of implementing the dialog on individual sites; it should be a feature of browsers. This way users could enforce from browser settings that they don't want to be tracked.
And the consent banners are industry's own inventions. They could have honored do-not-track and be done with it. Instead, they opted for these dark patterns.
[Accept all tracking] [Ritual Hazing]
Because that's really the question here. Would I prefer everything in my life was free with no strings attached? Obviously.
But youtube/news/reddit/facebook/literally any website needs to make money somehow, and non tracked ads pay ridiculously lower amounts (90+%). So the real question is, would you trade what is essentially anonymous tracking (literally no one at Google gives a damn who you are, it's just a unique id fed into an algorithm) for content, or would you rather pay for each and every site on the internet?
I used to help out with the technical infrastructure for Wikipedia, and even as the 8th most trafficked site on the internet the cost was radically lower than that of some of my day-job customers with comparable levels of traffic. A non-trivial part of the reason for that was that it didn't need all the surveillance and realtime per-request bespoke generation that advertisement-encumbered services require. Part of it was just that there is an awful lot of fat in big commercial operations, since they have enough revenue that it's not a concern. Some is probably because when you're chasing ad dollars the extra spend required to get a 30 ms response instead of a 60 ms response might be justified on the basis of the marginal clicks it generates, but those costs shouldn't properly be considered part of the underlying distribution costs.
I think "yes, I'd be happy to pay for all of it" would be significantly more attractive if paying meant just the component of the cost that actually went to the authors of the material -- on the order of a hundredth of a cent per view. Here is $1, that'll cover the authors' share of all my household's YouTube viewing for the next several years.
But that’s not what the adtech web is, and that’s why it needs to die.
I’d be absolutely delighted to see 99% of ad funded content and services wiped off the face of the web tomorrow, even if a lot of it is stuff I use and enjoy. Which is why I’m bullish on “nuclear options” for regulation such as binding DNT to GDPR, or making opt out default and not even allowing pop-up opt in, or suing a few large companies into bankruptcy for using consent dark patterns or other slight violations, just to make an example.
Because it’s not that ads would be banned by that. It’s tracking and micro targeted ads. And while ads may be necessary to provide “free” services, we need to get away from the idea that the only ads worth showing are ones with precise targeting and advanced fraud countermeasures.
My guess is about the same 94% / 6%. But I'm not sure it's the 6% that represents lizardman.
> “We value your privacy…”
There's a special place in hell for the originators of this Orwellian triple-speak. Three meanings overlapped: meant to give users the impression that "we respect your privacy" or "we protect your privacy"--but it doesn't say that. It says they value your privacy, which is literally true; they're happy to put a dollar amount on it and sell or mine it, because it's of value to them. That's two true senses; one inferred and one real. But the second gives rise to the third, Orwellian meaning: these people actually don't value your privacy at all. It's a lie, pure and simple. A special kind of lie that has such a flippant, corporate-speak, technically-accurate dry meaning that is meant for a court of law--it holds up when lawyers start arguing long technicalities before a judge--but just one level deeper it is despairingly careless and dystopian.
When I see this phrase I just know I don't want whatever it is these people are going to do with my data. They benefit and I don't.
See e.g. "fat free" labels on sugar-candy.
Maybe people just go for the cheapest option available because increasing inequality has left millions with no disposable income.
This is the same logic by which people prefer cheap furniture over high-quality furniture. It is not a preference for cheap; it is just what they can afford. If they had more money, most people would get the higher quality version and pay the price.
No disposable income would mean not being able to spend money, meaning never buying anything. If this were true, the ad industry would very quickly go bankrupt. And yet the revenues of Google and Facebook indicate otherwise.
In other words, what does it mean to be "tracked"? How was the question worded? You can change this value from 1% all the way up to 99% depending on how you ask the question, because "being tracked" is open to interpretation.
Truly a useless poll.
The reason why I didn't define "tracked" for the people taking the poll is that I think that's the most representative way to replicate the question that consent boxes are theoretically asking, but in an abstract way out of the context of a specific website. When a user sees that consent box, they have little to no idea of what tracking is actually happening.
As far as I understand the GDPR, there should be no downside to rejecting the tracking technologies. And websites relying on advertising could do so with unpersonalized ads (granted, with very different metrics).
Nevertheless, to me it feels like these paid vs tracked offers are not what the authors of the GDPR intended.
What I don't understand is that it is okay to refuse the service if you don't consent to the ad-supported tracking version. I mean, not giving your consent means you have to pay if you want to use the service. Doesn't feel like you have a free choice (pun intended).
I think it works similarly in real life as well. You don't mind if a store clerk recognises you coming back to a shop. But you would be kind of creeped out if he also knew where and when you last visited your dentist.
And sharing with anyone else is right out.
In most cases sites could just use necessary tracking without permission but seek explicit opt-in for the rest. They won't do it that way of course, because most people won't opt in unless nagged or conned into doing so.
(the other reason for consent questions where they are not really needed is people playing safe lest they have to defend the lack of a consent form in legal court or the court of public outrage)
0 https://www.destroyallsoftware.com/talks/the-birth-and-death...
There were "accept all" and "manage" buttons, and under the "manage" button there were 100s (yes, multiple 100s) of sites listed, each with consent on by default, and to disable everything I had to manually scroll through it all and click on each item.
From formulation stage, it allows two parallel realities. One in which "consent" or whatnot have an intuitive meaning. Politicians and activists live here. One in which "consent" is a specification for a compliant "3rd party solution." Businesses live here.
Once functional, the bank, website or other complying business can claim the rules were made up by the regulator, and the regulator can either claim to be successful or blame businesses for problems. Being unremarkable, by copying or using 3rd party compliance plugins, is more important than respecting the regulation's intent.
By the time consumers are affected, "consent" has a technical specification. It no longer has much actual meaning.
Think of the hoops, paperwork, informed consent signatures and such that you deal with at a bank. All those things had some regulatory logic behind them initially. Employees at the bank think this is government-required paperwork. What it is is compliance.
Disclosure: I am a privacy researcher and co-founder of GPC.
[1] https://globalprivacycontrol.org/ [2] https://gpcsup.com/
There is too much info out there. Context about the user shrinks the ever growing info sphere.
Many sites seem to be offering a choice of [Accept] or [Settings]. I've never clicked on [Settings]; it might as well lead to a goatse, as far as I care. I don't want to choose options on a cookie banner at all, so why would I opt to see a completely separate page of options?
yet nothing is done about it, as if it is not important, as if digital tech is not the main, if not only, "driver of future economic growth".
you cannot build a stable economy (well at least not of a democratic, free market kind) relying on complete ignorance and regulatory capture.
anybody who has leverage on the matter and is not doing something about it is complicit in a gigantic, generational level misallocation of resources that will eventually have to be written off.
(I made the number up, but I'd be very surprised if I am off by magnitudes).
My gut feel is that most users don't understand what "tracking" really means and simply want to enjoy digital experiences without having to engage with the "what & how" that comes with modern platforms. My partner (and I reckon many other "normal" people) loves the ads on Instagram and enjoys the personalisation on YouTube & Netflix.
The current solution of "be transparent and it's okay" just forces decisions onto consumers that they do not want. The "good" solution outlined in this article (accept/reject) is a big yes or a big no – a green light for everything vs a red light for nothing.
There needs to be further regulation in this space that defines legal vs illegal uses of user data beyond the principles laid out in GDPR. The regulation should aim to define the "what & how", relieving the burden of choice from users and removing this "consent and it's okay" loophole for invasive use of data.