This particular one is just the latest. But the really big one (IMHO) is when they simply started to ignore the EFF[0] when they were asked about the copyright status of Copilot. If the court decides against the EFF's position, that will have a big effect on the legality and enforcement of most OSS licenses (though I'm an armchair lawyer, not even in the US). Fun times ahead.
[0]: If I remember correctly, it was the EFF who mentioned that MS stopped responding to them. I have found the lawsuit, but it was not filed by the EFF. Google is more useless by the day.
No, I haven't. Notice that MS now loves Linux... provided you run it on Azure or as a component of Windows (WSL). They adopted Chrome's rendering engine and then abused their desktop OS market share to shove the result down people's throats. They don't have the leverage they once enjoyed, but the approach hasn't changed, at least not in general.
On one hand it is quite reasonable to operate like that; on the other, it is really unethical and bad for society as a whole.
In the worst-case scenario, if you lose a game stacked in your favour several times in a row, you pay a pittance, or performatively correct a now-obsolete injustice.
VS Code telemetry will remain opt-out because it yields very valuable information. Microsoft is not a democracy, and the outcry here is less than a rounding error, a footnote in some internal director's morning agenda.
Obtaining power at any cost requires the internal director to pretend he doesn't know what he's doing.
The vast majority of social capital is made by lying to people, pretending to not know you've done it and dropping relationships with anybody who is not pulling in your direction.
Silence is vastly underrated, I say ironically, so I shouldn't be typing this out.
> Please give an answer within the next week until the 16th of June.
I wouldn't respond to them either, out of spite.
Also remember he's not talking to a human, but to a soulless corporation. He was as cordial as could be given the circumstances.
And finally, remember that it doesn't matter if a product Microsoft develops to increase their control over developers (via vendor lock-in, mindshare, and forced telemetry) happens to result in a decent free text editor for the user. No one owes them gratitude. This isn't charity.
P.S. Did you know VS Code lets extensions not respect the user's "no telemetry" choice? It's been an open ticket for about 4 years now, one that MS has no intention of ever fixing, even though all it would take is a simple VS Code extension store EULA change.
Last time I had to use that sort of language was with a deranged ISP who had failed to deliver an internet connection, then decided to chase a debt for unpaid bills for this non-existent connection two years later.
But you're right, it could be any one of a number of them! I had problems with quite a few over the years.
Among the amusing misfeatures of Bulldog Broadband was their cancellation process, which required confirming by sending an email to "cancellationconfirmation@..."
Said cancellation confirmation address had not been set up and would just bounce.
But if they respond at all using other channels, you probably still have enough.
[1] https://commission.europa.eu/law/law-topic/data-protection/r...
Last year we had an argument about this regarding LibreOffice, where an option to collect some telemetry was proposed as a nagging opt-in. Opponents argued against it because some fraction of our users would press Accept just to get through the installation, or without understanding what they were accepting; plus we just didn't want this kind of mechanism in a respectable piece of software. For now the idea seems dead in the water.
Get a global firewall for outgoing connections.
At best, they raise it to their internal legal contact. The in-house lawyer rapidly advises them not to respond in any written or recorded medium. Issue goes nowhere.
At worst, they realize that this is a hairball with "vaguely legal stuff" and decide to review some other issue instead for a more productive and less stressful day. Issue goes nowhere.
And that method is not a github comment.
The commenter might have followed the correct route to complain too, but could then at least have said that "I have contacted microsoft at [..] as outlined in the Personal Data Protection Policy and expect a response within [..]"
That said, consent is not the only grounds on which you can process PII. Contract, legal obligation, vital interests, public task, or legitimate interests are also valid grounds. Of these, legitimate interests is the most applicable in this situation.
Yes, it's PII, which is of course why no one who does telemetry in a GDPR-compliant way would store the IP address. The fact that it's "sent" (you can't send anything at all over HTTP without it) isn't relevant. Only what's stored, for what reason, and for how long.
> If you add anything to link two subsequent telemetry reports together, that thing is PII (e.g. a hash or a uuid)
Again, no. PII is only information about physical people. Unless the data becomes sufficient to identify a person (in itself or together with other data), the data is not PII. Having a browser history associated with a random GUID might be PII (because the browser history might pinpoint the user, not the GUID!). But having a random GUID associated with, say, "has run VS Code 12 times this year" is not.
No, telemetry is not something MS needs to fulfil the primary purpose of VS Code. The best example is that the OSS version is there, without any telemetry enabled by default, still doing by and large the same job.
The OSS version obviously benefits from the telemetry (to the extent telemetry is useful) because it's downstream of the version developed based on the telemetry.
Haha sorry I couldn't continue past that! Neeeiiigggh!
> MacAddressHash - Used to identify a user of VS Code. This is hashed once on the client side and then hashed again on the pipeline side to make it impossible to identify a given user. On VS Code for the Web, a UUID is generated for this case.
A hash of a hash is no less identifying than a single hash: it still uniquely identifies a machine, tying telemetry events to a specific user's machine. Microsoft's own telemetry description generator calls the field "EndUserPseudonymizedInformation". Pseudonymisation is inherently not anonymisation.
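A tiny Python sketch (my own illustration, not Microsoft's pipeline) of why re-hashing doesn't help: any deterministic function of a stable hardware ID is itself a stable ID, so events from the same machine can still be joined together.

```python
import hashlib

def double_hash(value: str) -> str:
    """Hash once "client side", then again "pipeline side"."""
    first = hashlib.sha256(value.encode()).hexdigest()
    return hashlib.sha256(first.encode()).hexdigest()

# The same MAC address always maps to the same final ID, so the
# result is a stable pseudonymous identifier, not anonymous data.
mac = "00:1A:2B:3C:4D:5E"  # made-up example address
assert double_hash(mac) == double_hash(mac)
```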
This bullshit is why I keep my PiHole on for my dev environment.
It's important though, if you e.g. have multiple products, to use a _different_ pseudonymization (hash salt or whatever) for each; otherwise you run the risk of linking too much data on a user, de-pseudonymizing them in the worst case even though no individual app does. Having a user's behaviour across multiple applications could pose such a risk in extreme cases.
Edit: I think it's important to separate plain "hashing" from salted hashing. A properly hashed identifier uses a salt that is generated on the client, so that it can't be used to identify the user. Basically: the first time the app runs, you generate a random salt which is only stored on the client and NEVER sent in telemetry. Anything you would like to transmit over the wire that would risk identifying the user (e.g. a computer name, a MAC address) you hash with this local salt. This way no one can go to the database on the server side and try to match any data, e.g. check whether the hash abc123 matches the computer name jimbob because hash("jimbob") = abc123. Just sending hash(MacAddress) without a local random salt would NOT be properly pseudonymous, because an attacker on the server side could ask and answer the question "Does this come from this particular MAC address?".
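A minimal sketch of that scheme (the salt file path is hypothetical, and the salt never leaves the client):

```python
import hashlib
import os
import secrets

SALT_FILE = "telemetry_salt.bin"  # hypothetical local-only path

def get_local_salt() -> bytes:
    """Generate a random salt on first run; store it only on the client."""
    if os.path.exists(SALT_FILE):
        with open(SALT_FILE, "rb") as f:
            return f.read()
    salt = secrets.token_bytes(32)
    with open(SALT_FILE, "wb") as f:
        f.write(salt)
    return salt

def pseudonymize(identifier: str) -> str:
    """Hash an identifier (e.g. a computer name) with the local salt.

    A server-side observer cannot test guesses like hash("jimbob"),
    because they never see the salt."""
    salt = get_local_salt()
    return hashlib.sha256(salt + identifier.encode()).hexdigest()
```

Because the salt is random per install, even two users with the same computer name produce different pseudonyms, and nothing server-side can be matched back by brute force against known names.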
I think the massive amounts of behaviour analysis Microsoft does should be considered PII. They know when you turn on Visual Studio in the morning, and when you leave. They know when you go to lunch and don't click any buttons for a while, and they can see the colleagues with you in that boring meeting also not clicking any buttons at the same time. This type of behaviour analysis over time can associate you and the people you interact with, even if it's not directly tied to a reversible hardware ID.
This is why pseudonymisation isn't anonymisation, and why pseudonymisation isn't sufficient to comply with laws like the GDPR.
If the behaviour analysis was done without identifiers at all, you could say they're just counting button clicks, but they intentionally associate this data with your stable personal identifier for analysis over time.
MAC addresses aren't that big of a collision space either; any consumer GPU can generate a list of all hardware MAC addresses in use in a reasonable amount of time. MAC addresses may theoretically be 2^48 in size, but most of that space hasn't been assigned to vendors yet. It takes about 12 minutes to reverse any given MAC address on a single rented cloud GPU. The double hashing should take about twice that time.
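A rough illustration of why the search space is so small (a naive CPU version, using a made-up vendor OUI): each assigned vendor prefix leaves only a 24-bit device portion, so exhaustively hashing candidates under known OUIs is cheap.

```python
import hashlib

def hash_mac(mac: str) -> str:
    return hashlib.sha256(mac.encode()).hexdigest()

def reverse_mac(target: str, oui: str, search_limit: int = 1 << 24):
    """Brute-force the 24-bit device portion under a known vendor OUI.

    A full OUI is only 2^24 candidates; a GPU chews through all
    assigned OUIs quickly, and even this loop is feasible per OUI."""
    for n in range(search_limit):
        mac = f"{oui}:{(n >> 16) & 0xFF:02X}:{(n >> 8) & 0xFF:02X}:{n & 0xFF:02X}"
        if hash_mac(mac) == target:
            return mac
    return None
```

For example, `reverse_mac(hash_mac("00:1A:2B:00:00:2A"), "00:1A:2B")` recovers the original address after a few dozen hashes.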
The weird thing is that Microsoft intentionally chose to use a MAC address rather than a UUID like they use in their web version. If this were just a unique user token, they wouldn't need to use any hardware identifiers at all.
You're mixing up the terms pseudonymization and anonymization, though. If something is provably not PII, it is considered anonymous. Pseudonymization specifically means keeping the data as PII, but reducing the risk of misuse by making identification hard.
In practical terms, pseudonymous data is data that someone like a data scientist will only be able to link to a person by making a deliberate effort to do so, which will almost certainly mean that she KNOWS she is breaking some law. It may also mean that the link between the person and the pseudonym is stored in a locked-down database to which most data scientists (or others who may have an interest in doing the linking) do not even have access.
The GDPR does promote the use of pseudonymization as a layer of protection, and if a business does keep some PII around, properly categorizes its data as such (in compliance with Article 30 of the GDPR, with a defined "Legal Ground" for processing activities) AND properly protects the data through both "Security by Design" and "Privacy by Design" (of which pseudonymization is an important element), its legal exposure can be either completely negated or at least radically reduced if the "Legal Ground" is challenged.
Overall, though, fully understanding the GDPR is terribly difficult, as it requires significant understanding of Law (international AND local within each country covered by the GDPR), Computer Science (development AND IT security), AND Data Science.
I rarely meet people with enough understanding of all 3 to assess practices that are in the gray zone.
Lawyers (and most DPOs) tend to have little understanding of the IT or Data Science aspects, but tend to be good at stretching a "Legal Ground" to whatever is needed for the business to continue to be profitable.
Data Scientists tend to know how to de-pseudonymize data, and may even be taught "Privacy by Design" (this usually has to be forced on them, though, as it makes their job harder). Most data scientists struggle with IT security aspects, though, and would in many cases happily download all data to their laptops if they could.
Developers/engineers may understand concepts such as hashing, and even know the difference between hashed and encrypted data. However, as they live in a boolean world of True vs False, using judgement to evaluate the risk impact of some practice for data subjects tends to be alien to them. In a black and white world, this group tends to think that every bad practice is equally bad, instead of going for the "lesser wrong" or "good enough". Especially if the measures needed to be "good enough" makes the coding harder or the system slower.
Finally, IT security (the experts, not the drones) MAY have a better understanding of degrees of risk than developers, but tend to know/care less about the actual data than any other group.
And each group tends to hold the other groups to a higher standard than its own. The lawyers tend to assume that all aspects of development and infrastructure are properly hardened. Data scientists tend to interpret the "Legal Ground" to cover whatever they want to use the data for. Developers tend to think that the infra that runs their systems is fully secured by shell (perimeter) protection, and may even store "secrets" in more or less open git repos (and even if they delete them later, they don't clean up the git history or create new secrets). And networking people often do not even care about anything at the application layer or above in the networking stack.
So in practice, any large corporation will have a huge number of vulnerabilities. The only way any sensitive asset (from a privacy, intellectual property or operational stability perspective) can be considered properly protected is to have multiple layers of protection, all or most of which must fail for major incidents to happen.
If data from multiple users is aggregated, that is, I think, more what people have in mind when they say "anonymous".
Even if you have groups of size greater than k, though, information elements may be non-anonymous if there is not enough diversity within the group. For instance, if every 49-year-old male in a given postal code and a given occupation has a certain religion, then religion is non-anonymous for that group, according to l-diversity [2].
This can be narrowed down even more by t-closeness [3].
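As a toy illustration of the (distinct) l-diversity idea from [2] (my own sketch, not from the paper): every group sharing the same quasi-identifiers must contain at least l distinct sensitive values.

```python
from collections import defaultdict

def is_l_diverse(records, quasi_keys, sensitive_key, l=2):
    """Check distinct l-diversity: each group that shares the same
    quasi-identifier values must have >= l distinct sensitive values."""
    groups = defaultdict(set)
    for r in records:
        group = tuple(r[k] for k in quasi_keys)
        groups[group].add(r[sensitive_key])
    return all(len(values) >= l for values in groups.values())

# Every 49-year-old male in postcode 1234 shares one religion,
# so religion leaks for that group: the data is not 2-diverse.
records = [
    {"age": 49, "sex": "M", "postcode": "1234", "religion": "X"},
    {"age": 49, "sex": "M", "postcode": "1234", "religion": "X"},
    {"age": 30, "sex": "F", "postcode": "5678", "religion": "X"},
    {"age": 30, "sex": "F", "postcode": "5678", "religion": "Y"},
]
print(is_l_diverse(records, ("age", "sex", "postcode"), "religion"))  # False
```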
[1] https://en.wikipedia.org/wiki/K-anonymity
[2] https://en.wikipedia.org/wiki/L-diversity
[3] https://en.wikipedia.org/wiki/T-closeness
And before the army of those who don't understand the GDPR comes up with "but then the whole internet cannot work": the crucial distinction lies in the answer to the question "can this tool fulfill its purpose without this connection?" If no, then the connection is essential to its functioning and does not require consent; if the tool can fulfill its purpose without the connection, it's optional and does require consent.
The GDPR makes a distinction between connections that are required to fulfill the purpose of the tool and connections that are not essential. So VS Code connecting to a Microsoft server to, say, download an extension update is allowed and does not require consent, because without that connection VS Code cannot fulfill its purpose of providing functionality.
Telemetry is not functionality, and VS Code can fulfill its purpose without this connection, so that makes it subject to the user consent requirement.
I do not recollect seeing any opt-in Privacy prompt enabling this feature. Surely an OS can function without the internet so it's not "essential to its functioning".
Same with Firefox's captive portal check [1] that helps determine if a Wifi network requires a web-based sign-in or acceptance of terms of use.
This is misinformed. There is nothing in the GDPR that relates to "exposing" or "transmitting" anything (other than transmitting onward from a processor to a third party). The GDPR relates to how data is stored or processed. A program can make any number of HTTP requests, for any reason, no matter how unnecessary, so long as the PII (the IP, or similar) isn't stored or otherwise processed/transmitted to a third party in a way that the GDPR concerns. The download web server's logs are such a storage (which is why these days you clear those daily, or never log IPs in them at all).
> Telemetry is not functionality, and VS Code can fulfill its purpose without this connection, so that makes it subject to the user consent requirement.
No. It's required because the telemetry data is stored whereas the IP of the update request is not. Had microsoft wanted to store every IP of everyone downloading an update, then that database of IP's/downloads would of course have been subject to the GDPR too. The data isn't less sensitive just because it was from a necessary function. Microsoft's responsibility for that data is exactly the same.
But the easiest way of doing telemetry properly, and not worrying about the GDPR, is to not store anything that is PII at all. And it's pretty easy to do. Nothing is "truly anonymous"; telemetry is usually pseudonymous. But properly pseudonymous telemetry is normally not a privacy concern in any way. The true gripes about telemetry (there are a few valid ones) aren't about that; they are:
- People getting a worse experience e.g. a slower product
- People not trusting the companies to adhere to the GDPR with the data transmitted, e.g. you might not trust the server to clear IPs from the transmission (basically the only piece of PII that can't be cleared on the client side, because otherwise the packet never arrives). But if you don't trust the company to adhere to the GDPR, then why would you trust their opt-out to do anything? Running any kind of software basically means trust to some extent.
- People feeling cheated because of automatic or hidden opt-in
- People on paid internet connections spending money to send the telemetry.
The GDPR deals with "processing" and this is the definition of processing:
" ‘processing’ means any operation or set of operations which is performed on personal data or on sets of personal data, whether or not by automated means, such as collection, recording, organisation, structuring, storage, adaptation or alteration, retrieval, consultation, use, disclosure by transmission, dissemination or otherwise making available, alignment or combination, restriction, erasure or destruction; "
Note the "transmission, dissemination or otherwise making available".
Microsoft trawls their[1] endpoints mercilessly for every bit of telemetry that they possibly can, and they go out of their way to prevent customers from disabling this.
Windows 10 or 11 with Office requires something like 200+ individual forms of Microsoft telemetry to be disabled!
Notably:
- They keep changing the name of the environment variables[2] that disable telemetry. For unspecified "reasons".
- They've been caught using "typosquatting" domains like microsft.com for telemetry, because security-conscious admins block microsoft.com wholesale.
- Telemetry is implemented by each product group, which means each individual team has to learn the same lessons over and over, such as: GDPR compliance, asynchronous collection, size limiting, do not retry in a tight loop forever on network failure, etc...
- Customers often experience dramatic speedups by disabling telemetry, which ought not be possible, but that's the reality. Turning off telemetry was "the" trick to making PowerShell Core fast in VS Code, because it literally sent telemetry (synchronously!) from all of: Dotnet Core, PowerShell, the Az/AAD modules, and Visual Studio Code! Opening a new tab would take seconds while this was collected, zipped, and sent. Windows Terminal does the same thing, by the way, so opening a shell can result in like half a dozen network requests to god-knows-where.
[1] You thought, wait... that it's your computer!? It's Microsoft's ad-platform now.
[2] Notice the plural? It's one company! Why can't there be a single globally-obeyed policy setting for this? Oh... oh... because they don't want you to have this setting. That's right... I forgot.
Windows: https://learn.microsoft.com/en-us/windows/privacy/configure-...
PowerShell: https://learn.microsoft.com/en-us/powershell/module/microsof...
DotNet Core: https://learn.microsoft.com/en-us/dotnet/core/tools/telemetr...
Windows Terminal: https://github.com/microsoft/terminal/issues/5331
Az module: https://learn.microsoft.com/en-us/dotnet/api/microsoft.azure...
Etc...
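For contrast, here is a hedged sketch of what the non-blocking collection mentioned above could look like: a background worker with a bounded queue that drops events instead of blocking the caller or retrying in a tight loop. The class name and transport hook are hypothetical, not any Microsoft API.

```python
import queue
import threading

class TelemetrySender:
    """Fire-and-forget telemetry: never block the caller, never retry
    forever; drop events when the queue is full or the network fails."""

    def __init__(self, transport, max_queued=100):
        self._q = queue.Queue(maxsize=max_queued)
        self._transport = transport  # e.g. a function doing an HTTP POST
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def send(self, event):
        try:
            self._q.put_nowait(event)  # drop rather than block the UI
        except queue.Full:
            pass

    def _run(self):
        while True:
            event = self._q.get()
            if event is None:  # shutdown sentinel
                return
            try:
                self._transport(event)
            except Exception:
                pass  # give up on failure; no tight retry loop

    def close(self):
        self._q.put(None)
        self._worker.join()
```

Had the products above shipped something like this, opening a shell or a new tab would not wait on telemetry uploads at all.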
This seems interesting. Do you have any references for this? I would assume that the main use of such typo-squatting domains is a simple redirect, a la [0][1].
[0]: https://gogle.com [1]: https://gooogle.com
I agree there should be a single place, at least on Windows, to control Microsoft telemetry on a per-app basis. It should be very easy to accomplish. On other platforms, less so.
In a desktop product I work on, we faced the dilemma of opt-in vs. opt-out, and of showing the query clearly vs. hiding it in settings. We ended up with the middle ground of showing it but having the checkbox pre-checked (so uncheck to opt out). We were still worried this would leave too few opting in, but it meant over 95% did.
For command line I’d be 100% happy with a note on first use describing that telemetry is enabled and how it is disabled. Leaving it disabled by default and requiring user action to enable is not realistic in such a situation.
Of course it's impossible to actually transmit anything anywhere without including the source IP in the packet headers, a fact we are ignoring completely. But that's similar to the topic of this discussion: Microsoft does exactly this under the assumption that non-PII data can be sent (even via HTTP) without the GDPR coming into play. Otherwise they couldn't have it enabled by default. If there is a ruling that says otherwise, then everyone will need to change.
It could also be that first party servers (Microsoft app talking to Microsoft servers) is acceptable and then everyone would route telemetry to their own servers.
What? Lol. How is this the users fault?
That's just dark patterns by companies to pressure users into enrolling. It doesn't have to be like this. It could be opt-in under settings, like just about anything else.
It's all about power play.
If a user asks for the software to nag people, and the developers then make the software nag properly, then it is the fault of the user for suggesting that behaviour be implemented.
>It could be opt-in under settings, like just about anything else.
Or there could be an opt out in settings like how it already works.
> If a user asks for the software to nag people and then the developers make the software start nagging proper then it is the fault of the user for suggesting that behaviour be implemented.
People are not asking for software to nag. They're asking for the software to NOT send telemetry at all unless the user agrees to it. As it stands now, vscode sends out telemetry before the user has a chance to opt out.
What people want is for software to not be hostile to the users in that way. Failing that, at least give the option before the hostile behavior begins. But really, it's not the users' fault. It's the software maker's fault for integrating that behavior in the first place and ramming it down our throats, whether we like it or not.
But no, apparently you think Microsoft needs a constant faucet of your information to prevent crashes. Golly, I wonder how developers managed before said faucets.