I'm going to have to disagree with the authors of the paper, here.
Whilst the information they've found may not appear to be PII at first, it is very far from anonymous.
It has everything required for active fingerprinting of individual devices - namely, the unique IDs of the computer's hardware. Things that don't regularly change, and things that may show habits.
Combining this dataset with another is all it would take to go from pseudonymous records to known individuals. Even on its own, there is enough information to uniquely fingerprint most users.
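To make the linking step concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the record layout, the serial numbers, the "warranty" dataset and the function names are assumptions, not the file format described in the paper. It only shows that a set of hardware serials is a stable key, and that one outside dataset sharing that key is enough to attach a name.

```python
import hashlib

def fingerprint(serials):
    """Derive a stable, order-independent ID from hardware serial numbers."""
    joined = "|".join(sorted(s.strip().upper() for s in serials))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()[:16]

# Hypothetical "anonymous" records: no name, just serials and habits.
telemetry = [
    {"serials": ["DISK-123", "NIC-AB"], "boot_times": ["08:01", "08:03"]},
    {"serials": ["DISK-999", "NIC-ZZ"], "boot_times": ["22:15"]},
]

# Hypothetical second dataset that ties the same serials to a person,
# for example a warranty registration.
warranty = [
    {"owner": "alice@example.com", "serials": ["nic-ab", "disk-123"]},
]

def link(records, known):
    """Attach an owner to each record whose fingerprint appears in `known`."""
    owners = {fingerprint(k["serials"]): k["owner"] for k in known}
    return [(owners.get(fingerprint(r["serials"])), r["boot_times"])
            for r in records]

linked = link(telemetry, warranty)
for owner, boots in linked:
    print(owner, boots)
```

The first record picks up an owner and the second stays unknown, but it still has a fingerprint that will match the next time that machine shows up anywhere.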
From a PR perspective it sort of looks bad, but please look at the comparison table in the paper, where similar data is collected in the registry, logs, prefetch, etc. Unlike *nix, Windows does a ton of activity logging, and it has been this way for a long time. Most people know of the application, system and security logs, for example, but in the same log directory there are usually 100-200 other log files, including IE browsing related logs.
Which is why I didn't call it PII, but also emphasised that it is not anonymous.
The paper describes the format of these files, and what data can be obtained from them, including a comparison with other sources of similar information.
Recorded data includes: (1) Windows version, registration details, installed and uninstalled programs; (2) hardware devices with serial numbers; (3) process execution data (at Enhanced or Full levels only, data might not include processes that only ran briefly); (4) partition table and boot timestamps (when the system was powered on and off).
In the analyzed examples the data was available for roughly the past three months.
Ouch. That sounds a lot like personally identifiable information.
I have questions:
1) Is turning off telemetry (opt-out) effective against this?
2) How will this differ between licenses? I would be very interested to see what is collected when you have something like an E5 license and have Defender ATP and AIP turned on (I don't have that currently). I recall it sends a ton of data (>2000k DNS requests/hour for an active user, just for new connections to MS); perhaps some of that is left on disk? Would file classification with AIP (e.g. when a new document or email is created) be logged? Is it fair to assume the Win10 they tested with is not for enterprise?