This isn't evil marketers hiding in their underground lair. This is web developers, designers, and product managers gaining insight about their users. They don't package this information and sell it wholesale to advertisers. They use it to make the product better.
You are taking this -way- too seriously. The ability to have perfect information on how users interact with your product is one of the earth-shattering advantages we have as makers in the digital world. It means we can make something people want - better and faster than ever before.
I'm not saying that all use of tracking is good. All power can be used for good and for evil. But that doesn't mean that power is implicitly evil. Save your rage for when you discover someone actually being evil.
This is entirely just an attempt to get competitive analysis about their competitors at the expense of user privacy.
Yet. Genie, meet bottle.
Thinking this information will never be sold is short-sighted and naive.
When it becomes advantageous for them to do so, they'll sell it in a heartbeat and then sell it again. Somewhere in there the FBI and the NSA will start making "requests" for them to share with the government.
What begins as clever social commentary in movies like Minority Report tends to wind up as a sad fact of life years later.
It's used to allow publishers to make more money from their consumers through advertising.
They pump exorbitant amounts of money into advertising, thereby increasing the final price paid by customers without increasing product quality. Furthermore, they use these advertisements to inflate their customers' perceived value of the products, once again, without actually increasing product quality.
Marketers maximize profits, not quality.
So it's okay for them to be evil as long as we don't discover that they are? More seriously, once they have the info, how can we be sure we nail them? The best way is still to ensure that they don't get the information in the first place.
global warming, education reform, prison reform, the national debt, healthcare, literacy, food production, biodiversity, etc.
Marketing is probably the single most important career there is right now, and if there's any hope of humanity making it through the next thousand years then it'll almost certainly be due to improvements in our ability to market things rather than new technology.
I can accept that there are ideas which are critical to the survival of the human race and/or modern civilization, which require mass education, and utterly reject your conclusion.
For example, a newly minted MBA in Marketing needs to pay back school loans. Who has the money to pay them? Who has the job that will bring the most prestige and future earnings? The local farm selling grass-fed beef and organic products or P&G selling highly processed foods that are destroying the environment?
The marketer is going to take the P&G job and then spend their time and brainpower convincing people to buy even more processed crap.
Interesting point, and I agree.
> if there's any hope of humanity making it through the next thousand years then it'll almost certainly be due to improvements in our ability to market things rather than new technology.
If by our ability you mean progressives/the left, then I agree. Surely marketing per se is at best a neutral force.
As the article states, customers A and B could simply put the information they each have about identity XYZ together, without KISSmetrics having to be involved any further than in providing and maintaining the unique identity for each customer separately.
Even if it was not the intent of KISSmetrics for this to be possible (which I've a hard time believing), the chosen implementation makes it possible.
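To make that concrete, here is a minimal sketch in Python of what such a join could look like. The record layout, field names, and the `join_on_shared_id` helper are all hypothetical; the only thing taken from the article is that both customers see the same identifier for the same visitor.

```python
def join_on_shared_id(records_a, records_b):
    """Merge two customers' per-visitor records wherever the visitor ID matches."""
    shared_ids = records_a.keys() & records_b.keys()
    return {vid: {**records_a[vid], **records_b[vid]} for vid in shared_ids}


# Hypothetical data: each customer keys its own records by the ID the tracker handed out.
customer_a = {
    "XYZ": {"a_email": "user@example.com"},
    "QRS": {"a_email": "other@example.com"},
}
customer_b = {"XYZ": {"b_last_viewed": "some-show"}}

print(join_on_shared_id(customer_a, customer_b))
```

Nothing in the sketch needs the tracking vendor at join time, which is the point being made: the shared identifier alone is enough.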
"These services are using practically every known method to circumvent user attempts to protect their privacy (Cookies, Flash Cookies, HTML5, CSS, Cache Cookies/Etags…)"
They may not share information about specific users, but that doesn't mean they don't use it to sell information in some aggregate form.
but if they change their mind, there is no way I can stop them - right?
I'm highly suspicious of that claim. The only site I have whitelisted is reddit, and I found the i.kissmetrics.com cookie in with the rest. That's not to say reddit isn't using them, but I'd be surprised given their very cautious approach to advertising.
Well, I guess you can always maintain a hosts file but that's the only way I can think of.
Compare with spam. Technological solutions have reduced the problem, but to virtually eliminate it requires global law enforcement.
There is value in what they are doing, and there's absolutely nothing wrong with it. They are tracking user behavior completely anonymously.
I'm not going to make any moral judgments about their team since I don't know any of them. Even if they are on the up and up and mean the best, IMO the technology they are employing is illegitimate at best - it violates user expectations when using the internet, and IMO makes the industry more dangerous for the rest of us by giving legislators and luddites more ammunition. I am absolutely against supercookies.
I'm not sure why it's relevant that they're tracking users anonymously - the user gets to decide at the end of the day who they share their information with. To make this decision for them is presumptuous, to force them to comply despite their implied non-consent is the height of arrogance.
Suppose I installed an application on your computer that sent me the names of all the applications you ran, when you ran them, and for how long; that reinstalled itself when you attempted to remove it; and that used every technique it could to gather information about you. You would want me arrested as a hacker.
I think what they do is disgusting, and I hope to hell that stories like this kill the traction they have gained with some big-name clients.
Just because you, a human, can't look at the millions of data points and go "oh look, there's george tomlinson of 28 esperay avenue doing something we don't like" doesn't mean that it cannot be done or will not be done, or indeed is not being done already.
Some of us don't want to walk around with yellow badges, thank you. Do you imagine that the fact that the badges are only visible to those with the resources and motive to discover them, and not the average joe, is more, or less, of a motivation for privacy?
[Edit: Made the wording clearer]
I have a SaaS app and I use KISSmetrics to learn what sorts of things engage visitors and customers the most. It's helped me make critical decisions that benefit both me and my customers (by improving multi-step processes).
Actually, given what you said in that post, why do you force it on me?
Don't bother denying it -- you wouldn't know if they are getting traction if you weren't working with them.
Tracking isn't evil. Tracking people who specifically do not want to be tracked is evil.
^ according to jscheel's assessment of fair trade
that's funny, most of the web content i use most heavily was 'made' and 'distributed' without any input from the marketers / advertisers / trackers whom you are defending. not all tracking is evil, but the evil tracking offsets the good stuff by a wide margin. not all gun owners are evil but if you're running a company that uses that defense to justify business practices that require an invisible gun owner to be in my living room, i'm not going to be caring about how much my living room experience has improved.
It's the commercials that make non-premium TV possible in its current free-ish structure. It seems that broadcasters should run their business with the assumption that a certain number of people are going to go to the bathroom and miss some commercials. And web companies should run their businesses and still be able to provide their free-ish services if some people decide to opt out of tracking.
What if everyone went to the bathroom all the time and missed all the commercials? That doesn't seem to happen.
Well the people that make and distribute that content need information to make your experience better.
If somebody wants my feedback, they can ask for it, and if I have the time and like/value their service, I will gladly comply. Simply 'taking' my feedback doesn't sit well with me.
because they ask my permission, don't they? I mean, I can choose between not seeing their content or being tracked - yes?
If, on the other hand, you choose to deny access to your data while continuing to use the Web services in question, you would be at fault (using grandparent's definition of fair trade).
By the way, using Noscript has made me aware of something that I didn't previously know: many sites call Javascript from lots of other domains. I've seen websites with as many as 18 other domains listed on the Noscript pull down menu. And I have seen an increasing number of XSS alerts as well.
I see this as biting us in the butt sometime. Maybe not today, maybe not tomorrow, but soon, and for the rest of your life.
What's more annoying is playing the "NoScript allow roulette" game of trying to figure out which domains/scripts you have to allow for some site feature to work.
I suppose when the government gets in the game, either through direct tracking or just making laws requiring tracking companies to keep particular data for particular lengths of time, then it will be a civil liberties issue and I'll care more. But it will probably also be illegal by then to circumvent tracking.
But that's just crazy. That would be like the government demanding that ISPs keep credit card information on their customers.
The ETag mechanism lets the server return a different ETag to each user for the same piece of content. The browser stores that ETag in its cache and sends it back in a conditional request the next time it fetches the content.
I recall menus in Safari, Mobile Safari, Firefox and Chrome which listed all the databases stored, along with the name. It was in Preferences near the cookie and password management.
It looks like the 'databases' menu is no longer in Mobile Safari preferences, and now Safari 5.1 will tell you what a website is storing in general terms, but no longer details the individual databases in preferences.
Here I have a non-private session, where I have requested i.js (a second time), invoking an If-None-Match check with my non-private ETag of i.js. Opening a private session, my request to i.js does not invoke my non-private session's ETag and subsequent If-None-Match -- i.js is fetched as if my session has no memory of the URI.
In the second shot, I had closed my private session opened in the first test, and I then opened a new private session, without closing my previous non-private session. Again, my private session requests a new i.js, with no idea of the non-private session's nor the first, now closed, private session's version.
The onus is on browsers to restrict inner-private-session storage from leaking between tabs, but it could be quite messy.
Source: http://www.adobe.com/devnet/flashplayer/articles/privacy_mod...
Local storage here refers to "Flash cookies".
It works by setting a unique cache tag (the ETag in the screenshot) for each user on a resource such as an HTML, JPG, or GIF file. Later requests carry that tag back, so the server can reconstruct what the user views on each site. In effect, it's a cookie.
I think it's quite brilliant as an alternative to cookies but unfortunately I can't use it as a form of cookies as they are not a HTTP standard.
More: https://secure.wikimedia.org/wikipedia/en/wiki/HTTP_ETag
Very interesting article, but the proclamation that you can't avoid it seems to go a bit too far. When my browser exits it both deletes cookies and clears the cache, which looks like it's enough to break the tracks.
I use FlashBlock, which I think is enough, because they're apparently using flash cookies to recreate regular HTTP cookies (or something like that).
Flash is a huge POS in so many ways.
HTTP cookies, flash cookies, ETags, HTML5 databases ... it just goes on and on.
Looking at the researchers' paper...
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1898390
...it's not clear that's what they're claiming. One quote is that "Even in private browsing mode, ETags can track the user during a browser session."
That suggests they may be concerned about cross-site tracking within a single private session, and the possible expectation that 'private browsing' prevents tracking from site to site. (I've never had that expectation; only that a private session is not connected to distinct prior private and non-private sessions.)
It seems their method relies on using cached javascript files to identify a user. How then are they able to track the same user using a different browser? Is it by IP address?
Flash cookies. Presumably Silverlight has an equivalent.
(And I even heard once that Windows Media Player shares cookies with IE regardless of the browser that it is embedded in.)
Also, what's with referring to ETags as a "theoretical technique never before seen in the wild"? It's pretty friggin standard.
Then the visitor's browser sends the Etag back to the server (in an "If-None-Match" header), and thus it acts as a quasi-cookie.
Which means I/we am/are either misunderstanding something, or the people who designed privacy and cache-clearing tools had a massive blind-spot.
Instead of regulating every time we see a practice that we may not agree with, how about we treat it like when the "iPhone location" fiasco broke. Do not criminalize the possession of customer data or even tracking; criminalize distribution or malicious use of it. If Company A wants to know where I came from, so that they can spend their ad dollars effectively, I am ok with it. But do ensure that they don't share it with other companies in that network (whether Kissmetrics or someone else) for any reason. My online identity remains my own; it does not need to be dissected for further analysis by doubleclick, kissmetrics et al.
I bought this computer. I pay for my internet connection. And someone like KISSMetrics wants to spy on me using MY stuff?
To profit from MY computer tracking me against my express commands? Incognito mode, cookies turned off and they're tricking my computer into tracking me?
These are people who have lost all perspective of what's right and wrong.
Analytics is a solved problem, there's no innovation here, there's cookies and a way of opting out of it. If regulation is what's needed to stop scum like Kissmetrics from violating my privacy, then regulation's what's needed.
You may pay for your computer and internet connection but not for the (vast majority of) sites you visit. This popular sense of entitlement is problematic when "your" stuff lives on 3rd party servers running 3rd party software that you're not paying for.
The first sites to exploit this were, as always, porn sites. They used Etags in referral tracking to avoid webmaster fraud. (the webmaster would have to include a script from the affiliate co which would set an Etag).
You know what is more interesting? The Last-Modified header. The HTTP spec says that you are supposed to put a date in there, but it also says not to bother parsing the date if you are a client since date parsing is such a pain in the ass. So clients just copy the date string, store it, and then replay it in subsequent requests.
you can put whatever the hell you want in a last-modified field and all browsers will just store it and then replay it later in subsequent requests to the same resource. for eg.
initial request:
GET /_modified_test HTTP/1.1
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
Cache-Control: max-age=0
Connection: keep-alive
Host: localhost:8888
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_6) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/14.0.830.0 Safari/535.1
initial server response from my dev server (note Last-Modified header used):
HTTP/1.0 200 OK
Server: Dev/1.0
Date: Sat, 30 Jul 2011 11:48:25 GMT
content-type: text/html; charset=utf8
Last-Modified: random_token_i_set
Cache-Control: no-cache
Expires: Fri, 01 Jan 1990 00:00:00 GMT
Content-Length: 1634
subsequent browser request to the same resource:
GET /_modified_test HTTP/1.1
Host: localhost:8888
Connection: keep-alive
Cache-Control: max-age=0
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_6) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/14.0.830.0 Safari/535.1
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
If-Modified-Since: random_token_i_set
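The client-side behaviour in that exchange can be sketched in a few lines of Python. This is only a toy model of the replay step, not how any particular browser is implemented; `conditional_headers` is a made-up helper name, and the header values are the ones from the dump above.

```python
def conditional_headers(cached_response_headers):
    """Build revalidation headers by copying stored validators back verbatim.

    Nothing here parses or validates the values, which is why an arbitrary
    token survives the round trip.
    """
    # HTTP header names are case-insensitive, so normalise before lookup.
    stored = {name.lower(): value for name, value in cached_response_headers.items()}
    request = {}
    if "last-modified" in stored:
        request["If-Modified-Since"] = stored["last-modified"]
    if "etag" in stored:
        request["If-None-Match"] = stored["etag"]
    return request


first_response = {
    "Last-Modified": "random_token_i_set",
    "Cache-Control": "no-cache",
}
print(conditional_headers(first_response))
# {'If-Modified-Since': 'random_token_i_set'}
```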
with new webapps now being single-page with either hashchange or pushstate support, it means almost all requests are made on the backend to the same resource, so you can track the user across all pages on the entire site and across other sites.
concerning, but a known problem. even with these headers patched there is still a lot of information that can be used to fingerprint clients (ie. having everything switched off is still a fingerprint that makes you unique). I don't think chrome, safari, IE or Firefox will ever implement these advanced features, it will be up to somebody else to release a browser that is more privacy aware or to maintain a plugin that is.
I wrote a plugin that does this, but a lot of information still leaks through (it is in my github but I haven't released/announced it in any way). I am contemplating just forking webkit and doing a whole separate 'privacy aware' browser but haven't found the time. in short, the browser makers know about this, and have known about it for years - there is just no real interest in providing user tools to fully anonymize users.
Edit: if anybody is interested in the plugin it is here: https://github.com/nikcub/Parley
it blocks all third party requests and provides other features. it works, just needs a bit of a clean up and release.
But even if browser makers were to check the "If-Modified-Since" value against that format, it would still be doable to give each visitor a slightly different date and track them that way.
Combining the date stamps on 2 or 3 jpg or css files present on every page on the site should give you enough entropy for even the highest traffic sites and make it very hard to detect.
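As a rough illustration of how much room that leaves, here is a sketch in Python that hides an identifier in perfectly valid HTTP dates. Every constant (the base date, 20 bits per file, 3 files) and every function name is an assumption picked for the example, not something taken from a real tracker. With these numbers each date varies within about 12 days and the three together carry 60 bits.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime

BASE = datetime(2011, 1, 1, tzinfo=timezone.utc)  # arbitrary starting date (assumption)
BITS_PER_FILE = 20  # 2**20 seconds is roughly 12 days of plausible timestamps
N_FILES = 3         # e.g. two images and a stylesheet present on every page


def is_http_date(text):
    """Lenient format check: does the value parse as a date at all?"""
    try:
        parsedate_to_datetime(text)
        return True
    except (TypeError, ValueError):
        return False


def encode_id(visitor_id):
    """Split visitor_id into chunks and store each as a seconds offset from BASE."""
    if not 0 <= visitor_id < 2 ** (BITS_PER_FILE * N_FILES):
        raise ValueError("visitor_id does not fit in the available bits")
    mask = (1 << BITS_PER_FILE) - 1
    dates = []
    for i in range(N_FILES):
        chunk = (visitor_id >> (i * BITS_PER_FILE)) & mask
        dates.append(format_datetime(BASE + timedelta(seconds=chunk), usegmt=True))
    return dates


def decode_id(dates):
    """Recover the identifier from the replayed If-Modified-Since values."""
    visitor_id = 0
    for i, text in enumerate(dates):
        offset = int((parsedate_to_datetime(text) - BASE).total_seconds())
        visitor_id |= offset << (i * BITS_PER_FILE)
    return visitor_id


stamps = encode_id(123456789012)
print(stamps)
print(decode_id(stamps), all(is_http_date(s) for s in stamps))
```

A format check like `is_http_date` rejects `random_token_i_set` but accepts every value `encode_id` produces, so validating the header would only force trackers to be a little more subtle.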
The cunning part technically is their repurposing of "etags". These aren't that widely known about but it's a mechanism by which you can ask a webserver "I've already downloaded this file before, has it changed?". Typically the etag will be a revision number, or a hash of the file. The header to create one looks like this:
ETag: "686897696a7c876b7e"
And then in future requests your browser will include this header:
If-None-Match: "686897696a7c876b7e"
If the file hasn't changed since you last downloaded it, you get a 304 Not Modified. Given that you can store absolutely arbitrary data in the ETag, it's easy to see how this can be used to track users (and the same applies to the Last-Modified header, which is treated exactly like an ETag by your browser despite containing a date).
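On the server side, the trick could look roughly like the Python sketch below. It is a guess at the general shape rather than anyone's actual code: `handle_request` is a hypothetical function, header lookup is simplified to exact-case dictionary keys, and the `Cache-Control: no-cache` header is an assumption added so that the browser revalidates on every visit.

```python
import secrets


def handle_request(request_headers):
    """Return (status, response_headers, visitor_id) for one tracked resource."""
    presented = request_headers.get("If-None-Match")
    if presented:
        # Returning visitor: the cached ETag itself is the identifier.
        # Answer 304 so the browser keeps the cached copy and its ETag.
        return 304, {"ETag": presented}, presented.strip('"')
    # First visit, or the cache was cleared: mint a fresh identifier.
    visitor_id = secrets.token_hex(9)  # 18 hex characters
    headers = {"ETag": f'"{visitor_id}"', "Cache-Control": "no-cache"}
    return 200, headers, visitor_id


# First request: no validator, so the server assigns an ID.
status, headers, first_id = handle_request({})

# Second request: the browser replays the ETag and is recognised.
status_again, _, seen_id = handle_request({"If-None-Match": headers["ETag"]})

print(status, status_again, first_id == seen_id)
# 200 304 True
```

A request that arrives without `If-None-Match` gets a brand-new ID, which matches the observation elsewhere in the thread that clearing the cache breaks this particular trail.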
But when companies do it to people, oh it's just a clever programming trick, and it's not a problem because you could install additional software to prevent it from happening [3].
The law is making it pretty clear that simply because you can access a computer system does not mean that you may, and indeed that doing so without the user's permission is a crime. Causing a computer to store data on a user and then serve that data back to another computer seems dodgy without permission. Doing it when the user has taken reasonable steps to prevent it from happening? Class action time!
[1] http://www.techdirt.com/articles/20110722/02351315202/how-ci...
[2] http://www.geek.com/articles/geek-pick/aaron-swartz-spent-mo...
Edit: seems so: snip ... the persistent tracking can only be avoided by erasing the browser cache between visits.
I don't think arms race is a good analogy here. Arms race is a good analogy for virus-makers and antivirus software, since their goals are exact opposites.
The goal of analytics sites like KISSmetrics is to measure and understand the behavior of their customers as a group, not as specific individuals. The goal of people who wish to remain untracked is to avoid having personally identifiable information about them stored without their consent. These goals are not opposites and don't necessarily result in an arms race.
we're planning to follow up with a post that has the technical details of the Etag stuff (sorry about 'light on detail', it was a press piece after all).
you're right in that it's a known method that has been written about before (samy had it in evercookie which we cite in the paper and a few others have blogged about it). what seemed new (at least to me) was actually encountering it 'in the wild' on a top50 site like hulu. if this type of thing has been written about before, definitely let me know so we can cite it.
fwiw, yes noscript would block the javascript that kissmetrics uses to respawn using html5/etags, however there's still the swf that regenerates using flash cookies. also josh highlights ways that you could do this with javascript disabled using CSS (kissmetrics actually also uses hidden values in CSS as well if you look at the src)
either way, blocking javascript/flash would render hulu, and other 'rich media' services like it, largely useless unfortunately.
RE: foxnews/polldaddy. actually they were naming their database 'evercookie' some time ago although they seem to have changed that (now it's just called pd_poll__). you can see the script they use here, which uses html5 and swf databases: http://pastebin.com/0ieZ2i22 (prettyfied from http://static.polldaddy.com/p/4424060.js )
it's likely that polldaddy/foxnews are using these techniques to ensure that a given computer only gets to vote 'once'. however, i think there are probably much better ways to do this.
hope that helps. i'll link a blogpost down here somewhere (which means that i actually have to start blogging finally ;)
I've quite a few /etc/hosts entries, blocking third party cookies, clearing cookies & cache on close, no flash cookies, and so on, but I always expect there'll be something they can find still.
Details here: http://ashkansoltani.org/docs/respawn_redux.html
Feel free to send comments/suggestions.
Also nikcub - very enlightening about the Last-Modified header! It reinforces my point that the solution to all this might not be technical but require policy guidance as to best practices, etc.
Anyways, assuming they could offer their service tracking only on a customer's site, they should be serving from a subdomain, no?