Will AA be able to find a single customer who has a problem with what TPG does? What is their case then exactly? Would they similarly sue the app if customers were copy pasting data into it, rather than accessing it programmatically?
FYI the title is editorialized. This suit has nothing to do with screen scraping, but just data access in general. Services like these nowadays almost always use private APIs (built for mobile clients and SPAs) rather than parse HTML.
Speculating, but TOS generally say that you're not allowed to share your password with anyone. Assuming that's true for the AA site, AA might argue that TPG is encouraging users to break their TOS ("tortious interference").
Though if the app doesn't actually share the password with TPG and just uses it locally, there may well be a question of whether entering your password into a third-party app actually counts as sharing it with that party. How exactly would this be different from logging in from a web browser? It's just a different kind of user agent. Are Google, Mozilla, Microsoft, and Apple guilty of tortious interference simply because the software they release has access to your passwords on your own machine and could report them back? (For that matter, they even store the synced passwords on their servers, though in principle those are supposed to be private.)
Of course there could also be specific terms in the TOS against accessing the service with unapproved user agents, independent of any prohibition on sharing credentials.
1. Breach of Contract 2. Tortious Interference with a Contract 3. Unfair Competition by Misappropriation 4. Trespass 5. Trademark infringement 6. Dilution 7. Dilution under Texas State Law 8. False Designation of Origin 9. Copyright Infringement 10. CFAA 11. Violation of Texas Harmful Access by Computer Act 12. Unjust Enrichment
They specifically claim it's presenting something that is intentionally confusingly similar to an official AA logon screen, using copyrighted and trademarked AA content to harvest customer username and password info, does not prominently note its nonaffiliation with AA, and directly violates the TOS itself, as well as arguing tortious interference, copyright infringement, trademark infringement, trademark dilution, and violation of the CFAA and the Texas equivalent.
They also note a similar dispute settled with a separate firm owned by the same parent before TPG tried to negotiate a deal with AA for permission to access customer data for this purpose and was turned down.
edit: No it doesn't, brain fart.
That they're stealing AA's opportunity for, ahem, "customer engagement".
They may have a case for tortious interference, I think? It does seem like an uphill climb. The strategy might just be to punish TPG with attorneys' fees and discourage this practice in the future.
In this case an individual customer authorized the access and only their data was affected. This is a pretty common use case for a large category of apps – think Mint, Plaid, all wallet apps which organize and track different accounts.
Edit: typo. I wrote „pwn“ instead of „own“. But then I thought about it and somehow it feels appropriate.
The issue isn’t whether or not it’s what the customers wanted. The issue is that TPG wasn’t the party that entered into the agreement with AA when creating the account.
I’m not suggesting it’s right or wrong, but that’s the issue.
They think the data belongs to them. They think they own the consumers and all the data they generate. To them, it's their consumer data, something they abuse to get consumers to log in and look at ads, and they'll be damned if they let some random software have access to it no matter what the user wants.
GDPR makes it clear that personal data belongs to the person, these companies are just middle men who are temporarily processing it. Cases like this perfectly illustrate why such a change in perspective is necessary. Access to this data is a privilege and it can be revoked.
>Let's create a crappy experience that the user has to put up with, to squeeze a few extra cents per year.
Businesses are often the ones structuring agreements so that they can sue customers there and so that customers are forced to sue them there, so... that would seem counterproductive.
Even before considering the size of the market you’d be cutting off.
The same issue and battle is playing out for US healthcare as well due to the recent rule forcing hospitals to make prices public.
Although the airline industry can be considered a commodity industry, the airline rewards miles industry is less so. What those miles can get you can essentially change at any time if the airline says so.
This ranges from all of those StackOverflow scraping websites in your Google search results to companies that want to scrape Facebook for images and personal info to build a database of everyone.
Basically: Well-behaved and well-intentioned scraping bots are rare. You'd get a lot of users setting update rates to 60 seconds, doing a new login every time and creating as much traffic as 1000 users. Then they'd release the script for integration with something people use, and suddenly you have 1000 people each creating 1000 times as many login requests as a single user.
Another common problem was forgetting to implement reasonable back off for failures. A lot of newbies write scripts that immediately retry on a tight infinite loop whenever something goes wrong, sending a huge stream of requests to your server if the API changes or when it goes down. Again, multiply this by many users sharing a script and it becomes a problem.
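For anyone wondering what "reasonable back off" looks like in practice, here is a minimal sketch of exponential backoff with jitter. The function name `fetch_with_backoff` and all of its parameters are made up for illustration; the point is only that the wait grows after each failure and the script eventually gives up instead of looping forever.

```python
import random
import time


def fetch_with_backoff(fetch, max_attempts=5, base_delay=1.0, max_delay=60.0,
                       sleep=time.sleep):
    """Call fetch() until it succeeds, waiting longer after each failure.

    `fetch` is any zero-argument callable (hypothetical, supplied by the caller).
    """
    for attempt in range(max_attempts):
        try:
            return fetch()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # give up rather than hammering the server forever
            # Exponential backoff capped at max_delay, with random jitter so
            # many clients sharing one script do not all retry in lockstep.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))
```

With the defaults, a script that keeps failing makes at most five requests before stopping, rather than an unbounded stream.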
Then of course there are people trying to make a business out of extracting your company’s data, such as putting it in some other website where they can serve ads over your content or whatever (think of all of those StackOverflow scraping websites in Google)
Basically, you can’t investigate the motivations of each individual user. You just block them all.
What's illegitimate is that "attempting to ban programmatic access" is on the table as a legal redress.
The only way, from a technical moral standpoint, I could see that being remotely reasonable is if there was 1:1 feature and access parity with an API, then being able to legally force agents to use the API.
But critically that's 1:1 feature - if a user can do it, the API offers a method to do it.
And 1:1 access - if an unauthenticated user can do it, then mandating an account for API use is off the table. And if any user can do it, then any user will be approved for an API key.
Otherwise, it's just ceding more power to companies.
You don’t need to scrape stack overflow, you can just download a .zip
That’s one reason why people use it: they can’t just gate you off from the content you’ve created. You and others can (and will/do) have a copy of it all.
https://meta.stackoverflow.com/questions/295508/download-sta...
Why can't you just ignore API requests once they exceed a threshold rate?
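A rough sketch of what that could look like: a per-client token bucket. The class name, the numbers, and the idea of one bucket per account are all assumptions for illustration, not a description of what any real site runs.

```python
import time


class TokenBucket:
    """Allow roughly `rate` requests per second, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity
        self.updated = clock()

    def allow(self):
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # the server would typically answer HTTP 429 here
```

This handles the polite-but-heavy client cheaply. It does not help much with the cost of the rejected requests themselves, or with clients that spread load across many accounts.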
If I can do this by hand there's no legal reason I can't do it by machine. You can try to defend against it, I guess, but the second you start impacting your obligations to someone else (like disabling their account after they paid you) you are in the wrong.
but not if you write it yourself, apparently.
My biggest fear/risk was getting noticed and sued by the company, merely for letting users request HTML and display it differently than the company wished. The app is no longer available, so I guess I survived but the idea of $BIGCO crushing me with lawyers was chilling.
except we didn't write this platform, we paid for it, and it is licensed per 100 queries.
so when a single account starts doing 30k queries every hour, 24/7, the pocketbook is directly hit. and yes, web scraping was in the TOS that every account agreed to but never read.
> The LinkedIn dispute arose out of hiQ’s use of automated bots to scrape massive amounts of information from publicly available LinkedIn user profiles. Thus far, lower courts have sided with hiQ on grounds that certain information on the site is publicly available and could be accessed by the public without entering a password. [1]
There are similarities, but the context differs: in hiQ's case the information was publicly available, whereas in TPG's case the owner of the data (the AA customer) is giving TPG access to it. The customer could just as well copy, download, or screenshot the data and transfer it to TPG; obviously most people wouldn't bother. So the core of the argument should be: is a user allowed to make their data available to a third party? Screen scraping is a means to an end.
[1] https://news.bloomberglaw.com/us-law-week/supreme-court-scra...
Would Tech E&O cover something like this, or are there riders that would need to be added?
It seems like something that could be strongly defended, and Red Ventures (TPG owner) is a large conglomerate so I doubt legal funds are an issue.
Screen scraping is essentially interacting with the DOM to extract information. American Airlines can't conceivably attempt to limit programmatic interaction with the DOM, because that is a core component of how the web works, including accessibility tools such as screen readers and browser plugins/extensions.
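To make that concrete, here is a minimal sketch of pulling one value out of a page's markup using only the Python standard library. The HTML snippet and the `miles-balance` id are invented for the example and are not taken from any real airline page; note also that `html.parser` walks tags as events rather than building a full DOM, which is enough for this purpose.

```python
from html.parser import HTMLParser


class BalanceParser(HTMLParser):
    """Collect the number inside the element whose id is 'miles-balance'."""

    def __init__(self):
        super().__init__()
        self._capturing = False
        self.balance = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if dict(attrs).get("id") == "miles-balance":
            self._capturing = True

    def handle_data(self, data):
        if self._capturing:
            self.balance = int(data.strip().replace(",", ""))
            self._capturing = False


# Hypothetical markup, for illustration only.
SAMPLE = '<div><span id="miles-balance">12,345</span> miles</div>'

parser = BalanceParser()
parser.feed(SAMPLE)
print(parser.balance)
```

A screen reader or a browser extension does something structurally similar: it reads the same markup the browser received and presents it differently.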
I agree.
My faith would be that, upon appeal or enforcement, such a ruling would get overturned or dismissed.
I'm a much bigger fan of https://www.doctorofcredit.com/ for this reason.
This case is about data behind an auth screen, so it may not so easily fall under the definition of public stuff
So not even sure your comment applies here.
The original promise of client/server services was that the server would provide data in a universal open data format, and the USER AGENT (initially a web browser, but other kinds were expected) would process it into a format to the liking of the user, satisfying their needs.
Compare this to the current situation where the industry standard is that the servers do indeed provide data through somewhat standardized APIs, but the browser or native app is developed by the same vendor and serves their commercial interests, not those of the user as a customer. The only standard customization granted to users is light theme / dark theme, and even that only started a few months ago.
When it was built by geeks, it was difficult to use but it was meant to serve people. Sellers turned "serving people" into that other meaning.
Why would any company provide data in a standard API so that users can use that data in a standard way? If there was an API for banking, Bank of America wouldn't be able to showcase a professional-esque design and Sofi wouldn't be able to showcase a cartoonish modern design to strengthen their image and attract customers. How do they then attract customers? Only by features and lower margins, which is the opposite of what would make them money.
The idea was to have computers support users, not customers. These words have become synonyms but they have quite different implications. The stated goal was to augment the human intellect, for which you need a well organized corpus of knowledge (think Wikipedia, whose spectacular growth and supporting community came from not being a commercial initiative).
But nowadays the whole industry is focused around building products and services that can be packaged and sold, to the point that its professionals can't even think of any other possibilities when discussing the characteristics of the ecosystem.
Incentives are completely different; it's no wonder that interests of industry are misaligned with actual needs of the final users.
Unlike every other OECD country, the US does not have the English Rule.
https://en.wikipedia.org/wiki/English_rule_%28attorney%27s_f...
My understanding is that screen scraping is taking a picture of the rendered website and using OCR or some other sort of recognition tool to extract the data.
If it is just scraping - it should be perfectly legal right?
> A Settlement has been proposed in class action litigation against Plaid Inc. (“Plaid”). Plaid enables connections between a user’s financial account(s) and approximately 5,000 mobile and web-based applications (“apps”). This class action alleges Plaid took certain improper actions in connection with this process. The allegations include that Plaid: (1) obtained more financial data than was needed by a user's app, and (2) obtained log-in credentials (username and password) through its interface, known as Plaid Link, which the litigation alleges had the look and feel of the user’s own bank account login screen, when users were actually providing their login credentials directly to Plaid. Plaid denies these allegations and any wrongdoing and maintains that it adequately disclosed and maintained transparency about its practices to consumers.
Streaming services insert ads for other original content when you hit play.
The grocery store has audio and video advertisements for prepared meals playing on a loop that you cannot avoid if you walk past the meat department.
The problem is a race to the bottom on the pricing consumers perceive, and then recovering that money by squeezing every possible touch point.
People have to have enough options to choose a company that doesn’t have to make these compromises.
https://news.ycombinator.com/item?id=29989927
Plaid could file a friend-of-the-court brief here, since they have (presumably!) strong legal grounds to assert that they are legally within their rights to scrape bank websites, as they're doing so as an authorized user-agent, and since browsers are just user-agents, etc.
American Airlines is alleging 12 legal claims:
1. Breach of Contract 2. Tortious Interference with a Contract 3. Unfair Competition by Misappropriation 4. Trespass 5. Trademark infringement 6. Dilution 7. Dilution under Texas State Law 8. False Designation of Origin 9. Copyright Infringement 10. CFAA 11. Violation of Texas Harmful Access by Computer Act 12. Unjust Enrichment
All of my Amazon order emails now only tell me the order number and the total cost, with zero information on what products are included.
Y'all, think of the poor servers! They can't handle the traffic!!! /s
It seems like so many problems, annoyances and inconvenience in modern society are artificially created/maintained just to enable this disgusting industry. Imagine how more efficient things could be if this cancer was eradicated once and for all.
> Imagine how more efficient things could be if this cancer was eradicated once and for all.
Yes!! What I wouldn't give to see this entire industry banned from existence. It would solve so many problems it's ridiculous.
The industry has grown out of this human behavior, not vv.
[edit] Humans have a limited capacity for making purchasing decisions. This is why in 20+ years nobody has made a successful micropayments based product or platform. Further, simply paywalling software or services means 99% of people will not use it. People also are uncomfortable with a system that just withdraws money from their account to pay people - and a universal subscription model isn't particularly tenable due to the competing interests of market participants.
If you can solve this problem with something other than ads, I will be your first investor because you're going to be the richest human on earth.
It's easy to say "ads bad" - ok, but the real question is what are we going to replace them with?
Democracy be damned, there’s money to be made on dietary supplements!
Most people put the value of Google search on the order of thousands of dollars per year [1]. Imagine Google was a paid service: think of the academic disparity between wealthy students who had access to Google and poorer students who did not.
1. https://www.economist.com/graphic-detail/2018/04/25/how-much...
So yea, I have the next billion dollar idea. The gas gauge app. $3.99 a month to know how much farther you can travel. Also, we remove the gauge from inside the car.
Do you mean we should build a starship, Ark B, put all marketers into it, and send them all to a very distant planet?
HBO was a great example of this in the pre-streaming days.
"Want to see great shows and movies with no commercials? Pay the extra for cable + HBO and you won't have to see them again!"
This default, "it's all free BUT you have to see ads and there are no alternatives" seems to be the problem. The apps described in the article are effectively recognizing this consumer surplus/need and acting on it.
So the service needs to charge even more, which makes paying customers even more valuable in a feedback loop. The usual result is that eventually the service can’t help themselves and starts showing paying customers “a few” ads. Then the definition of a few gets larger and larger.
The only way would be if regulation either mandates a reasonably-priced ad-free tier (priced at the average revenue from an ad-viewing user) or other restrictions (GDPR but actually enforced, the website being liable for the ads it shows, etc) that would make advertising completely unprofitable.
> The apps described in the article are effectively recognizing this consumer surplus/need and acting on it.
Presumably, AA is pissed off because the people that value their time enough and have the skills to set up and use an alternative are people they'd very much want looking at their ads, much more so than the plebs who already use the website and see the ads.
"Industry could not benefit from its increased productivity without a substantial increase in consumer spending. This contributed to the development of mass marketing designed to influence the population's economic behavior on a larger scale.[24] "
Also, banning silicon-based eyes should be declared a massive, massive ADA violation.
We don't need to anthropomorphize HTML/JSON parsing as equivalent to human vision, and then declare any efforts to restrict that vision to be a violation of the Americans with Disabilities Act. Lol.
They have exactly zero right to dictate what I do on my computer. Their HTML is being rendered on my computer and I absolutely reserve the right to delete or modify elements in any way and for any reason. Their javascript is executing on my computer and I absolutely reserve the right to delete and modify functions if I deem necessary.
This seems to fit the definition of an unconscionable contract pretty well.
People don't like paying and they don't mind ads. Ads enable a lot of products and services that would otherwise not exist. They might be a net negative, but they certainly have at least some positives.
Edit to add: Do you really have to downvote, don't you have something better to do with your click?
"Unless otherwise noted, all information, AAdvantage® account information, articles, data, images, passwords, Personal Identification Numbers ("PINs"), screens, text, user names, Web pages, or other materials (collectively "Content") appearing on the Site are the exclusive property of American Airlines Group, Inc., or American Airlines, Inc., or their subsidiaries and affiliates"
"You may not copy, display, distribute, download, license, modify, publish, re-post, reproduce, reuse, sell, transmit, use to create a derivative work, or otherwise use the content of the Site for public or commercial purposes. Nothing on the Site shall be construed to confer any grant or license of any intellectual property rights, whether by estoppel, by implication, or otherwise."
Seems pretty cut-and-dry to me.
That would instead be the AAdvantage (AA's reward program) member, who agreed to the TOS originally, and who provided their login information to the TPG app so that it can scrape information about rewards etc.
So... the lawsuit from AA's side seems pretty bizarre, if the facts as presented in this article are true. If AA wanted to stop this, presumably they should sue their own rewards members who use the TPG app. But obviously that won't happen.
So fundamentally, this seems a case of whether the toolmaker is liable for an individual using their tool in a TOS-violating way.
Which seems pretty insane, if AA wins. If I pull open Chrome developer tools after logging into a website that requires me not to inspect its source, why would Google be liable?
---
And as a side note, "Because privacy and security" is quickly becoming the corporate anti-interoperability equivalent of "Think of the children."
The default should be that scraping is allowed.
If companies actually care about privacy and security, then they can offer an API and encourage access through it. But limiting scraping and not offering API access (or intentionally crippling it) is bullshit.
Cut and dried? A lawyer friend told me, informally, that if it went to court a younger judge would throw out my “contract” because it’s silly, but a more senior judge might well take the view that both contracts, my version and that of the web site, are equally silly, and both fall short of the “meeting of minds” standard.
What's interesting here is that there's conflicting precedent... and fundamentally that is what matters. hiQ v. LinkedIn is a great example of accessing data via a scraper that potentially violates the Terms of Service agreement, yet the courts have so far sided with hiQ and barred Microsoft/LinkedIn from blocking it. Against that, you have EF Cultural Travel v. Explorica and Facebook v. Power.com, where the scrapers lost. Speaking personally, I'd like clear and explicit rules about what is kosher to scrape and what isn't. Ticket bots are clearly problematic and deserve to burn in hell. Overly aggressive scrapers that incur load shouldn't get a free ride, but stuff like this that is initiated at the client's request and accesses solely the client's data.... I personally believe this should be fair use and would like to see that show up in the law somewhere.
Seems like any web browser by a for-profit company would immediately be in breach.
A web browser cannot be in breach, because a web browser is not a legal entity capable of being a party to an agreement. The entity in breach would be the person using the browser, if they were using it in a way that was against the TOS.
BTW, that TOS is ambiguous. I see two ways it can be parsed. First,
> You may not (copy, display, distribute, download, license, modify, publish, re-post, reproduce, reuse, sell, transmit, use to create a derivative work, or (otherwise use the content of the Site for public or commercial purposes))
I.e., "otherwise use the content of the Site for public or commercial purposes" is one item in the list of prohibited things. Second,
> You may not (copy, display, distribute, download, license, modify, publish, re-post, reproduce, reuse, sell, transmit, use to create a derivative work, or otherwise use the content of the Site) (for public or commercial purposes)
I.e., "otherwise use the content of the Site" is one of the list items, and "for public or commercial purposes" modifies the whole list.
If it is the latter it is saying you can do what you want if it is not for public or commercial purposes.
If it is the former, it is saying you may not do any of the explicitly listed things, and you can't do anything not listed if you are doing that thing for public or commercial purposes. You can only do things that are not explicitly listed and then only if they are private and non-commercial.
I'd guess they meant the latter, because under the former it is hard to see any way to use the site at all without violating the TOS. If that is the case, they should have written it as "You may not for public or commercial purposes <list of things>".
On the other hand, it wouldn't actually be all that surprising for a big company to write a TOS that technically prohibits their users from actually using the site, so who knows?
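If it helps, the two parses can be written out as predicates. This is only a toy model of the grammar question; the function names are made up, and it says nothing about how a court would actually read the clause.

```python
# The explicitly listed verbs from the quoted TOS sentence.
LISTED = {
    "copy", "display", "distribute", "download", "license", "modify",
    "publish", "re-post", "reproduce", "reuse", "sell", "transmit",
    "use to create a derivative work",
}


def prohibited_first_parse(action, public_or_commercial):
    """Qualifier attaches only to the final 'otherwise use' item."""
    if action in LISTED:
        return True  # listed actions are barred for any purpose
    return public_or_commercial  # anything else only if public/commercial


def prohibited_second_parse(action, public_or_commercial):
    """Qualifier attaches to the whole list."""
    return public_or_commercial  # every use is fine if private and non-commercial


# Privately displaying your own account page in a browser:
print(prohibited_first_parse("display", False))   # True
print(prohibited_second_parse("display", False))  # False
```

Under the first parse even private viewing is barred, which is why the second parse looks like the intended one.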
[0]: https://news.bloomberglaw.com/us-law-week/supreme-court-scra...
I can say in a written agreement with my workers that they are my slaves, or that they can not work anywhere else ever. They can even accept those terms, but that does not make it legal.
There are always fair use clauses that copyright law accepts. That data about a customer is the exclusive property of a company is not really true. In some way it is actually the property of the customer.
True, but there isn't a long standing practice of binding people to the terms and conditions of a service regardless of fastidiousness on the part of the user (whether that's warranted / reasonable or not).
An additional observation is that they're claiming ownership and exclusive rights over things which:
a) as broken and dystopian as our intellectual property laws are, it's not immediately apparent you can claim exclusive ownership of. Can you claim property rights on a PIN? On your customer's name? On your customer's phone number? Is data even ownable?
b) as above, even if you could, it's not apparent that the airline is the one with the greatest claim to that ownership. Does the airline own my name if I fly with them?
c) the issue with screen scraping may just be scale, automation and commercial value, and it's not apparent you can just wilfully ban competitors from that because you say so. Indeed, is it a violation if an individual uses the information within without screen scraping? cause a lot of those exclusions and terms would seem like they restrict the use of the website and the information for its actual intended purpose on an individual level, screen scraping or not.
d) what are we doing when we read a website but biological screen scraping?
" American Airlines also reserves the right in its sole and unfettered discretion to deny you access to the Site at any time. "
"You agree that this Agreement is made and entered into in Tarrant County, Texas. You agree that Texas law governs this Agreement's interpretation and/or any dispute arising from your access to, dealings with, or use of the Site, without regard to conflicts of law principles. Any lawsuit brought by you related to your access to, dealings with, or use of the Site must be brought in the state or federal courts of Tarrant County, Texas. You agree and understand that you will not bring against American Airlines Group, Inc., American Airlines, Inc., or any of its affiliated entities, agents, directors, employees, and/or officers any class action lawsuit related to your access to, dealings with, or use of the Site."
It could be nice, but it's unenforceable.
Your account information is owned by and proprietary to American Airlines. While you may access your account information through the Site, you may not give access to your account to any person or entity other than a member of your household or a person that you directly supervise as part of your career or employment. You may not give access to your account to any third party on-line service, including, but not limited to any mileage management service, mileage tracking service, or mileage aggregation service.
You must access your account information directly through the Site and not through a third party Website, including but not limited to any mileage management service, mileage tracking service, or mileage aggregation service. You also violate this Agreement if you enable an AAdvantage member to access account information without visiting the Site.
> Seems pretty cut-and-dry to me.
EULA for my hn comments: If you're reading this, you must grant me 'Droit du seigneur' and name your firstborn 'decebalus1'
If I'm understanding the situation correctly (which I may not be) it is AA rewards members who are using TPG's app to access the site. These people give the TPG app their AA login information so it can login to their AA account to get their information.
Arguably it is these AA rewards members who are scraping the site. TPG just supplied the tool those members are using. It would then be the AA rewards members who are the ones who have a contract with the site, not TPG.
That's not what the app was doing:
> The app ... had been ‘screen scraping’ accounts for members
So it wasn't "public or commercial", it was for people who had accounts to view/manage their account details.
Same as Plaid or Mint for banks, or (more generally) any old web browser for literally any website.
I view them as more of a steward of my data. They have permission from me to use it but the ownership should lie with the individual.