Cloudflare shares IP reputation data with partners like Google, coordinated through a program called the Bandwidth Alliance. So, my original offense might not even have been against Cloudflare. It might have received the reputation data from a partner, and it just propagated through the Bandwidth Alliance network.
That's not what Bandwidth Alliance is at all. It's about reducing or eliminating egress fees between a cloud provider and Cloudflare. Not sure where the idea that it's about sharing IP reputation data comes from.
https://www.cloudflare.com/bandwidth-alliance/
So, if Google Search started showing a CAPTCHA, that's not Cloudflare.
I've been gradually removing Cloudflare-based CDNs from services I develop and control because I don't want my users being arbitrarily discriminated against.
There was a good article posted on HN recently titled "The ideal level of fraud is non-zero" which I think is highly relevant here. In essence, any mechanism employed to prevent illegitimate use imposes a cost on legitimate users, and if that cost is too high it defeats the purpose. What's the point of a website that is completely immune to botnets but cannot be accessed by anyone else? Unplugging the ethernet cable also effectively protects against botnets. More subtly, the cost of outright rejecting some legitimate users is usually not worth the savings of rejecting 100% of illegitimate ones. I think Cloudflare's service has it the wrong way around: it currently accepts blocking legitimate users far too easily, and that is not an acceptable cost. You should instead let a higher level of bots through to avoid pissing off legitimate users. If it's not obviously a DDoS, it's probably worth the bandwidth cost.
Consider the bigger picture: if you save a sliver of a penny by blocking a bot, but also end up blocking or seriously inconveniencing 10 real users, is it worth it?
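To put rough numbers on that trade-off, here is a back-of-the-envelope sketch. Both dollar figures are made-up assumptions, not measurements from any real site; plug in your own.

```python
# All figures below are made-up assumptions for illustration only.
COST_PER_BOT_REQUEST = 0.0001   # dollars of bandwidth/CPU saved per blocked bot request
VALUE_PER_REAL_VISIT = 0.05     # dollars of expected revenue or goodwill per real visit


def net_benefit(bots_blocked: int, real_users_blocked: int) -> float:
    """Savings from blocked bots minus the value lost from blocked real users."""
    saved = bots_blocked * COST_PER_BOT_REQUEST
    lost = real_users_blocked * VALUE_PER_REAL_VISIT
    return saved - lost


# One bot blocked at the price of ten real users turned away.
print(net_benefit(bots_blocked=1, real_users_blocked=10))
```

With these assumed figures the result is negative: a filter would have to stop thousands of bot requests for every ten real users it turns away before it paid for itself.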
The space is in need of solid competitors to break the stranglehold they have on the internet. Whether it's the right combination of services, documentation, etc.
> The past few weeks I've been getting tons of redirects to verify my humanity before being allowed to view a webpage. Usually I just have to click the box that says human, not find all the ladders in a photo. SoFi is doing it every single time I log in. Petco, too, along with others who are more sporadic. This is happening with and without uBlock on. Same browser I've always used. ...
SoFi and Petco both use Cloudflare. I do exactly zero web crawling / scraping / abusive anything from my home connection.
I'm noticing a recent increase in volume of complaints about Cloudflare's human verification filter. I'm starting to wonder if they touched a dial.
I had already started pulling some infra back from Cloudflare after their last appearance in the tech news cycle. Now I've got an additional reason to continue doing that.
Essentially, for a broad class of web-based businesses, they have made themselves gatekeepers. I'm sure they'll find a profitable use for this position. Charging outright would look bad, but investing in businesses that just happen to not run into Cloudflare-based trouble, but whose competitors do...
I encourage those of you attempting to block Cloudflare to try and host your own website for a bit. Make sure you don't do it on a metered/paid connection. I know one eCommerce site with 1,300 employees that went bankrupt overnight thanks to the AWS bill (and lack of options to get back online, this was prior to companies such as CF). Bankruptcy as in the company filed for bankruptcy and no longer exists. They were profitable for a decade prior. One DDoS attack...
Also make sure you don't have a democratic opinion if you are in the US, like a 50 person manufacturing company. They were shut down completely thanks to saying a single wrong thing about Republicans. CF existed there, but they weren't aware thanks to not having IT folks. They were a non profit.
CF may be evil to some, but there is a reason they exist. I use CF. I don't like throwing money at them every month, however, many of my websites have also been attacked, usually via competitors. We can either deanonymize the internet or allow companies like CF to exist. There is really no other way.
It comes from the Cloudflare blog. https://blog.cloudflare.com/cleaning-up-bad-bots/
There’s a support page about it too. https://developers.cloudflare.com/bots/get-started/free/
Edit: team tells me this idea never got off the ground. Did talk with some potential partners (which did NOT include Google) but didn’t happen. So if Google was throwing CAPTCHAs it wasn’t because of our IP reputation.
Really?
How about _also_ pointing to a knowledge base article for how an end user could go about working out what network activity from their IP might be flagging Cloudflare’s systems?
One source of that would be a blog post on your company's website that was actually authored by you! Point 2 below:
>"Once enabled, when we detect a bad bot, we will do three things: (1) we’re going to disincentivize the bot maker economically by tarpitting them, including requiring them to solve a computationally intensive challenge that will require more of their bot’s CPU; (2) for Bandwidth Alliance partners, we’re going to hand the IP of the bot to the partner and get the bot kicked offline; and (3) we’re going to plant trees to make up for the bot’s carbon cost. [1]
So it's not such a far-fetched notion is it?
https://developers.cloudflare.com/firewall/recipes/block-ip-...
I was surprised to learn Cloudflare was born out of Project Honeypot, so I am guessing Cloudflare does share data with them:
That person should start with the assumption they haven't been misclassified and eliminate the possibility that a device on their network is compromised.
[1]: https://easydns.com/blog/2020/07/20/turns-out-half-the-inter...
We did have it working mostly fine for some time back in 2021 but haven't been able to since.
There are multiple open issues reporting this on the GH repo with no real follow-up from maintainers apart from maybe a "should be fixed, open again if still an issue".
ie https://github.com/privacypass/challenge-bypass-extension/is...
Probably from scam called mail blacklists
If two independent sites believe you are a bot, you or something at your address just might be.
It's only after it happened to him that now he's suddenly against it. Until he removes the same type of blocks from his own website I have absolutely no sympathy for him.
Let's read through that page for a second though:
> Drop support for obsolete HTTP versions
Doesn't seem like that's going to cause much issue for any legitimate client from the past 10-20 years. He only recommends blocking HTTP 0.9/1.0, which, fair enough.
> Append a #hash to the form’s action URL
Hah. Clever man. I don't see how this is going to stop any legitimate user from loading your website or submitting the form, but I can see how it might frustrate bots.
> Include a hidden prefilled form field
This is just standard practice to mitigate CSRF.
> Verify the Host and Origin request headers
Yes. You should be doing that.
> Set a test cookie and verify it gets included in the submission
Another CSRF trick.
> Swap the name attributes in the name and email fields
This one's a little user hostile to folks who use assistive devices like screen readers. But still won't prevent you from accessing the site in the first place.
> Verify the POST/Redirect/GET (PRG) chain
As noted by the author, might cause some issues but again, won't stop anyone from loading your website.
> Block ancient versions of common browsers
Alright please just don't do this. UA blocking is gross and might prevent access through specialist software. But he also calls this out himself:
> I strongly discourage you from blocking or discriminating against unknown or uncommon browser User-Agent request headers
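For the Host and Origin check mentioned in the list above, a minimal sketch of what a server-side test might look like. This is not the article author's code; the function name is made up, headers are assumed to arrive as a plain dict with canonical capitalization, and tolerating a missing Origin is a policy choice you may not want.

```python
from urllib.parse import urlsplit


def origin_matches_host(headers: dict[str, str]) -> bool:
    """Return True when the Origin header, if present, names the same host as Host."""
    host = headers.get("Host", "").strip().lower()
    if not host:
        return False  # HTTP/1.1 requests are required to carry a Host header
    origin = headers.get("Origin")
    if origin is None:
        # Assumption: tolerate a missing Origin, since some older browsers never send it.
        return True
    # An Origin of "null" or any other host yields a netloc that will not match.
    return urlsplit(origin).netloc.lower() == host
```

A form handler would call this before processing a POST and reject the submission when it returns False.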
All in all, with the exception of UA blocking I don't see how any of these mitigations would result in users not being able to access said website, or having their loading times drastically increased.
Using nginx as an example:
if ($server_protocol != HTTP/2.0) { return 403 'Nope'; }
Another thing I have found useful for dropping some bots is to become invisible to them. Many of the poorly written scanning tools do not properly set MSS for reasons I still don't understand. I use this to my advantage.
Using iptables as an example:
/sbin/iptables -t raw -I PREROUTING -i eth0 -p tcp -m tcp --tcp-flags FIN,SYN,RST,ACK SYN -m tcpmss ! --mss 420:16384 -j DROP
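For anyone who wants to see what that rule is matching on, here is a rough userspace sketch in Python. It is not how the kernel's tcpmss match is implemented; the function names are made up for illustration, and the 420:16384 range is simply copied from the rule above.

```python
from typing import Optional


def extract_mss(tcp_options: bytes) -> Optional[int]:
    """Walk the TCP options of a SYN and return the MSS value, or None if absent."""
    i = 0
    while i < len(tcp_options):
        kind = tcp_options[i]
        if kind == 0:                      # End of Option List
            break
        if kind == 1:                      # No-Operation, a single byte
            i += 1
            continue
        if i + 1 >= len(tcp_options):
            break                          # truncated option
        length = tcp_options[i + 1]
        if length < 2 or i + length > len(tcp_options):
            break                          # malformed option
        if kind == 2 and length == 4:      # Maximum Segment Size
            return int.from_bytes(tcp_options[i + 2:i + 4], "big")
        i += length
    return None


def would_drop(tcp_options: bytes, low: int = 420, high: int = 16384) -> bool:
    """Mirror the rule: drop SYNs whose MSS is missing or outside low..high."""
    mss = extract_mss(tcp_options)
    return mss is None or not (low <= mss <= high)
```

A SYN advertising an MSS of 1460 passes, while one with no MSS option at all is treated the same as an out-of-range value.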
Any TCP packets setting a very low or high MSS or missing MSS will be silently dropped. I drop about 35K packets per host per day on average. This also drops hping3 floods.
As long as you're using a <label> or aria-label attribute, that shouldn't be an issue.
(Author here.) If I remember correctly, his browser of choice predates the Origin header.
I just have no sympathy for Daniel since up until just now he was trying to get everyone to do this.
He also specifically called out CAPTCHA as user-hostile.
False positives happen. They happen a lot more than you think. And they are a serious problem. Even more serious when it's Cloudflare, but arguing for everyone to implement these algorithmic blocks "that won’t inconvenience users" individually, taken to its logical end, does the same.
> Bots often mimic the User-Agent of a common browser, but the version numbers used in the bots rarely change. Over time they drift farther and farther behind until a point (maybe two-year-old versions) where you can safely block them without inconveniencing legitimate users.
This supports the idea that browsers are subject to constant change and everyone should be forced to come along (rather than respecting and supporting standards). I have a Chromebook that stopped receiving updates some years ago (thank you for your very safe and sustainable product, Google!), and his heuristic would literally block me.
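To make the quoted heuristic concrete, here is a minimal sketch of a version-cutoff check. It is not taken from the article; the "current" version, the tolerated lag, and the function name are all hypothetical.

```python
import re

# Hypothetical numbers: the "current" major version and how far behind is tolerated.
CURRENT_CHROME_MAJOR = 120
MAX_VERSIONS_BEHIND = 15

_CHROME_RE = re.compile(r"Chrome/(\d+)\.")


def blocked_by_version_cutoff(user_agent: str) -> bool:
    """Return True when the UA claims a Chrome major version older than the cutoff."""
    match = _CHROME_RE.search(user_agent)
    if match is None:
        return False  # unknown or uncommon browsers are left alone
    return int(match.group(1)) < CURRENT_CHROME_MAJOR - MAX_VERSIONS_BEHIND
```

Under these assumed numbers, a device stuck on a User-Agent containing `Chrome/86.0.4240.198` is rejected exactly like a stale bot, which is the false positive described above.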
I have bad news about the most-likely fix for it, longer term, so we can lay off the IP-based reputation stuff and the geo-blocking: it's tying some form of personal ID to your browsing activity, so that bears the reputation instead of the address.
Sorry. Said it was bad news.
Basically, the core problem is digital identities (accounts, IPs, phone #s etc.) are cheap to create (even considering captchas and all) so fraud is easy. The solution could be just to make it "costly" to create new digital identities. For example, you could get a "verified but anonymous" identity issued by locking some assets (could be real world money, or maybe something intangible like community reputation) as collateral with a trusted party (or, for the crypto people, the blockchain). If you misbehave, you lose your reputation on that identity (and essentially your collateral) and have to start over. This lets anyone bootstrap a "minimal" level of trust at the beginning before they can use time to prove themselves trustworthy.
Note: This model might remind some of things like staking in crypto. However the idea is really not anything new... Putting money on the line is really how most low-trust bootstrapping happens.
*: To name a few: (1) this can result in participation being gated by wealth, which can be unfair. (2) it makes accounts more valuable to hack so people need better security practices [re: twitter checkmark]. (3) one would need some authority to decide how accounts lose their collateral or maybe the collateral is just burned to create that initial credibility...
We already use this model in practice. It's why so many services require a phone number verification now - they are hard enough to get en masse, especially if you block things like Google Voice. They even have a big advantage in that they are comparatively hard to hack, as the SIM card is effectively a weak form of physical security key.
I think the big problems this causes are discussed on HN quite often.
Similarly, the internet is already very difficult for people with limited means. This would make it even harder.
I've always thought that client certs would be an interesting solution to this problem. Any given certificate can carry signatures from multiple signing authorities, right? So we could imagine a world where there are many different certificate authorities, each of which has its own criteria for signing a particular certificate and each of which offers different varieties of assurance regarding the signature-holder's identity.
From here, the question of "should I allow the user identified by this client cert to use my service" simply becomes a question of 1.) checking the validity of the signatures of the client cert and 2.) deciding if the CA's criteria for signing certs aligns with my desired userbase.
For example, a particular CA might insist that their users go through some real-world process to renew their certification every few years, but when they sign a cert it means that the bearer has been strongly vetted as a real person.
An interesting side effect of this auth model is that a service provider accepting certs from a particular CA has someone to complain to if a user bearing their signature acts improperly on their platform. You could imagine a CA which has a code of conduct expected of the users whose certs they sign, and would perhaps revoke a user's certification if too many websites complain.
This also reminds me of the anxiety of Google deciding to just ban my account for some reason. They can't be bothered to commit resources to making sure mistakes can be resolved. They don't care to lose a fleetingly small percentage of customers.
Not sure I have an answer. Just a thought.
I'm not understanding the generalized sentiment here. How would, for example, a retailer benefit from this strategy? How does it protect their bottom line?
I can see how a particular kind of "facilitated user economy," such as games, gambling and promotional companies could benefit, but it doesn't seem that broadly applicable to what most people would consider a "mainstream" business.
> so we can lay off the IP-based reputation stuff and the geo-blocking: it's tying some form of personal ID to your browsing activity
And a new market for identity theft is born.
Also, as someone who serves content and geo blocks it, that's not up to me, that's up to the owner of the content or whoever happens to be licensing it for them. So, even if you sent me a picture of your government ID, it changes nothing.
The amount of automated and apparently-manual attempted credit card fraud (and exploit attempts, for that matter) any halfway-prominent site with a CC form is subjected to is hard to appreciate if you've never seen it. It's a whole lot. They aren't even necessarily trying to buy what you have, but to validate that their stolen cards work. And they're quite busy. If too much of that gets through—really, any more than a very tiny amount of it gets through—you're gonna have an extremely bad time.
Various CC service providers like Stripe do provide tools to try to block those attempts, but defense in depth is usually a very good idea, including fairly aggressive firewall-level blocking.
A couple of examples I can think of are blocking bots from scraping their site for pricing and details, and stopping resellers from buying up all of the stock (see sneakers, electronics, etc). The last example doesn't directly impact their bottom line, but it will make customers go elsewhere.
That wouldn't just be bad news, it would be disastrous news. It would immediately render the entirety of the web worthless to me.
And once your spammer has been identified then that's them banned/removed, unable to sign up again.
A few months ago I got on Akamai's naughty list (with my other ISP) for some very light automated website downloading. That was a straight block with HTTP errors and I had to use a proxy to access the Web. It cleared up after a few days.
The lack of any user feedback or support for this situation is really annoying. Reminds you how much power the CDNs have. It'd be really bad if loading websites got as difficult as sending email through all the layers of spam filtering.
I've been noticing this too, and it's why Starlink remains my secondary ISP/bulk transfer connection. If I had to drop one connection, I'd drop Starlink for this reason alone.
There are some sites that I simply can't browse, and it's not Cloudflare errors, either. Lowes, in particular, simply returns error pages for anything but the main landing page on a regular enough basis. Of course, my observed public IP changes so it's not consistent, but it's genuinely annoying.
Could Cloudflare legally charge them a bribe to captcha their users less? It isn't good to have a company in this position of power if so.
Why are you using Starlink at all if you have other options?
Edit: spelling
Starlink is at least 10x better (fewer captchas).
I'm really hoping Cloudflare gets busted for having backroom deals with big ISPs or something. (For instance, if the CGNAT had a Cloudflare CDN cache endpoint behind / associated with it, I suspect the IP would be whitelisted.)
I've said it before and I'll say it again: Cloudflare is dividing the world into first and second/third world countries with their captchas. I call it discrimination against second/third world countries! If you are from the US or Europe you will never notice it, but if you travel a little bit more you see these blocking captchas everywhere.
I do these things as well. It’s been months since I’ve seen a CloudFlare challenge page.
You would be surprised to see how easy it is to hack domestic routers.
1. Find and disinfect the devices, including the router. If you don’t have enough technical knowledge, then buy a new router.
2. Use a 30-character random password on the router.
3. Disable UPnP.
4. Anything with Wi-Fi and a weak password can be hacked within minutes, so check your other devices as well, especially IoT ones.
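For step 2, a quick way to generate such a password with Python's standard library. The character set is an assumption; some router firmware rejects certain punctuation, so trim the alphabet if yours complains.

```python
import secrets
import string


def random_password(length: int = 30) -> str:
    """Build a password from a cryptographically secure random source."""
    # Assumption: the router accepts letters, digits and ASCII punctuation.
    alphabet = string.ascii_letters + string.digits + string.punctuation
    return "".join(secrets.choice(alphabet) for _ in range(length))


print(random_password())
```

Store the result in a password manager rather than on a sticky note on the router.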
Tarpitting (serving content slowly from the edge, in order to slow down bots) is necessarily one of the most expensive tools in a WAF/CDN's toolbox.
It's much more likely that something on his network is sending sketchy traffic to CF-fronted/Google sites, and the slow loading he's experiencing elsewhere is because his upstream is being saturated by whatever is happening on his network.
https://google.com/search?q=mikrotik+botnet
These things are the absolute scourge of the internet.
UPnP allows devices inside your network to open ports to the outside world without your knowledge. I think everyone should avoid it if they can get by without it.
Spoiler alert: many websites simply refuse to load at all (e.g. any google service, and lots of websites "protected" by CF). Captchas are everywhere: in many cases, you can't even complete simple GETs of blogs without donating free labor to CF.
And the most infuriating part, you get CF marketing messages right in your face while your browser is calculating hashcash (I guess?)... At this point I can recognize every single one of them: something about bots making up 40% of all internet traffic, something about their web scraper protection racket, something about small businesses (???), etc etc...
To be fair, Tor exit nodes have an awful reputation for sure. Nevertheless, I have a hard time forgiving how CF makes browsing the Internet hell for those who actually need Tor.
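For readers who haven't met hashcash, which is only guessed at above as the thing the challenge page computes, here is a minimal hashcash-style proof-of-work sketch. It says nothing about what Cloudflare's challenge actually does; the use of SHA-256, the `challenge:nonce` string format, and the function names are all assumptions made for illustration.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def verify(challenge: str, nonce: int, difficulty_bits: int) -> bool:
    """Cheap check done by the server: one hash, compare leading zero bits."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty_bits


def solve(challenge: str, difficulty_bits: int) -> int:
    """Expensive search done by the client: try nonces until one verifies."""
    for nonce in count():
        if verify(challenge, nonce, difficulty_bits):
            return nonce
```

The asymmetry is the point: the client needs about 2^difficulty_bits hashes on average, while the server verifies with a single one.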
Yeah, there's something amazingly aggravating about CF telling you how much traffic is bots while showing that they can't distinguish you from a bot.
The fact that humans are seeing the traffic meant for bots is an unfortunate side-effect.
I personally welcome our future bot overlords (not only because being unwelcome might be unhealthy for me — why would I publicly disagree with an overlord or not want to be their friend?).
I know they haven't done anything like that yet. But the technical capability is there, and we all know how short the distance is between technical capability and doing it, when the appropriate pressure is applied. So I wonder how long before activists start demanding that CF boot people from the internet, and how long before CF caves in to that...
Fact-less conspiranoia.
The CIA has the operators, equipment, and info to be able to kill almost any US citizen in a couple of hours for arbitrary reasons. How many times have they done it?
You are overweighing how much technical capability factors in and very much underweighing the costs of doing something like that. Opportunity costs, collateral damage, unintended consequences, reputation costs, brand harm.
Hell even ethics and morals of those involved. Who do you know would want to work for a company that did that? Who do you know would program that feature and not say anything about it? Why do you believe that CloudFlare would have so many of those kinds of people working there, but you know so few?
Why not make the same complaint about your ISP, your hardware manufacturer, your OS manufacturer? You have exactly the same amount of evidence they are doing this or could do this.
Remember that US criminal system attributes 3 elements to a crime: {means, motive, and opportunity} and even then we use evidence and an assumption of innocence. You just threw out every part except “means”.
I’m not defending CloudFlare here so much as tired of conspiracy theories and paranoia and social panics. We have enough of those things right now.
Pretty much anyone who works for Twitter, Facebook, Google, Paypal, Venmo, Amazon, Microsoft, Gofundme, Mailchimp, Tiktok, Reddit, Nextdoor, and many other tech companies routinely engaging in censorship and unpersoning. The idea that people in tech are some kind of high-morals freedom lovers who would never work for a company that censors doesn't withstand even minimal scrutiny. If anything, they'd refuse to work for a company that doesn't censor enough - Twitter workers were in utter screaming panic when they thought Musk could buy Twitter and relax the censorship a bit. So if anything you've just disproven your own argument - maybe what will force CF to censor is not external pressure but internal pressure. I don't see why Cloudflare workers would be any better than Twitter ones.
How would we know?
I love how people reflexively answer with cries of "no evidence!" to something that presents the evidence about exactly the thing they are claiming has no evidence. I get a distinct impression that the only person they're trying to convince is themselves, by self-hypnotically denying the reality in public.
There's the fact of CF booting sites, the fact of CF having an IP blacklist, the fact that getting onto an IP blacklist is a very frustrating experience, and the fact of various activists itching to make the lives of their political enemies very unpleasant and launching successful pressure campaigns to do exactly that.
Did that happen with CF and IP blocking? No, I explicitly said it didn't, at least - I don't know any cases of it. But there's a lot of facts confirming there's a capability and motivation to do so. You may not believe it would happen, and you have a right to believe so, but when you are denying known facts, I don't think your beliefs are based on anything but wishful thinking. Your argument would be strong if you showed that, despite the known facts, it still couldn't happen. But instead to claim it couldn't happen you have to deny the facts.
> How many times have they done it?
Probably more than I know, but it's too big to bother with me, so I'm not too concerned about it right now. Maybe if I was in the same business as Assange, I'd be worried more.
> very much underweighing the costs of doing something like that.
Like what costs? You mean to say, no major provider would dare to boot a person from the Internet? Like Facebook, Twitter, Paypal, Venmo, Gofundme, Google, Amazon, Microsoft, Mailchimp, Tiktok, etc. would not dare to block people for political dissent and expressing unpopular opinions? Because, you know, opportunity costs, collateral damage, unintended consequences, reputation costs, brand harm. That just couldn't happen. All that is fact-less conspiranoia.
> Why not make the same complaint about your ISP, your hardware manufacturer, your OS manufacturer
I can buy different hardware. I can install a different OS. With some effort, I can connect to a different ISP. None of that will help if Cloudflare refuses to talk to me.
> Remember that US criminal system attributes 3 elements to a crime
Oh, but that's not a crime. That's the beauty of it - remember, it's a private action of a free enterprise, and you have no rights there. And even if the government held weekly meetings with Cloudflare suggesting to them who exactly needs to be banned, it's still free enterprise, right? I mean, excluding the fact that the government would never do something like that, because reputation costs, brand harm, etc. That's another instance of fact-less conspiranoia, of course.
> I’m not defending CloudFlare here so much as tired of conspiracy theories and paranoia and social panics.
That's nothing. Imagine how tired you'd be when it turns out everything you thought is "paranoia" is actually happening. Of course, it would never happen to you - you'd never disagree with the government, or any people in power, or voice any unpopular opinions in public, would you now?
I can understand people's gripes about things on the free/cheap packages, where Cloudflare makes decisions for you, sometimes ones you don't like.
But as an enterprise customer, I've never found it to be anything short of fantastic - I can tailor it to behave exactly how I want, and not interfere with my customers.
Or are you suggesting that if you're having trouble visiting sites because of Cloudflare, you should become an enterprise customer? (slightly sarcastic, but not completely)
Once the IP address I don't own is released and assigned to some other router how do you think CF determines the new IP address for the individual/home? Unless this person is running the CF Dynamic DNS service which gives CF the IP address, I'm not sure CF would have any reasonable validation techniques to determine who is what given the size of residential networks.
[1] https://www.derstandard.at/story/1389860104020/eu-gerichtsho...
1) refuse to take responsibility for content they host by claiming they don't host
2) discriminate against huge parts of the Internet with no publicly known rules, nor methods to change that discrimination
3) make the abuse reporting process intentionally difficult and time-consuming
4) want to aggregate all the DNS data they can by making a deal with Firefox to turn on DNS-over-https by default without asking or even informing end users
5) want to re-centralize the Internet, in part so they can mix bad actors with good, in ways that make blocking next to impossible
How many of them do the discrimination we're all writing about here?
I get that this sucks for the end user, but I wonder how much we should blame Cloudflare vs the wider systemic challenges of managing DDoS protection on the web.
I'm guessing: having an IP address close to (or outright reused from and thus identical to) someone malicious, whom you know nothing about.
Another cause could be that you have some malware on your network that is attacking sites.
The author could be infected with something that wakes up sporadically (perhaps on external command) to participate in an attack, and then returns to dormancy. So the Cloudflare block being lifted isn't necessarily evidence that all is well.
It’s against copyright laws too, unless you get the right holders’ go-ahead first.
Some regional differences, but it’s mostly not allowed with a few exceptions for some institutions.
Some VPN exit addresses have obviously been flagged as "bad" by Cloudflare and I get challenged with CAPTCHAs from some countries. It's an interesting experience, but luckily my VPN provider has enough exits that I can usually switch to one that has better reputation with Cloudflare.
Obviously, none of this is helping the internet be a better place from my point of view. I get that it's part of the ongoing fight against bots and spam, but it always feels so arbitrary. IP addresses are interchangeable, folks - they say nothing about the nature of the request. Or rather, for a large majority they do, but there's a minority of us who don't fit those rules and resent getting caught up in it.
They have the unsupervised power to destroy people's lives.
But for me, the biggest problem is not companies of this size doing this; nowadays that is completely expected. The biggest problem is people, including developers, who think they are right and have the right to do it.
If not, figure out how to get a new one and see if the blocking recurs. If it does, the bad activity is probably coming from inside the house -- or CloudFlare has a way to identify you across an IP change.
PS Proud user of Firefox + resistFingerprinting=true
PPS Ain't nothing better than CF guard page constantly-reloading on 20% of sites if you open some url :( No, fella, you first have to open the root '/' page so that guard page finally can either pass me through or show the cloudflare captcha. Ugh. Progress, they say.
1. Packet routing
In other words, I wish services like Cloudflare were made illegal.
I'm from Hong Kong and suspect the whole territory is on the naughty list.
I have absolutely no sympathy for website owners who are depending on a free service.
Why would anyone need to know what they did "wrong"?^1
That would mean they could correct their choice of software and usage patterns to conform with what Cloudflare believes is "right".
IMO, anti-"bot" (anti-automation) measures can effectively identify
(a) computer users with something of value to offer to website operators (usually, that means personal data/metadata at no cost), as distinct from
(b) computer users who
(i) do not send personal data to and/or generate metadata for website operators that indicates the user has something of value, and/or
(ii) are not using the software preferred by website operators (usually, that means software that requires interactive use that generates behavioural metadata for website operators).^2
The measures taken by Cloudflare presume any automation by computer users is "wrong".^3 Meanwhile, websites and CDNs are free to use automation however they wish.
Of course automation can be used in a way that poses threats to websites. It can also be used in ways that do not pose any threats. The "protection" measures used by Cloudflare cannot distinguish between the two. IMO, these measures are deemed acceptable by website operators (Cloudflare's customers) because computer users blocked accidentally are more likely to be in category (b) not (a).
A more egalitarian approach IMO would be to publicise detailed rules for website usage. That is, provide an explanation of what the website operator/CDN considers "wrong". For example, a list of permitted software, the maximum allowable number of requests in a given time period, etc. IMO, the rules would likely be incriminating. For example, they might be discriminatory and/or anti-competitive and subject to legal challenge.
1. Another interesting question is why the website operator/CDN believes it is "wrong". It appears the OP's www usage is not posing any "threat" to Cloudflare's customers or partners. As such, there is no justification for blocking the OP.
2. For example, software that sends personal data to website operators, software that prominently displays advertising, software that provides "payment handlers", and so on.
3. Forcing computer users to choose "interactive" software that is generally unsuitable for automation, such as popular web browsers or mobile apps.
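As an illustration of what a published rule such as "the maximum allowable number of requests in a given time period" could look like when written down precisely, here is a minimal sliding-window sketch. The class name and the example limits are hypothetical; nothing here reflects how Cloudflare or any particular website actually counts requests.

```python
from collections import deque


class SlidingWindowLimit:
    """Allow at most max_requests per client within any window_seconds span."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._seen: dict[str, deque[float]] = {}

    def allow(self, client: str, now: float) -> bool:
        """Record a request at time `now` and report whether it is within the rule."""
        stamps = self._seen.setdefault(client, deque())
        # Forget requests that have aged out of the window.
        while stamps and now - stamps[0] >= self.window_seconds:
            stamps.popleft()
        if len(stamps) >= self.max_requests:
            return False
        stamps.append(now)
        return True


# Hypothetical published rule: 2 requests per 10 seconds per client.
limit = SlidingWindowLimit(max_requests=2, window_seconds=10.0)
```

The caller supplies `now` (for example from `time.monotonic()`), which keeps the rule deterministic and easy for a user to check against their own request log.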
However, this does not apply if:
"is necessary for entering into, or performance of, a contract between the data subject and a data controller;"
Cloudflare would therefore perhaps claim that this is "necessary".
[1] https://cloud.google.com/blog/products/identity-security/how...