As a user I want the agent to be my full proxy. As a website operator I don’t want a mob of bots draining my resources.
Perhaps a good analogy is Mint and the bank account scraping they had to do in the 2010s, because no bank offered APIs with scoped permissions. Lots of customers complained, and after Plaid made it big business, eventually they relented and built the scalable solution.
The technical solution here is probably some combination of offering MCP endpoints for your actions, and some direct blob store access for static content. (Maybe even figuring out how to bill content loading to the consumer so agents foot the bill.)
IMO it's not worth solving anyways. Why do sites have CAPTCHA?
- To prevent spam, use rate limiting, proof-of-work, or micropayments. To prevent fake accounts, use identity.
- To get ad revenue, use micropayments (web ads are already circumvented by uBlock and co).
- To prevent cheating in games, use skill-based matchmaking or friend-group-only matchmaking (e.g. only match with friends, friends of friends, etc. assuming people don't friend cheaters), and make eSport players record themselves during competition if they're not in-person.
What other reasons are there? (I'm genuinely interested and it may reveal upcoming problems -> opportunities for new software.)
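For the proof-of-work option in the list above, here is a minimal hashcash-style sketch. The function names, the challenge format, and the difficulty value are all assumptions made for illustration, not any particular product's scheme. The idea is that the client burns roughly 2**difficulty hashes while the server checks the result with a single hash.

```python
import hashlib
import secrets


def make_challenge() -> str:
    """Server side: issue a random challenge string."""
    return secrets.token_hex(16)


def leading_zero_bits(digest: bytes) -> int:
    """Count the leading zero bits of a hash digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        # For a non-zero byte, count the zero bits above its highest set bit.
        bits += 8 - byte.bit_length()
        break
    return bits


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash is enough to check the client's work."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force a nonce (about 2**difficulty hashes on average)."""
    nonce = 0
    while not verify(challenge, nonce, difficulty):
        nonce += 1
    return nonce
```

Raising the difficulty by one bit doubles the expected client cost, which is the knob a site would turn for suspicious traffic.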
The sign up form only serves to link saved state to an account so a user could access game history later; there are no gated features. No clue what they could possibly gain from doing this, other than to just get email providers to all mark my domain as spam (which they successfully did).
The site can't make any money, and had only about 1 legit visitor a week, so I just put a Cloudflare captcha in front of it and called it a day.
These situations will commonly be characterized by: a hundred billion dollar company's computer systems abusing the computer systems of another hundred billion dollar company. There are literally existing laws which have things to say about this.
There are legitimate technical problems in this domain when it comes to adversarial AI access. That's something we'll need to solve for. But that doesn't characterize the vast majority of situations in this domain. The vast majority of situations will be solved by businessmen and lawyers, not engineers.
Of course people can fake it, just as they fake other kinds of ID, but it would at least mean that officially sanctioned agents from OpenAI/etc would need to identify themselves.
I suggest you go ahead and make these; you'll make a boatload of money!
Proof of work? Bots are infinitely patient and scale horizontally, your users do not. Doesn't work.
Micropayments: No such scheme exists.
I will bet $1000 on even odds that I am able to discern a model from a human given a 2 hour window to chat with both, and assuming the human acts in good faith.
Any takers?
So I think maybe that is a partial answer: anti-AI barriers being simply too expensive for AI spamfarms to deal with, you know, once the bottomless VC money disappears?
It's back to encryption: make the cracking inordinately expensive.
Otherwise we are headed for de-anonymization of the internet.
The entire distinction here is that as a website operator you wish to serve me ads. Otherwise, an agent under my control, or my personal use of your website, should make no difference to you.
I do hope this eventually leads to per-visit micropayments as an alternative to ads.
Cloudflare, Google, and friends are in a unique position to do this.
While this is sometimes the case, it’s not always so.
For example Fediverse nodes and self-hosted sites frequently block crawlers. This isn’t due to ads, rather because it costs real money to serve the site and crawlers are often considered parasitic.
Another example would be where a commerce site doesn’t want competitors bulk-scraping their catalog.
In all these cases you can for sure make reasonable “information wants to be free” arguments as to why these hopes can’t be realized, but do be clear that it’s a separate argument from ad revenue.
I think it’s interesting to split revenue into marginal distribution/serving costs, and up-front content creation costs. The former can easily be federated in an API-centric model, but figuring out how to compensate content creators is much harder; it’s an unsolved problem currently, and this will only get harder as training on content becomes more valuable (yet still fair use).
Agree it will become a battleground though, because the ability for people to use the internet as a tool (in fact, their tool’s tool) will absolutely shift the paradigm, undesirably for most of the Internet, I think.
Hard for me to see how it’s ethical to force your customers to do tons of menial data entry when the orders are sitting right there in json.
I want to be able to automate mundane tasks but I should still be confirming everything my bot does and be liable for its actions.
There's likely a correlate with AI here: If I run OpenTable, I wouldn't want my relationship with my customers to always be proxied through OpenAI or Siri. Even the App Store is something software businesses hate, because it obfuscates their ability to deal directly with their customers (for better or worse). Extremely few businesses would choose to do business through these proxies, unless they absolutely have to; and given the extreme competition in the AI space right now, it feels unlikely to me that these businesses feel pressure to be forced to deal with OpenAI/etc.
[1] https://www.cnbc.com/2025/07/28/jpmorgan-fintech-middlemen-p...
We got hit by human verifiers manually war dialing us, and this is with account creation, email verification, and a captcha in place. I can only imagine how much worse it'll be for us (and Twilio) to do these verifications.
LLMs are much more expensive to run than any website is to serve, so you should not expect a mob of bots.
Also keep in mind that these no-interaction captchas use behavioral data collected in the background. Plus you usually have sensitivity levels configured: depending on your use case, you might want the user to prove they are not a bot, or it might be good enough that they just don't provide evidence of being one.
Bypassing these no-interaction captchas can also be purchased as a service; they basically (AFAIK) reuse someone else's session for the bypass.
Just my 2 cents, obviously lawmakers and jurisdiction may see these issues differently.
I suppose there will be a need for reliable human verification soon, though, and unfortunately I can't see any feasible technical solution that doesn't involve a hardware device. However, a purely legal solution might work well enough, too.
If your site is not monetized by ads then having an LLM access things on the user's behalf should not be a major concern it seems. Unless you want it to be painful for users for some reason.
Human identity verification is the ultimate captcha, and the only one AGI can never beat.
No trouble at all. Barely an inconvenience.
This is both inevitable already, and not a problem.
When we do that, it opens up solutions which are far more privacy conscious and resistant to abuse. (For example, being blocked from signing up for new accounts because somebody in the federal government doesn't like an op-ed you wrote.)
And then there's Worldcoin, which is universally hated here.
There are some very real and obvious downsides to this approach, of course. Primarily, the risk of privacy and anonymity. That said, I feel like the average person doesn't seem to care about those traits in the social media era.
That, to me, seems like it could be the foundation of a new web. Something like:
* User-agent sends request for such-and-such a URL.
* Server says "okay, that'll be 5 tokens for our computational resources please".
* User decides, either automatically or not, whether to pay the 5 tokens. If they do, they submit a request with the tokens attached.
* Server responds.
People have been trying to get this sort of thing to work for years, but there's never been an incentive to make such a fundamental change to the way the internet operates. Maybe we're approaching the point where there is one.
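The four steps above can be sketched as a tiny in-memory model. Everything here is hypothetical: the class names, the `price` and `max_price` fields, and the token accounting are assumptions for illustration. The only real-world piece is HTTP status 402 (Payment Required), which the HTTP spec reserves for future use.

```python
from dataclasses import dataclass

PAYMENT_REQUIRED = 402  # real HTTP status code, reserved for future use
OK = 200


@dataclass
class Response:
    status: int
    body: str = ""
    price: int = 0  # hypothetical: tokens the server is asking for


@dataclass
class Server:
    prices: dict[str, int]
    earned: int = 0

    def handle(self, url: str, tokens: int = 0) -> Response:
        price = self.prices.get(url, 0)
        if tokens < price:
            # Quote the price instead of serving the content.
            return Response(PAYMENT_REQUIRED, price=price)
        self.earned += price
        return Response(OK, body=f"content of {url}")


@dataclass
class UserAgent:
    balance: int
    max_price: int  # the user's automatic per-request spending limit

    def fetch(self, server: Server, url: str) -> Response:
        first = server.handle(url)
        if first.status != PAYMENT_REQUIRED:
            return first
        # Decide automatically whether the quoted price is acceptable.
        if first.price > self.max_price or first.price > self.balance:
            return first
        self.balance -= first.price
        return server.handle(url, tokens=first.price)
```

The "either automatically or not" part lives in `max_price`: anything above it would be bounced back to the human to decide.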
I would maybe go in the direction to say that the wording “I’m not a robot” has fallen out of time.
As for a solution, it's the same as for any automated thing you don't want (bots, scrapers). You can implement some measures but are unlikely to 'defeat' the problem entirely.
As a server operator you can try to distinguish them, and users will just find ways around your detection of whether it's automation or not.
Bot: one press of the trigger => automatic firing of bullets
so charge for access. If the value the site provides is high, surely these mobs will pay for it! It will also remove the mis-incentives of advertising driven revenues, which has been the ill of the internet (despite it being the primary revenue source).
And if a bot misbehaves by consuming inordinate amounts of resources, rate limiting them with increasing timeouts or limits.
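A rough sketch of what "increasing timeouts" could look like: a per-client limiter whose lockout doubles each time the client goes over the limit again. The class name, the parameters (`limit`, `window`, `base_timeout`), and the doubling rule are assumptions for illustration, not how any particular server does it.

```python
import time
from typing import Optional


class BackoffLimiter:
    """Per-client limiter whose lockout doubles on each repeated violation."""

    def __init__(self, limit: int, window: float, base_timeout: float):
        self.limit = limit                # requests allowed per window
        self.window = window              # window length in seconds
        self.base_timeout = base_timeout  # first lockout length in seconds
        self.state = {}  # client -> [window_start, count, strikes, blocked_until]

    def allow(self, client: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        start, count, strikes, blocked_until = self.state.get(
            client, [now, 0, 0, 0.0]
        )
        if now < blocked_until:
            return False
        if now - start >= self.window:
            start, count = now, 0  # start a fresh window
        count += 1
        if count > self.limit:
            strikes += 1
            # Lockout grows exponentially: base, 2*base, 4*base, ...
            blocked_until = now + self.base_timeout * 2 ** (strikes - 1)
            self.state[client] = [now, 0, strikes, blocked_until]
            return False
        self.state[client] = [start, count, strikes, blocked_until]
        return True
```

Strikes never decay in this sketch; a real deployment would probably want to forgive clients that behave for a while.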
We put a captcha there, because without it, bots submit thousands of spam contact us forms.
I have to guess that there are people in this boat right now, being disabled by these things.
I’ve seen this in past and present. Google’s “click on all the bicycles” one is notoriously hard, and I’ve had situations where I just gave up after a few dozen screens.
Chinese captchas are the worst in this sense, but they’re unusual and clearly pick up details which are invisible to me. I’ve sometimes failed the same captcha a dozen times and then saw a Chinese person complete the next one successfully on a single attempt, on the same browser session. I don’t know if they measure mouse movement speed, precision, or what, but it’s clearly something that varies per person.
It is hard because you need to only find the bicycles people on average are finding.
A few dozen?? You have much more patience than me. If I don't pass the captcha first time, I just give up and move on. Life is too short for that nonsense.
Hollywood has gotten hate mail since the 70s for their lack of science research in movies and shows. The big blockbuster hits actually spent money to get the science “plausible”.
Sidney Perkowitz has a book called Hollywood Science [0] that goes into detail into more than 100 movies, worth a read.
[0] https://cup.columbia.edu/book/hollywood-science/978023114280...
https://en.wikipedia.org/wiki/Fruit_machine_(homosexuality_t...
The stakes for men subjected to the test were the loss of their livelihoods, public shaming, and ostracism. So... Blade Runner was not just predicting the future, it was describing the world Philip K. Dick lived in when he wrote "Do Androids Dream of Electric Sheep" in the late 1960s.
As you say, they are also getting increasingly difficult. Click the odd one out, mental rotations, what comes next, etc. - it sometimes feels like an IQ test. A new type that seems to be becoming popular recently is a sequence of distorted characters and letters, but with some more blurry/distorted ones, seemingly with the expectation that I'm only supposed to be able to see the clearer ones and if I can see the blurrier ones then I must be a bot. So what this means is for each letter I need to try and make a judgement as to whether it's one I was supposed to see or not.
Another issue is the problems are often in US English, but I'm from the UK.
It could also be that everything was working as intended because you have a high risk score (eg. bad IP reputation and/or suspicious browser fingerprint), and they make you do more captchas to be extra sure you're human, or at least raise the cost for would-be attackers.
And the reason for stranding is probably because the AI crew on it performed a mutiny.
One early example of this line of thinking: https://world.org/
Skyrocketing complexity actually puts the web at risk of disruption. I wouldn’t be surprised if a 22 year old creates a “dumb” network in the next five years—technically inferior but drastically simpler and harder to regulate.
Long time ago I saw a post where someone running a blog was having trouble keeping spam out of their comments, and eventually had this same idea. The spambots just filled out every form field they could, so he added a checkbox, hid the checkbox with CSS, and rejected any submission that included it. At least at the time it worked far better than anything else they'd tried.
It worked almost 100% of the time. No need for a CAPTCHA.
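The server side of that hidden-checkbox trick is about as small as it gets. A minimal sketch, assuming the form data arrives as a dict and the hidden field has the (made-up) name below:

```python
HONEYPOT_FIELD = "subscribe_newsletter"  # hypothetical name; hidden from humans via CSS


def is_probably_bot(form: dict[str, str]) -> bool:
    """Flag any submission that includes the hidden field.

    Browsers omit unchecked checkboxes from form data, so a human who never
    sees the field sends nothing for it. A bot that fills in every field does.
    """
    return HONEYPOT_FIELD in form


def handle_comment(form: dict[str, str]) -> str:
    """Reject honeypot hits; accept everything else."""
    if is_probably_bot(form):
        return "rejected"
    return "accepted"
```

It only catches bots that blindly fill every field, which matches the "at least at the time" caveat above.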
This is not an advert, I only know about them because it was integrated with Invidious at some point: https://anti-captcha.com/
> Starting from 0.5USD per 1000 images
Source: I wrote the og detection system for hCaptcha
It's a very old service, active since the 00s. Somewhat affiliated with cybercrime - much like a lot of "residential proxies" and "sink registration SMS" services that serve similar purposes. What they're doing isn't illegal, but they know not to ask questions.
They used to run entirely on human labor - third world is cheap. Now, they have a lot of AI tech in the mix - designed to beat specific popular captchas and simple generic captchas.
No agent will touch it!
“As a large language model, I don’t hack things”
AI agent: *intense sweating*
What seems to be a better CAPTCHA, at least against non-Musk LLMs, is to ask them to use profanities; they'll generally refuse even when you really insist.
I have never used ChatGPT so no idea how its agent works, but if it is driving your browser directly then it will look like you. If it is coming from some random IP address from a VM in Azure or AWS even then the activity probably does not look "bot like" since it is doing agentic things and so acting quite like a human I expect.
In our logs we can see agentic user flow, real user flow and AI site scraping bot flow quite distinctly. The site scraping bot flow is presumably to increase their document corpus for continued pretraining or whatever but we absolutely see it. ByteDance is the worst offender by far.
I suspect that a LLM would be slower and more irregular as it is processing the page and all that, vs a DOM-selector driven bot that will just machine-gun its way through in milliseconds.
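To make that suspicion concrete, here is a toy heuristic over the gaps between interaction events. The thresholds (`fast_ms`, `regular_cv`) and the labels are pure assumptions, and note what it cannot do: a slow, irregular session could be a human or an LLM agent, so it only separates out the machine-gun style bots.

```python
from statistics import mean, pstdev


def classify_timing(timestamps: list[float],
                    fast_ms: float = 50.0,
                    regular_cv: float = 0.1) -> str:
    """Label a session by the gaps between its interaction events (in ms)."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    if len(gaps) < 2:
        return "unknown"
    avg = mean(gaps)
    # Coefficient of variation: how irregular the gaps are relative to their size.
    cv = pstdev(gaps) / avg if avg > 0 else 0.0
    if avg < fast_ms or cv < regular_cv:
        return "scripted"
    return "human-or-llm"
```

A scripted bot can of course add random sleeps, so treat this as one weak signal among many.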
Of course, Cloudflare and Google et al captchas can't see the clicks/keypresses within a given webpage - they'll only get to see the requests.
It’s my agent — whether ai or browser — and I get to do what I want with the content you send over the wire and you have to deal with whatever I send back to you.
As long as it's not wrong/immoral/illegal for me to access your site with any method/browser/reader/agent, and do what I want with your response. Then I think it's okay to send a response like "screw you, humans only"
Paywalls suck, but the suck doesn't come from the NYT exercising their freedom to send whatever response they choose.
$ cat mass-marketer.py
from openai.gpt.agents import browserDriver
That said, I find it deeply satisfying to see LLMs solve CAPTCHAs and other discriminatory measures for "spam" reduction.
Solving an audio-only CAPTCHA with AI is typically way easier than solving some of the more advanced visual challenges. So CAPTCHA designers are discouraged from leaving any accessibility options.
And here's a secondary question: if firms are willing to pay an awful lot per token to run these things, and have massive amounts of money to run data centres to train AIs, why would they not just pay for a subscription for every site for a month just to scrape them?
The future is paying for something as a user and having limits on how many things you can get for your money, because an AI firm will abuse that too.
Given the scale of operations of these firms, there is nothing you can sell to a human for a small fee that the AI firms will not pay for and exploit to the maximum.
Even if you verify people are real, there's a good chance the AI firms will find a way to exploit that. After all, when nobody has a job, would you turn down $50K to sell your likeness to an AI firm so their products can pass human verification?
Combine wet into dry slowly until it feels like damp sand.
Pack into molds, press firmly.
Dry for 24 hours before using.
Drop into a bath and enjoy the fizz!
Bots have for a long time been better and more efficient at solving captchas than us.
Easily solves 99% of the web scraping problems.
I love this totally normal vision of computing these days. :)
I simply avoid any website that presents me with a Cloudflare CAPTCHA, don't know what the fuck they've done in their implementation but it's been broken for a long time.
And in many cases, it's taking a huge steaming dump upon a site's first-impression user experience, but AFAICT, it's not on the radar of UX people.
It's always a cat and mouse game.
It is beyond time we start to address the abuses, rather than the bot/human distinction.
Half of the sites already block OpenAI. But if it is steering the user’s browser itself?
I wonder how these capabilities will interact with all the "age verification" walls (ie, thinly disguised user profiling mechanisms) going up all over the place now.
Maybe after sign up, biometric authentication being mandatory is the only thing that would potentially work. The security and offline privacy of those devices will become insanely valuable.
Anyone not authenticating in this way is paywalled. I don’t like this but don’t see another way.
I’m not using the web if I’m bombarded by captcha games… shit becomes worthless over night if that’s the case. Might as well dump computing on the Internet entirely if that happens.
The bit at the bottom might actually work on LLMs.