1. Simple mathematical question, e.g. "What do you get if you add five and three?" Answer is processed on the server.
2. Hidden form field that is supposed to remain blank.
3. Blacklist of common spam words.
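A minimal sketch of how those three checks might look on the server. Every name here (`checkSubmission`, the `website` honeypot field, the contents of `SPAM_WORDS`) is hypothetical and not taken from any real phpBB setup; the expected sum is assumed to be stored server-side when the question is rendered.

```javascript
// Hypothetical sketch: the three checks from the list above, run server-side.
const NUMBER_WORDS = new Map([
  ["zero", 0], ["one", 1], ["two", 2], ["three", 3], ["four", 4], ["five", 5],
  ["six", 6], ["seven", 7], ["eight", 8], ["nine", 9], ["ten", 10],
]);
const SPAM_WORDS = ["viagra", "casino", "payday loan"]; // placeholder blacklist

// Accept either a digit string ("8") or a number word ("eight").
function parseAnswer(text) {
  const answer = String(text ?? "").trim().toLowerCase();
  if (NUMBER_WORDS.has(answer)) return NUMBER_WORDS.get(answer);
  return /^\d+$/.test(answer) ? Number(answer) : null;
}

// Returns null when the post passes, otherwise a short rejection reason.
function checkSubmission(form, expectedSum) {
  // 1. Math question: compare with the answer kept on the server.
  if (parseAnswer(form.answer) !== expectedSum) return "wrong answer";
  // 2. Honeypot: a field hidden from humans must come back empty.
  if (String(form.website ?? "") !== "") return "honeypot filled";
  // 3. Blacklist: reject bodies containing known spam words.
  const body = String(form.body ?? "").toLowerCase();
  if (SPAM_WORDS.some((word) => body.includes(word))) return "blacklisted word";
  return null;
}
```

The honeypot only works if the field is hidden with CSS rather than `type="hidden"`, on the assumption that naive bots fill in every visible-looking text input.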
We still get the occasional spammer but the real problem was our phpbb3 board showing up in the automated spam programs. As soon as we were slightly different than the default install, nearly all the spam stopped.
The interesting thing was that even the built-in captcha didn't stop the spam--it was worth cracking since everyone uses it.
On my blog I generate two random sequences of characters and tell the user to join them together without a space. This seems to have worked really well. (Though in the past I've also had static strings like "join 'bow' and 'ser' together" or "join 'doc' and 'tor' together".) I used to have the addition challenge like the GP, but it was broken. My comment form was slammed with hits, so I rate-limited attempts, but a few still got through (since the set of possible responses is not large and rate limits can be defeated). That's when I implemented my string scheme and changed the comment form submission URL (which now lives only in JavaScript). I haven't had a spammer get through since.
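A rough sketch of that join-two-strings challenge, not the poster's actual code. The function names, the chunk length of 4 and the reduced alphabet are all assumptions; `Math.random` is used only to keep the sketch short, and the `expected` value is assumed to stay on the server (for example in the session).

```javascript
// Hypothetical sketch of a "join these two strings" challenge.
// The alphabet skips l and o so the chunks are easy for humans to read.
const ALPHABET = "abcdefghijkmnpqrstuvwxyz";

function randomChunk(length, rng) {
  let out = "";
  for (let i = 0; i < length; i++) {
    out += ALPHABET[Math.floor(rng() * ALPHABET.length)];
  }
  return out;
}

// Builds the prompt shown to the user and the answer kept server-side.
function makeJoinChallenge(rng = Math.random, length = 4) {
  const left = randomChunk(length, rng);
  const right = randomChunk(length, rng);
  return {
    prompt: `Join '${left}' and '${right}' together without a space`,
    expected: left + right,
  };
}

// Forgiving about case and surrounding whitespace, strict about inner spaces.
function verifyJoin(expected, submitted) {
  return String(submitted ?? "").trim().toLowerCase() === expected;
}
```

With 24 letters and two chunks of 4, there are 24^8 possible answers, so blind guessing is far less practical than against a small-number addition question.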
On another forum I used to moderate (I think it was an Invision Power Board one), I fixed it with a second field asking something like "What makes things fall down? Gravity or noodles?" If they entered "gravity", it would let them register. It lasted a few years; then a few spammers randomly got in, but by that time the forum had died.
Best CAPTCHA ever: http://random.irb.hr/signup.php
If your only concern is dumb, fully automated bots not targeting your site specifically (which is true for the bottom 99.5% of the web), then you don't need CAPTCHA.
2 and 3 are great against non-targeted attacks. 1 is very weak protection against a targeted attack, and it's likely overkill that unnecessarily burdens users.
Unfortunately it wasn't allowed because the site owner pointed out that the market the site was aimed at had a reasonable number of people with cognitive difficulties, i.e., they struggled to follow multi-step instructions.
(Yes, this does mean that computers are able to solve a problem that is supposed to identify a human much better than some humans.)
Even my pre-school self could solve the Sesame Street "one of these things is not like the other".
There are so many sets with an odd-one-out that a human could spot easily but a computer could not.
http://www.wolframalpha.com/input/?i=What+do+you+get+if+you+...
Sure, a captcha of "lI0Ol1o" would probably be unreadable to a computer ... but it would be unreadable to a human too.
We're quickly approaching the point where image recognition is as good at solving image captchas as humans are, and when we get there, we'll need to find some other way to do it.
[1]: http://www.labnol.org/internet/favorites/cats-inside-rapidsh...
A computer can do statistical sampling of many CAPTCHAs generated by the same website, and then try to reverse-engineer the image munging algorithm.
Humans, OTOH, will probably give up after 2 tries and already struggle to get |O0Il1l right.
Imagine picking letters with the right frequencies. Now, instead of doing that, pick pairs of letters, with the right frequency, so that each pair "chains" with the previous. If you have good pair frequency data, you can do longer than pairs and get even closer to English.
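A small sketch of the pair-chaining idea, assuming a word list stands in for "good pair frequency data". `buildPairTable`, `generateWord` and the `^`/`$` start and end markers are illustrative choices, not anything specified in the thread.

```javascript
// Hypothetical sketch: generate English-looking strings from letter-pair
// frequencies, using "^" and "$" as start-of-word and end-of-word markers.
function buildPairTable(words) {
  const table = new Map();
  for (const word of words) {
    const padded = "^" + word.toLowerCase() + "$";
    for (let i = 0; i < padded.length - 1; i++) {
      const current = padded[i];
      if (!table.has(current)) table.set(current, []);
      // Repeated entries encode how often each pair occurs.
      table.get(current).push(padded[i + 1]);
    }
  }
  return table;
}

// Walk the table: each new letter is drawn from the letters that were
// observed to follow the previous one, so every pair "chains".
function generateWord(table, rng = Math.random, maxLength = 12) {
  let current = "^";
  let out = "";
  while (out.length < maxLength) {
    const followers = table.get(current);
    const next = followers[Math.floor(rng() * followers.length)];
    if (next === "$") break;
    out += next;
    current = next;
  }
  return out;
}
```

Keying the table on the previous two or three letters instead of one gives the "longer than pairs" variant, at the cost of needing more training text.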
However, having your own custom captcha probably helped quite a bit. I'm guessing spammers aren't going to bother writing custom software to decode your captcha unless you have a major site.
How come nobody adopted that approach?
1. Is it trivial for a human to answer correctly? This affects growth.
2. Can humans do it quickly? This affects growth.
3. What is the random guess rate? It had better be abysmal.
4. How good is the “opposing” technology?
5. What is the guess rate of a sophisticated attacker using said technology?
6. How much human input is required to create your captcha? You had better be asymptotically better than human-solving the captcha.
7. What are the cultural and accessibility issues?
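To put a number on question 3, here is a back-of-the-envelope sketch. It assumes the bot picks uniformly among a fixed number of equally likely answers and that attempts are independent; `blindGuessSuccess` is a made-up helper name.

```javascript
// Hypothetical helper: chance that a bot guessing uniformly at random
// passes at least once in the given number of independent attempts.
function blindGuessSuccess(choices, attempts) {
  const perTry = 1 / choices;
  return 1 - Math.pow(1 - perTry, attempts);
}

// A two-option question ("gravity or noodles?") is a coin flip per attempt.
console.log(blindGuessSuccess(2, 1));
// Ten retries against the same two-option question almost always succeed.
console.log(blindGuessSuccess(2, 10));
```

The per-attempt rate matters less than it looks unless retries are also limited, since the success chance compounds with every extra attempt.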
I remember suggestions of using computing power to slow down guess rates (proof of work, probably related to Bitcoin). However, that doesn't work, since not all users have, or want, a fast computer.
Any CAPTCHA scheme that can be solved by enumeration of all possible answers is a failure, because there are cost-effective ways to hit a CAPTCHA over and over again, with cheap humans, and build the enumeration table. This is where the "pick the image with a cute thing in it" scheme falls down. In this case, once the enumeration of description -> image(s) is determined, you lose.
Any scheme that involves humans somehow creating tags or labeling images or writing text will generally be enumerable as well, because spammers can trivially out-manpower you.
Also, many CAPTCHA schemes use a model of spammer in which the spammer isn't permitted to be clever. If there is a pattern, in the real world the spammer is "allowed" to exploit it. There are 2^64 different ways to add two 32-bit numbers to each other, but that doesn't mean that you can beat a spammer just by asking the user to do a simple addition, because when I say "enumerate" I mean it more in the computer science sense, not the literal sense. They can and will create something that parses the problem and does it, so for instance for my stupid "add two random 32-bit numbers" example the CAPTCHA is actually easier for a computer than a human.
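To make the "parses the problem and does it" point concrete, here is a sketch of how little code such a parser needs. It assumes the question contains exactly two operands written as digits or as number words up to ten; `solveAddition` and `WORD_VALUES` are illustrative names.

```javascript
// Hypothetical sketch: "enumeration" in the computer-science sense.
// No table of 2^64 sums is needed, only a parser and one addition.
const WORD_VALUES = new Map([
  ["zero", 0], ["one", 1], ["two", 2], ["three", 3], ["four", 4], ["five", 5],
  ["six", 6], ["seven", 7], ["eight", 8], ["nine", 9], ["ten", 10],
]);

function solveAddition(question) {
  // Split into runs of letters and runs of digits, ignoring punctuation.
  const tokens = question.toLowerCase().match(/[a-z]+|\d+/g) ?? [];
  const numbers = [];
  for (const token of tokens) {
    if (WORD_VALUES.has(token)) numbers.push(WORD_VALUES.get(token));
    else if (/^\d+$/.test(token)) numbers.push(Number(token));
  }
  // Only answer when exactly two operands were recognised.
  return numbers.length === 2 ? numbers[0] + numbers[1] : null;
}

console.log(solveAddition("What do you get if you add five and three?")); // 8
```

The same function handles two 32-bit operands just as easily, which is the sense in which the challenge is easier for the computer than for the human.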
CAPTCHAs are hard and getting steadily harder... at least, if you require them to work. Security theater is easy.
If the captcha is ANYTHING other than immediately obvious, a significant number of users will not be able to pass it.
An example would be https://sso.state.mi.us/som/dch/enroll/reg_page1.jsp (You can enter any fake name/email; this is only step one of the registration script. The next page has the captcha in question.)
The captcha is plaintext, right on the page. The data from the captcha isn't even sent to the server, it is processed locally via JavaScript.
So, the bots don't even have to do anything, but humans have to input a meaningless number...
<input type="text" name="inputNumber" class="entry-field" size="5" tabindex="3">
<!-- ... -->
document.write('<div id="layerNum" class="verifyNumber" align="center">');
document.write('<b>'+str+'</b>');
document.write('<img src="generateGIF.jsp?number='+str+'">');
document.write('</div>');
document.write('<input size="5" type="hidden" name="rdNumber" value="'+str+'">');
<!-- ... -->
<input type="submit" value="Continue" name="submit" onclick="return Valid();">
<!-- ... -->
function Valid(){
// ...
if(chkRandomNumber()){
return true;
}else{
return false;
}
// ...
}
function chkRandomNumber(){
str1=document.all.rdNumber.value;
str2=document.all.inputNumber.value;
if(str1!=str2){
alert("Please check and type the number as shown in the box");
return false;
}else{
return true;
}
}
The end result: spam disappeared and we didn't add much pain to our customers.
Most spammers likely don't check every website to see how they can break the captcha; they just set up a script to fill out forms and submit them.
Their solution, while not the "awesome, technologically advanced solution", was a working one if it prevented spam, without the complexity of actual captchas.
Furthermore, as captchas have been known to be broken, who's to say that the spammers' tools don't recognize valid, commonly used captchas and break them automatically? As opposed to a field that says "Type the following word", which the spammers don't (can't easily?) check for.
Instead, it would seem they're taking the "we'll get hacked anyway, so let's not waste our time" approach.
It just indicates the pathetic state of Sony's security development team, something that cannot be changed overnight.
As always, one of the most interesting parts of truly great CAPTCHA systems is that they are advancing the state of the art in image recognition. But on the other hand we still have scams like this, and no real solutions.
Here is my CAPTCHA research paper:
http://news.ycombinator.org/item?id=2754436
http://www.slideshare.net/desaiguddu/drag-and-drop-captcha-a...
"You are born into WHAT? (answer is one English word)" [1]
It is not entirely clear to me what the expected answer is. A Google search for "you are born into" does not return any answer that is clearly correct. If I had to guess I would go with "sin", but I am hoping that nobody would be so ignorant as to design a captcha system that assumes a certain cultural/religious background.