Introduction

A CAPTCHA or Captcha is a type of challenge-response test used in computing to ensure that the response is not generated by a computer. The process usually involves one computer (a server) asking a user to complete a simple test which the computer is able to generate and grade. Because other computers are unable to solve the CAPTCHA, any user entering a correct solution is presumed to be human. Thus, it is sometimes described as a reverse Turing test, because it is administered by a machine and targeted to a human, in contrast to the standard Turing test that is typically administered by a human and targeted to a machine. A common type of CAPTCHA requires that the user type letters or digits from a distorted image that appears on the screen. The term "CAPTCHA" (based upon the word capture) was coined in 2000 by Luis von Ahn, Manuel Blum, Nicholas J. Hopper, and John Langford (all of Carnegie Mellon University). It is a contrived acronym for "Completely Automated Public Turing test to tell Computers and Humans Apart." Carnegie Mellon University attempted to trademark the term, but the trademark application was abandoned on 21 April 2008.

CAPTCHA: Telling Humans and Computers Apart Automatically

A CAPTCHA is a program that protects websites against bots by generating and grading tests that humans can pass but current computer programs cannot. For example, humans can read distorted text, but current computer programs can't.

The term CAPTCHA (for Completely Automated Public Turing Test To Tell Computers and Humans Apart) was coined in 2000 by Luis von Ahn, Manuel Blum, Nicholas Hopper and John Langford of Carnegie Mellon University. A free, secure and accessible CAPTCHA implementation is available from the reCAPTCHA project. Easy-to-install plugins and controls are available for WordPress, MediaWiki, PHP, ASP.NET, Perl, Python, Java, and many other environments. reCAPTCHA also comes with an audio test to ensure that blind users can freely navigate your site. reCAPTCHA is our officially recommended CAPTCHA implementation.

Test Drive a CAPTCHA

reCAPTCHA: Stop spam and help digitize books at the same time! The words shown come directly from old books that are being digitized.

ESP-PIX: Our newest CAPTCHA! Instead of typing letters, you authenticate yourself as a human by recognizing what object is common in a set of images.

SQUIGL-PIX: A CAPTCHA script that's close to our hearts. This was the first example of a CAPTCHA based on image recognition.

What is CAPTCHA and how does it work?

If you're here on this site, possibly you're tired to death of unceasing spambot attacks on your site, forum or blog. And yes, this article, with a real working example of a CAPTCHA PHP script based on TheCAPTCHA, can help you to solve the problem and protect your site against spambots.

Theory

Spambots are ubiquitous. They fill in forms on web sites, take part in online polls, send SMS via web interfaces; they do all the things you expect from a real human. This situation pushed security specialists to create a number of protection methods against spambots, and spammers, in their turn, are permanently trying to break the shields.

Human or robot?

To determine whether a site visitor is a human or a robot, he (or it) has to accomplish a task that is easy for a human and hard or even impossible for a robot. These tasks are commonly named CAPTCHA, which is an acronym for "Completely Automated Public Turing test to tell Computers and Humans Apart". When, for example, logging on to a web site, the user is presented with a word or number in a distorted graphic image and asked to enter it. If the value entered does not match what is expected, then the user is rejected.

CAPTCHA is a form of reverse Turing test. The Turing test is named after Alan Turing, who in his 1950 article "Computing Machinery and Intelligence" described the "imitation game", a test of a machine's capability to demonstrate intelligence. It proceeds as follows: a human judge engages in a natural language conversation with two other parties, one a human and the other a machine. It is assumed that both the human and the machine try to appear human. If the judge cannot reliably tell which is which, then the machine is said to pass the test.

There are various implementations of CAPTCHA; the most popular at the moment are:

Image with random letters and digits, sometimes distorted, sometimes with some noise over the text. A human can recognize distorted text, while a spambot can't.

Image with some random mathematical expression, which has to be calculated by a visitor. For example, if the expression is 47+39, a visitor has to put the right answer in the form, which in this case is 86.

An audio file with pronounced random letters or digits; the sound is sometimes distorted. A visitor has to type the heard text in the form.

A set of images of various objects, for example cats, dogs and birds. A visitor has to select only objects of the kind that is specified by the test, for example only cats.

A random question or a riddle. For example, a visitor has to tell the first name of Shakespeare.

Which way is the best? The last and the next to last are the worst, because in order to be effective against spambot attacks you have to acquire a prohibitively huge database of images or questions. An answer to a riddle can be easily added to a spambot's database. The next disadvantage of these methods is that a visitor, if he is a human, has to be intimate with the language of the question, so it may be impossible to answer for those who are not native speakers of the language.

Audio CAPTCHA is also not the best way, because a visitor has to have some audio equipment on his computer (many workers in big corporations with strict computer policies don't have it).

You may think that a math CAPTCHA is quite a good idea, because a spambot has to recognize the digits and operators and then calculate the expression. But computers calculate much faster than humans :)

So, what remains? A graphic CAPTCHA with letters and digits. This is the easiest method to implement, yet it is quite secure if you follow security policies, and in this article I'll teach you how to create a strong but simple PHP script to protect your web site.

Abordage

Creating a graphic CAPTCHA is always a balance between the abilities of both humans and bots to recognize distorted and noisy text. If the text is distorted too much, there will be no form submissions on your site at all, and this is not what we want. On the other hand, a weak CAPTCHA with ineffective distortion and noise, or even without any distortion at all, is like a toy shield against a viking axe.
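The generate-and-grade cycle described above (the server creates a random string of letters and digits, shows it as an image, and rejects the visitor if the entered value does not match) can be sketched roughly as follows. This is a minimal illustration in Python rather than the PHP script the article refers to; the names `ALPHABET`, `generate_challenge` and `grade`, the length of 6 and the case-insensitive comparison are all assumptions made for the example, and rendering the distorted image is left out.

```python
import hmac
import secrets
import string

# Assumed alphabet: upper-case letters and digits, as in a letters-and-digits CAPTCHA.
ALPHABET = string.ascii_uppercase + string.digits


def generate_challenge(length=6):
    """Return a random string that the server would render as a distorted image."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def grade(expected, entered):
    """Accept the visitor only if the entered value matches what is expected."""
    # Comparison is case-insensitive and ignores surrounding spaces (an assumption).
    cleaned = entered.strip().upper().encode()
    return hmac.compare_digest(expected.upper().encode(), cleaned)


challenge = generate_challenge()
print(grade(challenge, challenge.lower()))  # the right text, typed in lower case
print(grade(challenge, challenge + "X"))    # a wrong value, which is rejected
```

The `secrets` module is used instead of `random` so that the next challenge cannot be predicted from earlier ones.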

How do spambots recognize symbols?

This process can be divided into 2 steps: definition of the place and borders of each symbol and then, when that is done, the spambot tries to recognize each symbol.

If symbols are always in the same places with fixed spaces between them, your CAPTCHA is weak, so that's why symbols should always have different coordinates. The task gets much more difficult if the symbols on your CAPTCHA always have a random position, angle and spacing.

Having defined the place of each symbol, the next step is to compare the color of the background with the color of what is assumed to be a symbol. If these colors are different (and they must be different to be easily read by a human), recognition by spambots is quite easy, so you have to either put some noise under and over the symbols, or place the symbols so close to each other that they form one box of symbols with no spaces between them, and finally distort the text. When there's no idea of where a symbol starts and where it ends, the task becomes almost impossible for spambots to accomplish, though still easy for a human being.

The second method is more powerful but at the same time much more complicated to implement. However, the first method can be quite hard to break if the noise you put on your CAPTCHA is of the right kind. The same method is used in my TheCAPTCHA script, and in this article I will use this very method to show you how to make your own CAPTCHA script.

The right noise

There are several types of noise commonly used with CAPTCHA scripts to hinder the recognition of symbols by spambots:

Pixel noise, sometimes of random color; it looks like an old grainy film or the 3200 ISO images of your digital camera.

Lines, sometimes of random color and angle; sometimes they form a kind of grid.

Rectangles and/or circles, sometimes filled with color.

The first type is the weakest and there is no sense in using it, as it only makes the symbols difficult to recognize for a human, not for a spambot. The second and the third are good only when they are random. As you can see, when creating a CAPTCHA you have to forget anything systematic.
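The advice above (random position, angle and spacing for each symbol, and random rather than regular noise) can be sketched as a layout step that runs before any drawing happens. This is only an illustration in Python, not code from the TheCAPTCHA script; the function names, the image size of 200 by 70 pixels and every numeric range are assumptions, and the actual rendering would be handed to an imaging library.

```python
import random

rng = random.SystemRandom()  # randomness from the operating system


def layout_symbols(text, height=70):
    """Pick a random position and angle for every symbol of the challenge."""
    placements = []
    x = rng.randint(5, 15)  # the first symbol does not start at a fixed offset
    for symbol in text:
        placements.append({
            "symbol": symbol,
            "x": x,
            "y": rng.randint(10, height - 40),
            "angle": rng.randint(-30, 30),  # degrees of rotation
        })
        x += rng.randint(18, 30)  # irregular spacing between symbols
    return placements


def noise_lines(count, width=200, height=70):
    """Random line segments (x1, y1, x2, y2), never a regular grid."""
    return [
        (rng.randrange(width), rng.randrange(height),
         rng.randrange(width), rng.randrange(height))
        for _ in range(count)
    ]


for placement in layout_symbols("K7QX2M"):
    print(placement)
print(noise_lines(3))
```

Because every coordinate and angle is drawn afresh for each image, two CAPTCHAs showing the same text still differ in where each symbol starts and ends.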

Any well-ordered structure on an image is a hole in your hauberk. If the creator of a spambot knows that on some particular image there is a regular grid, the lines of which always have the same regular angle, he just excludes the grid from the image, thus making this grid useless. The same goes for rectangles and circles. That's why you need to put this kind of noise under and over the symbols only on an irregular basis. And note that the color of these objects must be either the same as the color of the background or the same as the color of the symbols, thus making it harder for spambots to define the place and borders of each symbol.

Applications of CAPTCHAs

CAPTCHAs have several applications for practical security, including (but not limited to):

Preventing Comment Spam in Blogs: Most bloggers are familiar with programs that submit bogus comments, usually for the purpose of raising search engine ranks of some website (e.g., "buy penny stocks here"). This is called comment spam. By using a CAPTCHA, only humans can enter comments on a blog. There is no need to make users sign up before they enter a comment, and no legitimate comments are ever lost!

Protecting Website Registration: Several companies (Yahoo!, Microsoft, etc.) offer free email services. Up until a few years ago, most of these services suffered from a specific type of attack: "bots" that would sign up for thousands of email accounts every minute. The solution to this problem was to use CAPTCHAs to ensure that only humans obtain free accounts. In general, free services should be protected with a CAPTCHA in order to prevent abuse by automated scripts.

Protecting Email Addresses From Scrapers: Spammers crawl the Web in search of email addresses posted in clear text. CAPTCHAs provide an effective mechanism to hide your email address from Web scrapers. The idea is to require users to solve a CAPTCHA before showing your email address. A free and secure implementation that uses CAPTCHAs to obfuscate an email address can be found at reCAPTCHA MailHide.

Online Polls: In November 1999, http://www.slashdot.org released an online poll asking which was the best graduate school in computer science (a dangerous question to ask over the web!). As is the case with most online polls, IP addresses of voters were recorded in order to prevent single users from voting more than once. However, students at Carnegie Mellon found a way to stuff the ballots using programs that voted for CMU thousands of times. CMU's score started growing rapidly. The next day, students at MIT wrote their own program and the poll became a contest between voting "bots." MIT finished with 21,156 votes, Carnegie Mellon with 21,032 and every other school with less than 1,000. Can the result of any online poll be trusted? Not unless the poll ensures that only humans can vote.

Preventing Dictionary Attacks: CAPTCHAs can also be used to prevent dictionary attacks in password systems. The idea is simple: prevent a computer from being able to iterate through the entire space of passwords by requiring it to solve a CAPTCHA after a certain number of unsuccessful logins. This is better than the classic approach of locking an account after a sequence of unsuccessful logins, since doing so allows an attacker to lock accounts at will.

Search Engine Bots: It is sometimes desirable to keep webpages unindexed to prevent others from finding them easily. There is an html tag to prevent search engine bots from reading web pages. The tag, however, doesn't guarantee that bots won't read a web page; it only serves to say "no bots, please." Search engine bots, since they usually belong to large companies, respect web pages that don't want to allow them in. However, in order to truly guarantee that bots won't enter a web site, CAPTCHAs are needed.

Worms and Spam: CAPTCHAs also offer a plausible solution against email worms and spam: "I will only accept an email if I know there is a human behind the other computer." A few companies are already marketing this idea.

Guidelines

If your website needs protection from abuse, it is recommended that you use a CAPTCHA. There are many CAPTCHA implementations, some better than others. The following guidelines are strongly recommended for any CAPTCHA code:

Accessibility: CAPTCHAs must be accessible. CAPTCHAs based solely on reading text (or other visual-perception tasks) prevent visually impaired users from accessing the protected resource. Such CAPTCHAs may make a site incompatible with Section 508 in the United States. Any implementation of a CAPTCHA should allow blind users to get around the barrier, for example, by permitting users to opt for an audio or sound CAPTCHA.

Image Security: CAPTCHA images of text should be distorted randomly before being presented to the user. Many implementations of CAPTCHAs use undistorted text, or text with only minor distortions. These implementations are vulnerable to simple automated attacks.
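The dictionary-attack defence described above (ask for a CAPTCHA after a certain number of unsuccessful logins instead of locking the account) can be sketched as follows. This is a minimal in-memory illustration in Python; the class name `LoginGuard`, the threshold of three attempts and the boolean `password_ok` and `captcha_ok` inputs are assumptions made for the example, standing in for a real password check and a real CAPTCHA check.

```python
MAX_FREE_ATTEMPTS = 3  # assumed threshold; the text only says "a certain number"


class LoginGuard:
    """Counts failed logins per user and demands a CAPTCHA past the threshold."""

    def __init__(self):
        self.failures = {}

    def captcha_required(self, username):
        return self.failures.get(username, 0) >= MAX_FREE_ATTEMPTS

    def attempt(self, username, password_ok, captcha_ok=False):
        """Return True if the login is accepted."""
        if self.captcha_required(username) and not captcha_ok:
            return False  # refused for now, but the account is never locked
        if password_ok:
            self.failures[username] = 0
            return True
        self.failures[username] = self.failures.get(username, 0) + 1
        return False


guard = LoginGuard()
for _ in range(3):
    guard.attempt("alice", password_ok=False)   # a bot guessing passwords
print(guard.captcha_required("alice"))           # further tries need a CAPTCHA
print(guard.attempt("alice", password_ok=True, captcha_ok=True))  # the real user
```

The legitimate user can still log in by solving the CAPTCHA, so an attacker who deliberately fails logins cannot lock anyone out.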

Script Security: Building a secure CAPTCHA code is not easy. In addition to making the images unreadable by computers, the system should ensure that there are no easy ways around it at the script level. Common examples of insecurities in this respect include: (1) Systems that pass the answer to the CAPTCHA in plain text as part of the web form. (2) Systems where a solution to the same CAPTCHA can be used multiple times (this makes the CAPTCHA vulnerable to so-called "replay attacks"). Most CAPTCHA scripts found freely on the Web are vulnerable to these types of attacks.

Security Even After Wide-Spread Adoption: There are various "CAPTCHAs" that would be insecure if a significant number of sites started using them. An example of such a puzzle is asking text-based questions, such as a mathematical question ("what is 1+1"). Since a parser could easily be written that would allow bots to bypass this test, such "CAPTCHAs" rely on the fact that few sites use them, and thus that a bot author has no incentive to program their bot to solve that challenge. True CAPTCHAs should be secure even after a significant number of websites adopt them.

Should I Make My Own CAPTCHA?

In general, making your own CAPTCHA script (e.g., using PHP, Perl or .Net) is a bad idea, as there are many failure modes. We recommend that you use a well-tested implementation such as reCAPTCHA.

The "Pornography Attack" is Not a Concern

It is sometimes rumored that spammers are using pornographic sites to solve CAPTCHAs: the CAPTCHA images are sent to a porn site, and the porn site users are asked to solve the CAPTCHA before being able to see a pornographic image. This is not a security concern for CAPTCHAs. While it might be the case that some spammers use porn sites to attack CAPTCHAs, the amount of damage this can inflict is tiny (so tiny that we haven't even noticed a dent!). Whereas it is trivial to write a bot that abuses an unprotected site millions of times a day, redirecting CAPTCHAs to be solved by humans viewing pornography would only allow spammers to abuse systems a few thousand times per day. The economics of this attack just don't add up: every time a porn site shows a CAPTCHA before a porn image, they risk losing a customer to another site that doesn't do this.

Advancing Artificial Intelligence: CAPTCHA tests are based on open problems in artificial intelligence (AI): decoding images of distorted text, for instance, is well beyond the capabilities of modern computers. Therefore, CAPTCHAs also offer well-defined challenges for the AI community, and induce security researchers, as well as otherwise malicious programmers, to work on advancing the field of AI. CAPTCHAs are thus a win-win situation: either a CAPTCHA is not broken and there is a way to differentiate humans from computers, or the CAPTCHA is broken and an AI problem is solved.
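The two script-level weaknesses listed under Script Security (the answer travelling in plain text inside the web form, and a solution that can be replayed) can both be avoided by keeping the answer on the server and handing the form only a random, single-use token. The sketch below is a minimal in-memory illustration in Python; the class `ChallengeStore` and its methods are hypothetical names, and a real site would keep this state in its session store and expire old challenges.

```python
import hmac
import secrets


class ChallengeStore:
    """Keeps answers on the server; the web form only carries an opaque token."""

    def __init__(self):
        self._answers = {}

    def issue(self, answer):
        """Remember the answer and return the token to embed in the form."""
        token = secrets.token_urlsafe(16)
        self._answers[token] = answer
        return token

    def verify(self, token, entered):
        # pop() removes the challenge whether or not the answer is right,
        # so the same token can never be submitted a second time (no replay).
        expected = self._answers.pop(token, None)
        if expected is None:
            return False
        return hmac.compare_digest(expected.encode(), entered.encode())


store = ChallengeStore()
token = store.issue("K7QX2M")
print(store.verify(token, "K7QX2M"))  # first submission with the right text
print(store.verify(token, "K7QX2M"))  # replaying the same token and answer
```

Discarding the challenge on a wrong answer as well means a bot gets one guess per image rather than unlimited guesses against the same one.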
