2009 International Conference on Multimedia Information Networking and Security

A CAPTCHA Implementation Based on 3D Animation

Jing-Song Cui, Jing-Ting Mei, Xia Wang, Da Zhang, Wu-Zhou Zhang


College of Computer Science
Wuhan University
cuijs, mystline, 7341137, 182028431, 527042006@qq.com

Abstract: To distinguish human users from computer programs, the CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) mechanism is widely applied on websites, such as account registration pages. However, the dominant implementation of CAPTCHA, the 2D still-image verification code based on OCR technology, is threatened by advances in artificial intelligence and image recognition. In this paper, we propose a new approach to implementing the CAPTCHA mechanism based on 3D animation. It exploits the weaknesses of computer vision, which makes it robust against automated attacks while remaining convenient for users to recognize. We also implemented this method to generate a 3D animation verification code.

Keywords: CAPTCHA; verification code; moving objects; three-dimensional animation

I. INTRODUCTION

The Internet is crucial to every aspect of life all over the globe nowadays; through it we can retrieve and exchange information freely and efficiently. Given the fundamental relation between the Internet and people's lives, a vast number of malicious computer programs attack websites for profit, for example by automatically applying for mail accounts in order to send junk e-mail. The CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) system emerged to solve this problem by identifying whether an end user of the Internet is a real person or an automated computer program[1][2][3]. It also prevents malicious computer programs from misappropriating limited resources on the Internet and helps maintain the security of the Internet. The key point of CAPTCHA is to pose a question that human users can answer easily while current computer programs cannot yet produce the right answer.

CAPTCHA was first proposed by a research group at Carnegie Mellon University and has since been studied by researchers all over the world. Currently, CAPTCHA methods are generally divided into two groups: OCR-based methods and Non-OCR-based methods[2].

OCR-based methods present images of words with distortion and various pictorial effects and ask users to type the words. Because of the pictorial effects, computer programs encounter problems in the recognition process, and only human users are capable of recognizing the words easily. This sort of method is based on a weak point of optical character recognition (OCR) programs: they have difficulty reading text printed at low quality and reading manuscripts[4]. This makes it possible that certain forms of word images can, to a certain extent, be recognized only by human users and not by any OCR program. This kind of method has now become the major pattern of CAPTCHA applications and prevails on most login pages where users sign up for new accounts; it can prevent spammers from using automated methods to register thousands of free online accounts, as shown in Figure 1.

Figure 1. Simple OCR-based CAPTCHA

However, as OCR technology based on neural networks and artificial intelligence develops[9], the security of such OCR-based CAPTCHAs faces increasingly serious threats. Take some famous portal sites as examples. Yahoo was the first website to use the CAPTCHA mechanism, and the CAPTCHA system it used to prevent automated registration of free Yahoo Mail accounts has been defeated by software released by a Russian researcher, with a recognition rate of about 30%[6]. Microsoft Live Mail has also been compromised by junk-mail senders many times[7][8]. For that reason, an increasing number of OCR-based CAPTCHA designers add a large amount of miscellaneous interference information to the image to reinforce the robustness of the verification, which unfortunately makes the word difficult for humans to recognize, as shown in Figure 2.

Figure 2. Complex OCR-based CAPTCHA

Non-OCR-based methods are based on the features of multimedia systems, such as pictures and sounds, and usually use methods like small puzzle games[4]. One example is PIX, which asks users to select the pictures of a certain subject from among pictures of various subjects[1]. There are many other methods: the CAPTCHA algorithm proposed by Y. Rui is based on the recognition of human faces[10], the CAPTCHA technology developed by R. Datta is based on the recognition of photos from daily life[11], and the CAPTCHA algorithm proposed by J. Elson relies on judging the similarity of photos[12].

In this paper we propose a Non-OCR-based CAPTCHA method in the form of 3D animation, based on the recognition of moving objects in videos. The rest of this paper is organized as follows. In Section 2, we analyze the crucial points in designing the new CAPTCHA mechanism by considering the defects of current methods for tracking moving objects, and then describe the major steps for generating this new CAPTCHA. In Section 3, we illustrate the flow chart by implementing our method step by step. Section 4 concludes with a summary and comments on future research.

978-0-7695-3843-3/09 $26.00 © 2009 IEEE
DOI 10.1109/MINES.2009.298

II. DESIGN

In the field of computer vision, there are several main algorithms for detecting moving objects in video, such as the frame difference method[13][14], the optical flow method[19], the temporal difference algorithm[17][18], and the background subtraction method[15][16]. All of these algorithms have their own defects in tracking moving objects. For example, the optical flow method cannot do real-time monitoring well[15]; the template matching algorithm cannot tell target objects from interference objects when they have similar shapes[20]; the frame difference method may miss target objects or extract only a small part of the target[21]; and the background subtraction method cannot obtain the right information when target objects and background look similar[13]. At present, recognizing moving objects is, to a great extent, more difficult than optical character recognition. It is especially difficult for a computer to track numerous moving objects in real time against a complicated background[5], but it is easier for human users. Therefore, a CAPTCHA implementation based on the recognition of moving objects can exploit those defects to improve the security of CAPTCHA.

A. Main idea

In designing a 3D animation, we make continuous frames to form a complete animation; that is, we need to design each frame picture and show the frames one after another to generate the animation. The new kind of CAPTCHA is still shown in the form of characters for users to recognize, but the characters are hidden in the 3D animation.

First, we determine the location at which the verification code is shown in the animation screen and the attributes of the objects in the animation, such as colors and shapes. The objects in the animation move from one frame to the next, so we need to set each object's moving orbit at the start. Second, while the objects are moving, we compare the shown position of the verification code with the position of each moving object. If an object is at the position where the verification code is supposed to be shown, one attribute of that object is changed.

In this way, showing the verification code is a dynamic process, in contrast to the way a 2D still-image verification code is shown. Only by watching the movement can we read the right characters shown in the animation. This is the zero knowledge per frame principle in our design: no single frame of the animation leaks any information about the verification code. It is the change in the moving objects' attributes that makes the verification code visible, so its content can be captured only by watching the animation. This principle ensures that our method can resist current OCR attacks. As a result, computer programs can retrieve the characters' information only by detecting moving objects in the animation.

In the field of computer vision, the current methods also cannot help much. First, because each moving object has its own preset moving orbit and the animation contains a massive number of moving objects, it is hard to get a clear distinction between frames and to focus on the obvious moving objects that carry crucial information, so the frame difference method would probably miss the tracking target in the animation. Moreover, with this way of showing characters, it is hard to detect the crucial moving objects that carry the important information of the CAPTCHA, because the animation is full of similar moving objects. This lets our method evade detection methods such as the temporal difference algorithm and the background subtraction method by taking advantage of their weak points listed above. Meanwhile, obtaining information from a 3D animation requires real-time detection, and real-time monitoring is still a defect of the optical flow method.

As discussed above, our new method of generating a 3D animation as the carrier of the verification code can withstand both the current OCR attacks on 2D still-image verification codes and the recognition attacks aimed at moving objects in the computer vision field.

B. Algorithm of design

According to the main design idea of the new CAPTCHA, we design the following algorithm to implement our ideas, as Figure 3 shows.

Figure 3. Algorithm of generating new CAPTCHA

Step 1: We choose some elements from one or more sets, such as the English alphabet or the Arabic numerals, to make up the original set of the verification code. The elements of the original set should be easy for humans to recognize.

Step 2: We randomly select 3 or more elements from the original set to form the verification code and determine its position in the animation screen, expressed in pixel coordinates according to the size of the animation screen we set.

Step 3: The aim of this step is to draw a single frame of the animation. The animation is generated by drawing frames consecutively. At first, we set the number of drawn frames to 0.

1. When we draw the first frame of the animation, we need to decide the total number of moving objects in the frame, the initial position of each object, and the display attributes of the moving objects, such as color, size and shape. In addition, we determine every object's moving track and select one of the attributes as the changing attribute of all the moving objects. When drawing the following frames, we only need to compute each object's current position from its position in the prior frame and its moving track; everything else stays the same as in the prior frame.

2. When painting each frame, we compare the pixel coordinate of each object with the pixel coordinates of the verification code. If they coincide, the object's changing attribute is changed: the current value of the attribute is replaced by a different element chosen at random from the attribute's source set.

3. After one frame is drawn, the number of drawn frames is increased by 1 and the pixel data of this frame is kept in memory.

Step 4: We set the total number of frames of the animation and repeat Step 3 until the number of drawn frames reaches that total. The pixel data of all frames is then saved in GIF or AVI format.

III. THE IMPLEMENTATION OF THREE-DIMENSIONAL ANIMATION CAPTCHA

We implemented the new design using VC++ and OpenGL. The details are shown in Figure 4.

Figure 4. Example of generating a 3D animation verification code

Step 1: We choose elements from the English alphabet and the Arabic numerals to make up the original source set of the verification code, excluding elements that may be confused with each other in shape, such as the number 0 and the letter O, in order to make the verification code easier to recognize. The set is as follows:
{A,B,D,E,F,P,Q,R,T,U,V,X,Y,H,J,K,L,M,N,3,4,6,7,8,9}.
Then three characters are selected randomly from the original source set to form a verification code. We convert the verification code into a matrix of 0 and 1 elements with 100 rows and 240 columns. An example is shown in Figure 5, which shows the shape of the number 7. The areas of 1 elements in the matrix form the shape of a letter or a number and make up the area of the verification code; the areas of 0 elements lie outside the verification code's area.

Step 2: We set the screen size of the animation to 240 pixels * 100 pixels. With the reflection function, we map the position of elements in the matrix to the position of pixels in the animation screen, which gives us the pixel coordinates of the verification code in the animation screen. For example, if the pixel coordinate in the animation screen is (i, j), then the corresponding element in the matrix is at row (100-j), column i. Figure 5 shows the 01 matrix of the number 7.

Figure 5. 01 Matrix of "7"

Step 3: This is the step that draws a frame of the animation. In each frame, we use 150*150 points to constitute a grid pattern; that is, 22500 points serve as the vertexes of small quadrangles that together form one big quadrangle as a grid pattern. The vertexes are expressed as X, Y and Z coordinates, which determine the position of each quadrangle in the grid pattern. The values of X and Y are fixed, but the value of Z is determined by a sine function. We randomly set the color of each small quadrangle from the color set of red, green, and blue. Finally, we set color as the changing attribute of this animation.

In the animation, after every 3 frames are drawn, each point's Z value is replaced by that of the adjacent point to its right, and the Z values of the points in the rightmost column of the grid are replaced by the Z values of the corresponding points in the leftmost column of the same row.

Figure 6. Example of changing of points' value of coordinate z

For example, in Figure 6, when the exchange happens, the Z value of Point 1 is replaced by the Z value of Point 2. Because Points A, B, C and D are in the rightmost column of the grid pattern, their Z values are replaced by those of Points a, b, c, and d respectively.
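
The pixel-to-matrix mapping of Step 2, the wave shift of Step 3, and the attribute change rule of Section II.B can be sketched in a few lines. The following is a minimal Python sketch, not the VC++ and OpenGL implementation described in this paper. The function names, the 0-based indexing, the treatment of out-of-range pixels as background, and the choice to make Z a sine of the column index (with a hypothetical amplitude and wavelength) are all assumptions made for illustration.

```python
import math
import random

ROWS, COLS = 100, 240              # 0/1 matrix size, equal to the screen size
COLORS = ("red", "green", "blue")  # source set of the changing attribute


def in_code_region(matrix, i, j):
    """Return True if screen pixel (i, j) lies on a 1-cell of the 0/1 matrix.

    Uses the mapping from Step 2: pixel (i, j) -> row (100 - j), column i.
    Indices that fall outside the matrix count as background (assumption).
    """
    row, col = ROWS - j, i
    if 0 <= row < ROWS and 0 <= col < COLS:
        return matrix[row][col] == 1
    return False


def initial_heights(n, amplitude=1.0, wavelength=25.0):
    """Z value of every point in an n*n grid.

    Assumption: Z is a sine of the column index only, so every row
    carries the same wave.
    """
    return [[amplitude * math.sin(2 * math.pi * c / wavelength)
             for c in range(n)]
            for _ in range(n)]


def shift_heights(z):
    """Give each point the Z of its right neighbour.

    The rightmost column wraps around and takes the Z of the leftmost
    column in the same row, as in Figure 6.
    """
    return [row[1:] + row[:1] for row in z]


def recolor(color, rng=random):
    """Replace a colour by a different one chosen at random from COLORS."""
    return rng.choice([c for c in COLORS if c != color])
```

In this sketch, a frame loop would call shift_heights once every 3 frames and call recolor for each quadrangle whose projected center satisfies in_code_region, which corresponds to Steps 5 and 6 of the implementation.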

Meanwhile, we set the rotation angles around the X, Y and Z axes so that the grid pattern in the animation looks like a moving wave, which makes the moving track much more complex.

When this step is called for the second time or later, the following measures are taken. The total number of small quadrangles in the grid pattern, the moving track, and the changing attribute set when the first frame was drawn remain unchanged, and the color of each quadrangle is the same as in the prior frame. We only need to change the position of each quadrangle according to the moving track and its position in the prior frame.

Step 4: We calculate the center point of each quadrangle from the position information of Step 3, expressed as a 3D coordinate (x, y, z), and then convert it to a 2D pixel coordinate (i, j).

Step 5: We judge whether the pixel coordinate of each quadrangle's center point lies within the pixel coordinate regions of the verification code in the animation screen. If it does, go to Step 6; otherwise, go to Step 7.

Step 6: We change the color of the quadrangle to another color at random. For example, if the color of the quadrangle is red, it may change to green or blue.

Step 7: According to the position and color of the small quadrangles, we draw the grid pattern in this frame. Then the procedure loops: we call Step 3 again to draw frames continuously and produce an animation that shows the verification code on the screen.

Step 8: We record the animation for some time and save it as a 3D animation CAPTCHA in GIF format, which makes it convenient to use in a web page.

IV. CONCLUSION

In our research, we have developed the design of a new CAPTCHA and implemented it in one case. In this study we base our CAPTCHA design on a new field, the recognition of moving objects, and we propose a new design principle, the zero knowledge per frame principle. In the field of computer vision, the current major methods for detecting moving objects are not yet very practical, and our design makes use of the defects of these methods to make it difficult for computer programs to recognize this new CAPTCHA. In addition, the zero knowledge per frame principle makes it much harder for computer programs to identify the content of the CAPTCHA from any single frame, so computer programs that use the OCR methods applied to current 2D still-image verification codes cannot overcome this difficulty.

Moreover, this method is convenient for humans to recognize, which ensures its practicality. How to make the new CAPTCHA safer against attacks and easier for humans to recognize remains the main topic of our future study.

V. ACKNOWLEDGEMENT

This work is supported by NSFC 60603012 and NSFC 60703009.

REFERENCES

[1] L. von Ahn, M. Blum, and J. Langford, Telling Humans and Computers Apart Automatically, Communications of the ACM, February 2004, 57-60.
[2] Luis von Ahn, Manuel Blum and John Langford, Telling Humans and Computers Apart Automatically: How Lazy Cryptographers do AI, In Communications of the ACM, 2004.
[3] Luis von Ahn, Manuel Blum, Nicholas J. Hopper and John Langford, The CAPTCHA Web Page: http://www.captcha.net, 2000.
[4] Mohammad Shirali-Shahreza and Sajad Shirali-Shahreza, Advanced Collage CAPTCHA, Fifth International Conference on Information Technology: New Generations, 2008, p1234-1235.
[5] C Hue, JP Le Cadre, P Perez, Tracking multiple objects with particle filtering, Aerospace and Electronic Systems, IEEE Transactions on, Vol. 38, No. 3. (2002), pp. 791-812.
[6] Thomas Claburn, "Yahoo's CAPTCHA Security Reportedly Broken", http://www.informationweek.com/news/internet/webdev/showArticle.jhtml?articleID=205900620.
[7] "Microsoft Live Hotmail CAPTCHA-Hacked in 6 seconds", http://news.softpedia.com/news/Microsoft-Live-Hotmail-CAPTCHA-Hacked-In-6-Seconds-83341.shtml.
[8] Yan, J. and Salah El Ahmad, A., A Low-cost Attack on a Microsoft CAPTCHA, In CCS'08: Proceedings of the 15th ACM Conference on Computer and Communications Security, Alexandria, Virginia, USA, October 27-31, 2008.
[9] K Chellapilla, K Larson, P Simard, M Czerwinski, "Computers beat humans at single character recognition in reading-based Human Interaction Proofs", 2nd Conference on Email and Anti-Spam (CEAS), 2005.
[10] Y. Rui and Z. Liu. ARTIFACIAL: Automated reverse turing test using facial features. Technical Report MSRTR-2003-48, Microsoft, April 2003.
[11] R. Datta, J. Li, and J. Z. Wang. IMAGINATION: a robust image-based CAPTCHA generation system. Proc. of 13th ACM Int. Conf. on Multimedia (MULTIMEDIA 05), pp. 331–334, November 2005.
[12] J. Elson, J. R. Douceur, J. Howell, and J. Saul. ASIRRA: a CAPTCHA that exploits interest-aligned manual image categorization. Proc. of 14th ACM Conf. on Computer and Communications Security (CCS 2007), pp. 366–374, October – November 2007.
[13] Gavrila D M, The visual analysis of human movement: a survey, Computer Vision and Image Understanding, 1999, vol.73, p82-89.
[14] Jia Deyun, Computer Vision, Beijing: Science publisher, 2000, p26-235.
[15] Zhou Xihan, Liu Bo and Zhou HeQin, A Motion Detection Algorithm Based on Background Subtraction and Symmetrical Differencing, Computer Simulation, 2005, vol.22, p117-119.
[16] McKenna S, Jabri Z, Duric Z, Tracking groups of people, Computer Vision and Image Understanding, 2000, vol.80, p42-56.
[17] Tian Juan and Zheng Yuzheng, Application of template matching technique in image recognition, Transducer and Microsystem Technologies, 2008, vol.27, p112-113.
[18] Xu Bo, Li Zhengming, A New Algorithm of Correlation Tracking Based on Adaptive Template, Optics & Optoelectronic Technology, 2004, vol.2, p62-64.
[19] Barron J, Fleet D, Beauchemin S, Performance of optical flow techniques, International Journal of Computer Vision, 1994, vol.12, p43-77.
[20] Qin Xianxiang and Chen Hua, Amelioration of Template Matching Arithmetic for Moving Target Recognition and Tracking, Journal of Guangxi Academy of Sciences, 2008, vol.24, p293-295.
[21] Wang Jianping, Liu Wei and Wang Jinling, A Moving Object Detection and Recognition Method in Video Sequences, Computing Technology and Automation, 2007, vol.26, p78-80.
