Welcome to Scribd!

(Fiot, Paucher) The Captchacker Project - Presentation

Uploaded by

0% found this document useful (0 votes)

60 views26 pages

Support Vector Machine (svm) has potential to break captchas. Simulation based method: Other easily segmentable captcha. Simulation-based method: easily segmentable non-segmented captcha's. Demonstration using home-made c++ prog based on openCV.

Original Description:

Original Title

[Fiot, Paucher] the Captchacker Project - Presentation

Copyright

Available Formats

PDF, TXT or read online from Scribd

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Report this Document

Copyright:

Attribution Non-Commercial (BY-NC)

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

0% found this document useful (0 votes)

60 views26 pages

(Fiot, Paucher) The Captchacker Project - Presentation

Uploaded by

UdonCrazy Udonthani

Copyright:

Attribution Non-Commercial (BY-NC)

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

Jump to Page

You are on page 1of 26

Search inside document

The Captchacker Project

http://code.google.com/p/captchacker

Jean-Baptiste Fiot & Rmi Paucher

You said captcha???

C.A.P.T.C.H.A = Completely Automated Program To Tell Computers and Humans Apart

Motivations

Do automatically these delightful activities:

Spam via email address creation Spam comment on blogs Website registration Bias online polls Brute force attack your sista or gf's MSN account

State of the Art

Geometric Detections Neural networks

Convolutional Neural Networks (Y. LeCun) Deep Belief Networks (G. E. Hinton)

Support Vector Machine

Our study

SVM's potential to break captchas Easily segmentable captchas Non-easily segmentable captchas

A word on SVM 1/2

Separate data via a hyperplane maximizing the margin When the data are not linearly separable

Higher dimension space AND/OR Allow outliers

A word on SVM 2/2

Choosing the error cost C

C too low => many outliers C too high => overfitting Demo

Easily segmentable captchas

Captchas from Egoshare.com Using home-made C++ prog based on OpenCV

Convert to gray-scale Threshold intensities Largest connected components (using best result using 4/8 connexity)

How to learn features?

Font based or simulation based method

Captcha based method

Simulation-based method
Font files

...................

SVM Model

Captcha based method 1/2

Initial Labelled Captcha Set (manually or using font-based models)

Preprocessed digits

Temporary SVM Model

Captcha based method 2/2

Automatically Labelled Captchas (using last computed model)

Enlarged Training Set (wrongly labelled captchas are manually corrected)

Improved SVM Model

Satisfying results ? YES

Final SVM Model

How to choose SVM Parameters?

Kernels

Radius-Based Function (RBF) Polynomial kernel Linear kernel Sigmoid kernel

Outlier penalization (error cost C) K-fold cross-validation

So u hack 'hem all or what???

Check this demo out!

Performance 1/4

Simulation based method:

Performance 2/4

Performance 3/4

Captcha-based method:

Performance 4/4

Other easily segmentable captchas

Nice results with a specific model

Portable method :)

Non-easily segmentable captchas

Automatic segmentation

We would like our algo to detect this:

What weve got

Preprocessed captcha Classifier

Prediction on a subwindow of any width Score telling how sure the prediction is
0 w

Good score Medium score Low score

i4 i3 i5 i6

i12 i9 i10 i11

i13 i14

Formalization
Optimization of sum (or product) of scores over all letters:

Resolution
Dynamic programming For each abscissa

Store the best paths of length 1-6 to go from 0 to the point

For each new segment [Ai, Bi]

Consider the concatenation of this segment with the optimal paths from 0 to Ai

The best path is the best 6-long path at point w ! Very fast (O(n.log(n)) complexity)

Results

Not so good

Very difficult to build a good model with many classes (high score clear character) Model has to be built with similar data than in the captchas Some characters are not well recognized Slow (score computation on each subwindow)

Conclusion

Fun project, with nice applications Special thanks to Iasonas Kokkinos http://code.google.com/p/captchacker

PRINCE2 Exam Answers
Document5 pages
PRINCE2 Exam Answers
bwsubbu
No ratings yet
Free Itil Foundation Exam Questions
Document13 pages
Free Itil Foundation Exam Questions
Antonio Vento
No ratings yet
USACO Training
Document12 pages
USACO Training
if05041736
No ratings yet
Linux Interactive Exploit Development With GDB and PEDA Slides
Document42 pages
Linux Interactive Exploit Development With GDB and PEDA Slides
vandilodonu
No ratings yet
Using GCC Auto-Vectorizer
Document15 pages
Using GCC Auto-Vectorizer
mono_neurone
No ratings yet
PYTHON Report
Document15 pages
PYTHON Report
irfan ali
No ratings yet
Liang-Barsky line clipping algorithm visualization
Document34 pages
Liang-Barsky line clipping algorithm visualization
naquash1983
No ratings yet
Code Simplicity: 7 Principles of Open Source Software Design
Document27 pages
Code Simplicity: 7 Principles of Open Source Software Design
Vaibhav Mule
No ratings yet
Urgent opening for C++/VC++ with C# Developers for Gurgaon with 3-6 years experience
Document13 pages
Urgent opening for C++/VC++ with C# Developers for Gurgaon with 3-6 years experience
Rahulsinghoooo
No ratings yet
Fortran 11.1 Et Abaqus 6.11
Document3 pages
Fortran 11.1 Et Abaqus 6.11
Mourad Taj
No ratings yet
ASIC DESIGN Verification
Document23 pages
ASIC DESIGN Verification
kiranvlsi
No ratings yet
Hacker2 Introduction Guide for Scanning, Cracking and Hacking Networks
Document7 pages
Hacker2 Introduction Guide for Scanning, Cracking and Hacking Networks
Слободан Урдовски
No ratings yet
Online Hotel Reservation System MSE Presentation
Document52 pages
Online Hotel Reservation System MSE Presentation
Sherry Ignacio
No ratings yet
Department of Compuetr Science and Engineering: Lab Manual Information Security
Document89 pages
Department of Compuetr Science and Engineering: Lab Manual Information Security
dkishore
No ratings yet
C PUZZLES, Some Interesting C Problems
Document17 pages
C PUZZLES, Some Interesting C Problems
Ajay Mehra
No ratings yet
WebRTC DevTools tips and tricks for performance minded developers
Document42 pages
WebRTC DevTools tips and tricks for performance minded developers
damarsan2
100% (1)
Sibin K Mathew
Document3 pages
Sibin K Mathew
sibinmathew
No ratings yet
Systemd Evolution Revolution Regression
Document18 pages
Systemd Evolution Revolution Regression
Susant Sahani
No ratings yet
The Daala Video Codec: Research Update
Document57 pages
The Daala Video Codec: Research Update
investtcartier
No ratings yet
Biblio Ann
Document8 pages
Biblio Ann
Hicham Hallouâ
No ratings yet
Customers Setup2
Document32 pages
Customers Setup2
Senthil Kumar
No ratings yet
Pratice Question
Document2 pages
Pratice Question
Kishlay Kumar singh
No ratings yet
Compiler Design
Document234 pages
Compiler Design
Shaaahid
No ratings yet
Pyfirmata PDF
Document9 pages
Pyfirmata PDF
Johanna Riascos Rivera
No ratings yet
Broadcom Test Questions
Document3 pages
Broadcom Test Questions
Somdyuti Paul
No ratings yet
Software Testing LAB Manual Imp
Document31 pages
Software Testing LAB Manual Imp
MuhammadNadeem
No ratings yet
Tugas Robotika 1. Program Webcam: AHMAD SYARIF (11501244004)
Document24 pages
Tugas Robotika 1. Program Webcam: AHMAD SYARIF (11501244004)
Ahmed Elsyarief
No ratings yet
2 PDF
Document36 pages
2 PDF
Ahmed Fareh El-thani
No ratings yet
C Puzzles: Dear Visitor
Document19 pages
C Puzzles: Dear Visitor
Zishan Cse
No ratings yet
Interesting Facts About Bitwise Operators in C - GeeksforGeeks
Document6 pages
Interesting Facts About Bitwise Operators in C - GeeksforGeeks
Gokul Chinnathurai
No ratings yet
OpenCL language for parallel programming across CPUs and GPUs
Document48 pages
OpenCL language for parallel programming across CPUs and GPUs
Abed Momani
No ratings yet
Ben Cohen - Using SVA For Scoreboarding and Testbench Design
Document4 pages
Ben Cohen - Using SVA For Scoreboarding and Testbench Design
Ishit Makwana
No ratings yet
Create A New File Using Gedit/vi Filename.c: Solutions To Lab Programs With Expected Input and Output
Document41 pages
Create A New File Using Gedit/vi Filename.c: Solutions To Lab Programs With Expected Input and Output
ADITYA GEHLAWAT
No ratings yet
Learn Test Automation & Start Using HP Quicktest Professional (QTP)
Document43 pages
Learn Test Automation & Start Using HP Quicktest Professional (QTP)
atulsinghchouhan
No ratings yet
1 Toolcourseenpptx
Document80 pages
1 Toolcourseenpptx
Feras Advertisements
No ratings yet
Thoughts On MVC and WD4A Observations
Document6 pages
Thoughts On MVC and WD4A Observations
ramakrishna_bojja
No ratings yet
Stepper Motor Interfacing With Microcontrollers Tutorial - Programming Stepper - Rickey's World of Microcontrollers & Microprocessors
Document5 pages
Stepper Motor Interfacing With Microcontrollers Tutorial - Programming Stepper - Rickey's World of Microcontrollers & Microprocessors
beto_m04
100% (1)
273 Partial Final Ans 2010 Fall
Document5 pages
273 Partial Final Ans 2010 Fall
Ji Wook Hwang
No ratings yet
Online Wedding Planning System ASP Net
Document28 pages
Online Wedding Planning System ASP Net
Parul Chawla
20% (5)
ABAP Interview Questions
Document148 pages
ABAP Interview Questions
Yellareddy_08
No ratings yet
Application 123
Document12 pages
Application 123
Hanzon jay Bantayan
No ratings yet
Beginning Python Programming: Kantesh Raj (@kanteshraj)
Document28 pages
Beginning Python Programming: Kantesh Raj (@kanteshraj)
sabar5
No ratings yet
Core Java Part2
Document74 pages
Core Java Part2
martinsdebo
No ratings yet
CrapKiller Manual Version
Document5 pages
CrapKiller Manual Version
boroman99
No ratings yet
Mobile Application Development With QML
Document31 pages
Mobile Application Development With QML
Iliya Anastasov
No ratings yet
C Assignment - Day 1 (14-02-2000)
Document1 page
C Assignment - Day 1 (14-02-2000)
deepti
No ratings yet
Simple+Linear+Regression+with+Jupyter+Notebook+by+Dr.+Alvin+Ang
Document16 pages
Simple+Linear+Regression+with+Jupyter+Notebook+by+Dr.+Alvin+Ang
kevin
No ratings yet
Filtro de Proteccion Contra Rastreo
Document464 pages
Filtro de Proteccion Contra Rastreo
Ezequiel CB
No ratings yet
Redhat Summit Perf Analysis and Tuning Part 2 2013
Document57 pages
Redhat Summit Perf Analysis and Tuning Part 2 2013
smallake
100% (1)
C programming puzzles explained
Document17 pages
C programming puzzles explained
nehakhandu7885
No ratings yet
C++ Class 12 Standard Calculator
Document8 pages
C++ Class 12 Standard Calculator
Nivesh Singh
No ratings yet
Btec HND in Business: Assignment Cover Sheet
Document8 pages
Btec HND in Business: Assignment Cover Sheet
Đỗ Tuấn
No ratings yet
Hacker Introduction
Document4 pages
Hacker Introduction
Слободан Урдовски
No ratings yet
Mulit Timeframe Strategy Tester
Document12 pages
Mulit Timeframe Strategy Tester
iinself
No ratings yet
Agile
Document26 pages
Agile
venkat2009
No ratings yet
08 Comp Sample Xi
Document3 pages
08 Comp Sample Xi
somenath_sengupta
No ratings yet
Oracle LogMiner Viewer and Flashback Transaction
Document21 pages
Oracle LogMiner Viewer and Flashback Transaction
Mohammad Zaheer
No ratings yet
Rajiv Resume
Document4 pages
Rajiv Resume
Sunkadahalli Govindaiah Bhanu Prakash
No ratings yet
Oracle Interfaces
Document9 pages
Oracle Interfaces
suneel725
No ratings yet
C Language Programming Codes
From Everand
C Language Programming Codes
Durgesh
No ratings yet
Computer Engineering Laboratory Solution Primer
From Everand
Computer Engineering Laboratory Solution Primer
Karan Bhandari
No ratings yet
DESIGN ALGORITHMS TO SOLVE COMMON PROBLEMS: Mastering Algorithm Design for Practical Solutions (2024 Guide)
From Everand
DESIGN ALGORITHMS TO SOLVE COMMON PROBLEMS: Mastering Algorithm Design for Practical Solutions (2024 Guide)
ARCHER PAUL
No ratings yet
70 463 Demo
Document7 pages
70 463 Demo
Beyond_Good
No ratings yet
Virtual Networking Concepts
Document12 pages
Virtual Networking Concepts
Shahid Mahmud
100% (4)
15817
Document13 pages
15817
Beyond_Good
No ratings yet
Junior IT Consultant en Pol 07.2013
Document1 page
Junior IT Consultant en Pol 07.2013
Beyond_Good
No ratings yet
Project 1, Remote Procedure Call
Document20 pages
Project 1, Remote Procedure Call
Beyond_Good
No ratings yet
CDC in SQL Server 2012 SSIS - Allan Mitchell
Document18 pages
CDC in SQL Server 2012 SSIS - Allan Mitchell
Beyond_Good
No ratings yet
Exploration Routing Chapter 1 (Bupt)
Document59 pages
Exploration Routing Chapter 1 (Bupt)
Beyond_Good
No ratings yet
CCNA 3 v3.1 Module 7 Spanning Tree Protocol: © 2004, Cisco Systems, Inc. All Rights Reserved
Document22 pages
CCNA 3 v3.1 Module 7 Spanning Tree Protocol: © 2004, Cisco Systems, Inc. All Rights Reserved
Beyond_Good
No ratings yet
ITSM 7 5 Overview - Ext - 2009
Document49 pages
ITSM 7 5 Overview - Ext - 2009
Beyond_Good
No ratings yet
01 Implementing Active Directory Domain Services
Document32 pages
01 Implementing Active Directory Domain Services
Antonio Ricardo Ramírez
No ratings yet
Windows Administrator L1 Interview Question
Document7 pages
Windows Administrator L1 Interview Question
Beyond_Good
No ratings yet
Benefity Atos Eng
Document1 page
Benefity Atos Eng
Beyond_Good
No ratings yet
Columnstore Indexes For Fast DW QP SQL Server 11
Document8 pages
Columnstore Indexes For Fast DW QP SQL Server 11
nachiketai
No ratings yet
Windows Administrator L1 Interview Question
Document7 pages
Windows Administrator L1 Interview Question
Beyond_Good
No ratings yet
UnixCBT Feat. Solaris10 Notes
Document35 pages
UnixCBT Feat. Solaris10 Notes
Beyond_Good
No ratings yet
MS Learning Transcript
Document1 page
MS Learning Transcript
Beyond_Good
No ratings yet