
Decaptcha using OCR (optical character recognition) with C#

at: 01/04/2012 4:24:00 AM by: admin (views: 66)

To save the few cents per captcha that an online solving service charges, I ended up spending a whole day on this. I went through a lot of source code and all sorts of comments, and in the end the simplest and most effective approach for my problem is this one:

A ready-made Microsoft component that ships with MS Office 2007: the Microsoft Office Document Imaging 12.0 Type Library.

Its DLL is here: http://badpaybad.info/upload/25/Interop.MODI.dll.zip
A demo is here: http://badpaybad.info/fun/SearchGooglePaging.aspx

The source code is below.

Many people complain that MODI.Document fails when Create is called, or when OCR (Optical character recognition, http://en.wikipedia.org/wiki/Optical_character_recognition) is called. The reason is that the image used to create the Document is not a raw Bitmap: it is a mixed image with an alpha channel, many colors, transparency ... so process it with the following steps :))

1. Upload the image (the original image that the text will be recognized from).
2. Convert the image to black and white only (full color -> gray scale -> black and white).
2.5 (optional) Check whether the image is too large or too small, and resize it a bit if so.
3. Redraw the black-and-white image with pixel format PixelFormat.Format32bppRgb; if you like, crop the image slightly (dropping its border).
4. Use Microsoft's MODI to read the text --> done

    using System.Drawing;
    using System.Drawing.Imaging;

    System.Drawing.Image imgInput=......; // the captcha image to recognize
    var tempBlackWhiteImagePathFile = "c:/temp.bmp";
    var src = new Bitmap(imgInput, imgInput.Width, imgInput.Height);

    #region to gray scale
    for (int x = 0; x < src.Width; x++)
    {
        for (int y = 0; y < src.Height; y++)
        {
            Color oc = src.GetPixel(x, y);
            int grayScale = (int)((oc.R * 0.3) + (oc.G * 0.59) + (oc.B * 0.11)); // calculate gray scale
            Color nc = Color.FromArgb(255, grayScale, grayScale, grayScale); // remove alpha channel
            src.SetPixel(x, y, nc);
        }
    }
    #endregion

    #region redraw
    var bmp = new Bitmap(src.Width, src.Height, PixelFormat.Format32bppRgb);
    using (var g = Graphics.FromImage(bmp))
    {
        g.FillRectangle(Brushes.White, 0, 0, bmp.Width, bmp.Height);
        // Copy the source minus a 5-pixel border, stretched to fill the whole target.
        g.DrawImage(src, new Rectangle(0, 0, bmp.Width, bmp.Height),
            new Rectangle(5, 5, bmp.Width - 10, bmp.Height - 10), GraphicsUnit.Pixel);
    }
    bmp.Save(tempBlackWhiteImagePathFile, ImageFormat.Bmp);
    #endregion

    #region recognize text
    var doc = new MODI.Document();
    doc.Create(tempBlackWhiteImagePathFile);
    doc.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, true, true);
    var img = (MODI.Image)doc.Images[0];
    var layout = img.Layout;
    var text = layout.Text; // the recognized text
    doc.Close(false);
    #endregion
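
Step 2 goes all the way from gray scale to black and white, while the gray scale region above stops at gray. Below is a minimal sketch of that last step, not part of the original code. The Binarizer class, the ToBlackOrWhite helper and the fixed threshold of 128 are all assumptions made for illustration; a real captcha may need a different cut-off.

```csharp
using System;

public static class Binarizer
{
    // Same luminance weights as the gray scale loop in the post.
    public static int ToGray(int r, int g, int b)
    {
        return (int)((r * 0.3) + (g * 0.59) + (b * 0.11));
    }

    // Hypothetical helper: maps a gray level (0-255) to 0 (black) or 255 (white).
    // The default threshold of 128 is an assumption; tune it per captcha style.
    public static int ToBlackOrWhite(int gray, int threshold = 128)
    {
        return gray < threshold ? 0 : 255;
    }
}
```

Inside the gray scale loop, the value bw = Binarizer.ToBlackOrWhite(grayScale) could be passed to Color.FromArgb(255, bw, bw, bw) in place of grayScale, so that each pixel is written as pure black or pure white before the redraw step.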
