Jay M Patel
170010111012

RASPBERRY PI

Raspberry Pi is the name of a series of single-board computers made by the Raspberry Pi Foundation. The first Raspberry Pi was launched in 2012, and several iterations and variations have been released since then. The original Pi had a single-core 700MHz CPU and just 256MB of RAM; the latest model has a quad-core CPU clocked at over 1.5GHz and 4GB of RAM.

Raspberry Pi is a very cheap computer that runs Linux. It also provides a set of GPIO (general-purpose input/output) pins, which let you control electronic components for physical computing and explore the Internet of Things (IoT).

RASPBERRY PI 3 LAYOUT

[Figure: Raspberry Pi 3 board layout]

To work with a Raspberry Pi, first install Raspberry Pi OS (formerly Raspbian) on an SD card of at least 8 GB. After installation, insert the SD card into the Raspberry Pi. Then provide a power supply through a Micro USB power cable and connect a USB keyboard, a mouse, and a monitor. The Raspberry Pi is then ready to use.

OPTICAL CHARACTER RECOGNITION

WHAT IS OCR?

OCR stands for optical character recognition. OCR technology deals with the problem of recognizing all kinds of characters: both handwritten and printed characters can be recognized and converted into a machine-readable digital format. Consider any serial number or code made up of numbers and letters that needs to be digitized; OCR transforms such codes into digital output. Put simply, an image is captured and processed, the characters are extracted, and they are then recognized.

OCR TYPES

Optical character recognition (OCR) – targets typewritten text, one glyph or character at a time.
Optical word recognition – targets typewritten text, one word at a time.
Intelligent character recognition (ICR) – also targets handwritten print script or cursive text, one glyph or character at a time, usually involving machine learning.
Intelligent word recognition (IWR) – also targets handwritten print script or cursive text, one word at a time. This is especially useful for languages where glyphs are not separated in cursive script.

OCR

OCR runs in three steps:
1) Pre-processing
2) Text recognition
3) Post-processing

PRE-PROCESSING IN OCR

OCR software often pre-processes images to improve the chances of successful recognition. The aim of pre-processing is to improve the image data itself: unwanted distortions are suppressed and the image features that matter for recognition are enhanced. Several pre-processing techniques are often used in conjunction, including:

1. De-skew
If the document was not aligned properly when scanned, it may need to be rotated a few degrees clockwise or counterclockwise to make the lines of text perfectly horizontal or vertical.

2. Despeckle
Removes positive and negative spots and smooths edges.

3. Binarization
Converts an image from color or greyscale to black-and-white. Binarization is a simple way of separating the text from the background, and its effectiveness significantly influences the quality of the character recognition stage.

4. Line removal
Cleans up non-glyph boxes and lines.

5. Layout analysis or "zoning"
Identifies columns, paragraphs, captions, etc. as distinct blocks. Particularly useful in multi-column layouts and tables.

TEXT RECOGNITION IN OCR

There are two basic types of core OCR algorithm:

1. Matrix matching
Matrix matching compares an image to a stored glyph on a pixel-by-pixel basis; it is also known as "pattern matching". It relies on the input glyph being correctly isolated from the rest of the image, and on the stored glyph being in a similar font and at the same scale.
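As a rough illustration of binarization followed by matrix matching, the Python sketch below thresholds a tiny greyscale grid and compares it pixel by pixel with stored glyphs. The 5x3 templates, the threshold of 128, and the function names are all assumptions made for this example; real OCR engines work on much larger images and template sets.

```python
# Toy illustration only: the glyph shapes, names and threshold are assumptions.

# Stored 5x3 binary templates (1 = ink, 0 = background); a hypothetical glyph set.
TEMPLATES = {
    "I": ["111",
          "010",
          "010",
          "010",
          "111"],
    "L": ["100",
          "100",
          "100",
          "100",
          "111"],
    "T": ["111",
          "010",
          "010",
          "010",
          "010"],
}


def binarize(gray, threshold=128):
    """Convert a greyscale grid (0 = black, 255 = white) to 1 = ink, 0 = background."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in gray]


def match_score(glyph, template):
    """Count the pixels on which the binarized glyph and a stored template agree."""
    return sum(
        1
        for glyph_row, template_row in zip(glyph, template)
        for pixel, expected in zip(glyph_row, template_row)
        if pixel == int(expected)
    )


def recognize(gray):
    """Return the template character with the highest pixel-by-pixel agreement."""
    glyph = binarize(gray)
    return max(TEMPLATES, key=lambda ch: match_score(glyph, TEMPLATES[ch]))


# A noisy scan of "T": dark ink is near 0, light paper is near 255.
scan = [
    [20, 35, 10],
    [240, 50, 230],
    [250, 40, 200],
    [235, 30, 245],
    [220, 60, 140],  # 140 is grey noise, still above the threshold
]
print(recognize(scan))  # prints "T"
```

The sketch assumes the glyph has already been isolated and scaled to the template size, which is exactly the precondition stated above; a glyph in a different font or at a different scale would score poorly against every template.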
Matrix matching works best with typewritten text.

2. Feature extraction
Feature extraction decomposes glyphs into "features" such as lines, closed loops, line direction, and line intersections. Extracting features reduces the dimensionality of the representation and makes the recognition process computationally efficient. The features are compared with an abstract vector-like representation of a character. General techniques of feature detection in computer vision are applicable to this type of OCR, which is commonly seen in "intelligent" handwriting recognition and in most modern OCR software.

POST-PROCESSING IN OCR

OCR accuracy can be increased if the output is constrained by a lexicon, that is, a list of words that are allowed to occur in a document. This might be, for example, all the words in the English language, or a more technical lexicon for a specific field.

To handle different types of input better, some providers have developed specialized OCR systems. These systems are tuned to particular kinds of images and combine various optimization techniques to improve recognition accuracy further.

USE CASES OF OCR

OCR engines have been developed into a range of domain-specific applications, such as receipt OCR, invoice OCR, cheque OCR, and legal document OCR.

1. Data entry for business documents, e.g. cheques, passports, invoices, bank statements, and receipts
2. Automatic recognition of license plates
3. Passport recognition and information extraction in airports
4. Extracting business card information into a contact list
5. Making digital versions of large printed documents, e.g. book scanning
6. Converting handwriting in real time to control a computer (pen computing)

OCR USE IN BANKING

The banking industry is a significant consumer of OCR, along with other financial sectors such as insurance and securities. The most common use of OCR in banking is cheque processing:

1. The handwritten cheque is scanned
2. Its details are transformed into digital text
3. The signature is validated
4. The cheque is cleared in real time

All of this happens without human involvement. Although printed cheques can be read with almost 100% accuracy (only signature verification requires matching against a pre-existing database), full automation for handwritten cheques remains a long way off.

THANK YOU
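As a supplementary illustration of the lexicon idea from the POST-PROCESSING IN OCR section, the Python sketch below snaps each recognized word to the closest entry in a small word list. The lexicon, the similarity cutoff of 0.6, and the function names are assumptions for this example; it uses the standard-library `difflib.get_close_matches`, which is only one of several possible ways to do this.

```python
import difflib

# Hypothetical lexicon for a banking document; a real one would be far larger.
LEXICON = ["account", "amount", "bank", "cheque", "date", "pay", "signature"]


def correct_word(word, lexicon=LEXICON, cutoff=0.6):
    """Return the closest lexicon entry, or the word unchanged if none is close enough."""
    if word.lower() in lexicon:
        return word
    matches = difflib.get_close_matches(word.lower(), lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else word


def correct_text(text):
    """Apply lexicon correction to every whitespace-separated word."""
    return " ".join(correct_word(word) for word in text.split())


# Typical OCR confusions: "m" read as "rn", "q" read as "g".
raw = "pay arnount by chegue"
print(correct_text(raw))  # prints "pay amount by cheque"
```

Words with no sufficiently similar lexicon entry (such as "by" here) are left untouched, so a narrow lexicon does not overwrite text it does not cover.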