Optical Character Recognition
OCR stands for Optical Character Recognition. It is a technology that converts images of typed, handwritten, or printed text into machine-readable text data. OCR is a common method of digitizing printed texts so that they can be edited, searched, stored compactly, and used in machine processes such as machine translation, text-to-speech, key data extraction, and text mining. The technology became popular in the early 1990s during attempts to digitize newspapers. Early versions were slow and had to be trained with images of each character. The technology has evolved since then, and modern systems can convert clean printed text with very high accuracy.
Advanced OCR Tools
Some advanced OCR tools can automate complex document-based workflows. Some can produce formatted output that closely approximates the original document, including its columns, images, and other non-textual components. There are several types of recognition technology: optical character recognition, optical word recognition, intelligent character recognition, and intelligent word recognition. Optical character recognition converts typewritten text one character or glyph at a time, while optical word recognition converts typewritten text one word at a time.
Intelligent character recognition is used to convert handwritten print script or cursive text into machine-readable text data. It converts the text one character at a time and usually involves machine learning. Intelligent word recognition converts handwritten cursive or print script text one word at a time and is used for languages where the glyphs are not divided in cursive script.
Uses of OCR
OCR has been developed into many domain-specific applications. It is used for data entry from business documents such as cheques, receipts, bank statements, passports, and invoices. It is used in airports for passport recognition and information extraction. OCR technology is also used for automatic number plate recognition and traffic sign recognition, and in assistive technology for blind and visually impaired users. Pen computing also uses OCR, converting handwriting in real time to control a computer. In book scanning, it is used to extract text from printed documents more quickly and to make scanned documents searchable by converting them into formats such as searchable PDFs. An OCR system typically combines several techniques: pre-processing, text recognition, post-processing, and application-specific optimizations.
In pre-processing, the software cleans up the image to increase the chances of successful recognition. Pre-processing techniques include de-skewing, despeckling, binarization, zoning, line removal, script recognition, line and word detection, and character isolation. In text recognition, there are two basic types of core OCR algorithm: matrix matching and feature extraction. In post-processing, accuracy is increased by constraining the output to a lexicon, i.e., a list of words that are allowed to appear in a document.
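To make two of these steps concrete, the following is a minimal sketch of binarization followed by matrix matching, in which an isolated glyph is compared pixel by pixel against stored templates. The 3x3 templates, the fixed threshold of 128, and the function names binarize and match_glyph are assumptions made for illustration; production systems use much larger bitmaps, adaptive thresholds, and more robust similarity measures.

```python
# Hypothetical 3x3 glyph templates (1 = ink, 0 = background).
# Real systems store larger bitmaps, often one set per font.
TEMPLATES = {
    "I": ((0, 1, 0),
          (0, 1, 0),
          (0, 1, 0)),
    "L": ((1, 0, 0),
          (1, 0, 0),
          (1, 1, 1)),
    "T": ((1, 1, 1),
          (0, 1, 0),
          (0, 1, 0)),
}


def binarize(gray, threshold=128):
    """Map grayscale pixels (0-255) to 1 for dark ink and 0 for background."""
    return tuple(
        tuple(1 if px < threshold else 0 for px in row) for row in gray
    )


def match_glyph(bitmap, templates=TEMPLATES):
    """Return the label of the template that agrees with the bitmap on the most pixels."""
    def score(template):
        return sum(
            a == b
            for t_row, b_row in zip(template, bitmap)
            for a, b in zip(t_row, b_row)
        )
    return max(templates, key=lambda label: score(templates[label]))


# A noisy grayscale "L": the centre pixel is darker than it should be.
noisy_l = (
    (20, 240, 250),
    (30, 100, 245),
    (10, 25, 40),
)

print(match_glyph(binarize(noisy_l)))  # prints "L"
```

Even with one wrong pixel, the "L" template still agrees with the input on 8 of 9 pixels, so it wins. Matrix matching of this kind works best when the input glyph is cleanly isolated and in a similar font and scale to the stored templates, which is one reason the pre-processing steps matter.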
The OCR software can also use a dictionary and co-occurrence frequencies (how often certain words appear together) to correct errors. OCR technology also has application-specific optimizations. Some OCR providers have tuned their systems to handle specific types of input, such as license plates, screenshots, and ID cards, more efficiently.
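Lexicon-constrained correction can be sketched in a few lines. The example below snaps each recognized token to the closest word in a small allowed list, using the standard library's difflib string similarity. The lexicon, the 0.6 cutoff, and the function name correct_token are assumptions for illustration, and the sketch uses string similarity only; it does not model co-occurrence frequencies.

```python
import difflib

# Hypothetical lexicon for an invoice-processing application.
LEXICON = ["invoice", "total", "amount", "date", "due"]


def correct_token(token, lexicon=LEXICON, cutoff=0.6):
    """Return the closest lexicon word, or the token unchanged if none is close enough."""
    matches = difflib.get_close_matches(token.lower(), lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else token


# Typical OCR confusions: "l" for "i" and "0" for "o".
raw = "lnvoice t0tal am0unt"
print(" ".join(correct_token(t) for t in raw.split()))  # prints "invoice total amount"
```

A tight lexicon suits forms and other documents with a limited vocabulary. It can hurt on text containing proper nouns or other words outside the list, which is why the sketch leaves a token unchanged when no lexicon entry is similar enough.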