Optical Character Recognition (OCR) System

An Optical Character Recognition (OCR) System is a pattern recognition system that implements an OCR algorithm to solve an OCR task (converting images of typed, handwritten, or printed characters into machine-encoded text).

AKA: Image-to-Text Conversion System.
Context:
- It can be integrated into an Image Processing System.
- It can range from being a Rule-based OCR System to being an AI-powered OCR System.
Example(s):
- ABBYY FineReader System (for Windows OS).
- InftyReader System (for Windows OS).
- Tesseract Software System.
- OnlineOCR.net System (https://www.onlineocr.net/).
- Rossum AI (https://rossum.ai/).
- …
Counter-Example(s):
- a Speech Transcription System.
- a Text-to-Handwriting System.
See: OCR Algorithm, PDF-to-Text System, Computer Vision, Electronics, Machine, Image, Data Entry, Bank Statement, Cognitive Computing, Machine Translation, Text-to-Speech, Text Mining, Artificial Intelligence.

References

2020a

(Wikipedia, 2020) ⇒ https://en.wikipedia.org/wiki/Optical_character_recognition Retrieved: 2020-02-17.
- Optical character recognition or optical character reader (OCR) is the electronic or mechanical conversion of images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scene-photo (for example the text on signs and billboards in a landscape photo) or from subtitle text superimposed on an image (for example from a television broadcast). Widely used as a form of data entry from printed paper data records – whether passport documents, invoices, bank statements, computerized receipts, business cards, mail, printouts of static-data, or any suitable documentation – it is a common method of digitizing printed texts so that they can be electronically edited, searched, stored more compactly, displayed on-line, and used in machine processes such as cognitive computing, machine translation, (extracted) text-to-speech, key data and text mining. OCR is a field of research in pattern recognition, artificial intelligence and computer vision. Early versions needed to be trained with images of each character, and worked on one font at a time. Advanced systems capable of producing a high degree of recognition accuracy for most fonts are now common, and with support for a variety of digital image file format inputs. Some systems are capable of reproducing formatted output that closely approximates the original page including images, columns, and other non-textual components.
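
The remark above that early systems were trained with images of each character and worked on one font at a time can be illustrated with a toy template matcher. This is a minimal sketch, not how any system listed on this page works: the 3x5 bitmap font, the four-letter alphabet, the fixed-width segmentation, and every function name (`hamming`, `recognize_glyph`, `split_glyphs`, `recognize_line`, `render`) are assumptions made up for the illustration.

```python
# Toy single-font OCR by template matching.
# ASSUMPTION: a hypothetical 3x5 bitmap font; '#' is ink, '.' is background.
TEMPLATES = {
    "I": ("###", ".#.", ".#.", ".#.", "###"),
    "L": ("#..", "#..", "#..", "#..", "###"),
    "O": ("###", "#.#", "#.#", "#.#", "###"),
    "T": ("###", ".#.", ".#.", ".#.", ".#."),
}


def hamming(a, b):
    """Count the pixels that differ between two equally sized bitmaps."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def recognize_glyph(glyph):
    """Return the template character with the fewest mismatched pixels."""
    return min(TEMPLATES, key=lambda ch: hamming(TEMPLATES[ch], glyph))


def split_glyphs(rows, width=3, gap=1):
    """Cut a one-line image into fixed-width glyph cells (no real layout analysis)."""
    step = width + gap
    return [
        tuple(row[i:i + width] for row in rows)
        for i in range(0, len(rows[0]), step)
    ]


def recognize_line(rows):
    """Convert a one-line bitmap image into a string of characters."""
    return "".join(recognize_glyph(g) for g in split_glyphs(rows))


def render(text, gap=1):
    """Demo helper: draw text using the templates, separated by blank columns."""
    sep = "." * gap
    return [sep.join(TEMPLATES[ch][r] for ch in text) for r in range(5)]


if __name__ == "__main__":
    image = render("TOIL")
    print("\n".join(image))
    print(recognize_line(image))
```

Because each glyph is compared pixel by pixel against stored templates, the sketch only recognizes the one font it was given, which is the limitation the passage attributes to early versions. Handling arbitrary fonts, skew, or noise would need the feature extraction or learned models used by more advanced systems.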

2020b

(OnlineOCR.net, 2020) ⇒ https://www.onlineocr.net/service/about Retrieved: 2020-03-01.
- QUOTE: OnlineOCR.net is a free web-based Optical Character Recognition software (OCR) that allows you to convert scanned PDF documents (including multipage files), faxes, photographs or digital camera captured images into editable and searchable electronic documents including Adobe PDF, Microsoft Word, Microsoft Excel, Rtf, Html and Text.

2016

(Wikipedia, 2020) ⇒ https://en.wikipedia.org/wiki/Optical_character_recognition#OCR_Tools
- Examples of OCR tools are provided by Google,[1] ABBYY, Adobe Acrobat, and ScanSnap. These software can scan an image and extract words from the document. For any project related to paperless offices, the use of OCR tools will be required to achieve the objectives of paperless offices and homes.

[1] Schaeffer, Jaron (June 22, 2010). "Google Drive Blog: Optical character recognition (OCR) in Google Docs". drive.googleblog.com. https://drive.googleblog.com/2010/06/optical-character-recognition-ocr-in.html. Retrieved April 11, 2016.
