Early systems required "training" (essentially, the provision of known samples of each character) to read a specific font. "Intelligent" systems that can recognize most fonts with a high degree of accuracy are now common. Some systems can even identify columns and non-textual images correctly and produce output in which the text and scanned images are placed much as they were on the original page.
The United States Postal Service has been using OCR machines to pre-sort mail since 1965. Mail sorting plays only a small role in OCR research, because the OCR system need only read the ZIP code (postal code) on each envelope. After the ZIP code has been read, a barcode carrying the same information is printed on the envelope. Envelopes marked with the machine-readable barcode can then be processed more quickly, because machine-readable codes are faster to decode than human-readable letters and numbers.
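The read-then-barcode step can be illustrated with a small sketch. The text above does not say which barcode is printed; the example assumes POSTNET, a USPS symbology in which each digit becomes five bars (two tall, three short), followed by a check digit and framed by one tall bar at each end. The function names are hypothetical and chosen only for this illustration.

```python
# Illustrative sketch only: assumes the POSTNET symbology, which the text does not name.
# Each digit maps to five bars; "1" is a tall bar, "0" a short bar (weights 7, 4, 2, 1, 0).
POSTNET_PATTERNS = {
    "0": "11000", "1": "00011", "2": "00101", "3": "00110", "4": "01001",
    "5": "01010", "6": "01100", "7": "10001", "8": "10010", "9": "10100",
}
POSTNET_DIGITS = {bars: digit for digit, bars in POSTNET_PATTERNS.items()}


def check_digit(digits: str) -> int:
    """Return the digit that makes the sum of all digits a multiple of 10."""
    return (10 - sum(int(d) for d in digits) % 10) % 10


def encode_postnet(zip_code: str) -> str:
    """Turn a ZIP code that the OCR stage has read into a string of bars."""
    digits = zip_code.replace("-", "")
    if not (digits.isascii() and digits.isdigit()) or len(digits) not in (5, 9, 11):
        raise ValueError("expected 5, 9 or 11 digits")
    payload = digits + str(check_digit(digits))
    # One tall framing bar on each side of the encoded digits.
    return "1" + "".join(POSTNET_PATTERNS[d] for d in payload) + "1"


def decode_postnet(bars: str) -> str:
    """Recover the ZIP code from a bar string using table lookups and one checksum."""
    if len(bars) < 12 or bars[0] != "1" or bars[-1] != "1" or (len(bars) - 2) % 5:
        raise ValueError("bad framing or length")
    body = bars[1:-1]
    try:
        payload = "".join(POSTNET_DIGITS[body[i:i + 5]] for i in range(0, len(body), 5))
    except KeyError:
        raise ValueError("invalid bar group") from None
    if sum(int(d) for d in payload) % 10 != 0:
        raise ValueError("checksum mismatch")
    return payload[:-1]  # drop the check digit
```

Decoding here is a handful of table lookups plus one checksum, with no character recognition involved, which is consistent with the point that machine-readable codes are faster to process than printed letters and numbers.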
Whilst the accurate recognition of European typewritten text is now considered largely a solved problem, recognition of handwriting in general, and of printed text in some other scripts (particularly those with a very large number of characters), is still the subject of research.