Commercial form-reading systems that extract data from forms do not meet acceptable accuracy requirements on forms filled out by hand. In December 1993, NIST invited industry and research organizations working on handwriting recognition to participate in a test to determine the state of the art in the field. A database of form images containing actual responses received by the US Census Bureau was provided. The handwritten responses are very loosely constrained in writing style, response format, and choice of text. The lexicons provided are very large (about 50,000 entries), yet their coverage is incomplete (about 70%). In this paper we discuss the approach taken by CEDAR to automate the task of reading the census forms and describe the subtasks of field extraction and phrase recognition.