Reading handwritten US census forms

Authors:
S. Madhvanath; V. Govindaraju; V. Ramanaprasad; D. S. Lee; S. N. Srihari
Venue:
ICDAR '95 Proceedings of the Third International Conference on Document Analysis and Recognition (Volume 1)
Year:
1995

Abstract

Commercial forms-reading systems do not meet acceptable accuracy requirements when extracting data from forms filled out by hand. In December 1993, NIST invited industry and research organizations working in handwriting recognition to participate in a test to determine the state of the art. A database of form images containing actual responses received by the US Census Bureau was provided. The handwritten responses are very loosely constrained in writing style, response format, and choice of text. The lexicons provided are very large (about 50,000 entries), yet their coverage is incomplete (about 70%). In this paper we discuss the approach taken by CEDAR to automate the task of reading the census forms, and describe the subtasks of field extraction and phrase recognition.
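
To make the lexicon problem concrete, the following is a minimal sketch of what incomplete coverage means for lexicon-driven phrase recognition. It is not CEDAR's method: the function names `lexicon_coverage` and `match_phrase` are hypothetical, the lexicon entries are toy data invented for illustration, and plain string similarity from Python's standard `difflib` module stands in for a handwriting recognizer. The point it illustrates is that when a response is absent from the lexicon, the matcher should reject it rather than force the nearest entry.

```python
from difflib import get_close_matches


def lexicon_coverage(truths, lexicon):
    """Fraction of ground-truth responses that appear in the lexicon."""
    if not truths:
        return 0.0
    entries = {entry.lower() for entry in lexicon}
    hits = sum(1 for truth in truths if truth.lower() in entries)
    return hits / len(truths)


def match_phrase(raw, lexicon, cutoff=0.8):
    """Return the closest lexicon entry, or None to reject the response.

    `raw` stands in for a noisy recognizer output (an assumption);
    `cutoff` is an arbitrary similarity threshold chosen for this sketch.
    """
    lowered = [entry.lower() for entry in lexicon]
    candidates = get_close_matches(raw.lower(), lowered, n=1, cutoff=cutoff)
    return candidates[0] if candidates else None


# Toy data, invented for illustration only.
lexicon = ["teacher", "truck driver", "registered nurse"]
truths = ["teacher", "truck driver", "beekeeper", "registered nurse"]

coverage = lexicon_coverage(truths, lexicon)   # 3 of 4 truths are in the lexicon
best = match_phrase("truck drivr", lexicon)    # near miss resolves to an entry
rejected = match_phrase("beekeeper", lexicon)  # out-of-lexicon response is rejected
```

With coverage near 70%, roughly three responses in ten cannot be resolved by lexicon lookup alone, so a rejection path of this kind matters as much as the matching itself.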