Decoder Banks: Versatility, Automation, and High Accuracy without Supervised Training

Authors:
Prateek Sarkar;Henry S. Baird
Affiliations:
Palo Alto Research Center, CA;Palo Alto Research Center, CA
Venue:
ICPR '04: Proceedings of the 17th International Conference on Pattern Recognition (ICPR'04), Volume 2
Year:
2004

Abstract

A methodology using decoder banks is proposed for high-accuracy, fully automatic recognition of machine-printed text across a wide range of challenging image qualities, without requiring manual intervention or supervised training. This approach is made possible by two crucial properties of document image decoding (DID) technology: (1) it is trainable for high accuracy across a wide range of explicitly parameterized image degradations; and (2) decoders for arbitrary parameter settings can be generated automatically. We report the results of large-scale experiments on synthetic images which demonstrate that, when many pretrained decoders are applied in parallel to an input image with unknown parameters, the decoder that yields the highest accuracy is often the one that exhibits the highest DID posterior 'Viterbi score'. When implemented naively, in a brute-force manner, decoder banks are computationally intensive, but we suggest ways in which this cost may be reduced with no loss of versatility, automation, or accuracy.
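
The brute-force selection rule described in the abstract (run every decoder in the bank, keep the output of the one with the highest score) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `Decoder` class, `make_decoder`, `run_decoder_bank`, the three-pixel `TEMPLATES`, and the single `contrast` degradation parameter are hypothetical, and a negative squared error stands in for the DID posterior 'Viterbi score'.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

Image = Sequence[float]  # toy "image": a short row of pixel intensities


@dataclass(frozen=True)
class Decoder:
    """Hypothetical stand-in for one pretrained decoder in the bank."""
    name: str
    decode: Callable[[Image], Tuple[str, float]]  # image -> (transcript, score)


# Toy character templates (assumption; real DID uses trained character models).
TEMPLATES: Dict[str, List[float]] = {"A": [1.0, 0.0, 1.0], "B": [0.0, 1.0, 0.0]}


def make_decoder(contrast: float) -> Decoder:
    """Build a toy decoder for one assumed degradation setting (contrast)."""
    def decode(image: Image) -> Tuple[str, float]:
        def score(label: str) -> float:
            # Negative squared error between the degraded template and the image;
            # this is only a placeholder for a real decoder's Viterbi score.
            return -sum((contrast * t - p) ** 2
                        for t, p in zip(TEMPLATES[label], image))
        best = max(TEMPLATES, key=score)
        return best, score(best)
    return Decoder(name=f"contrast={contrast}", decode=decode)


def run_decoder_bank(image: Image, bank: Sequence[Decoder]) -> Tuple[str, str, float]:
    """Apply every decoder (brute force) and keep the highest-scoring result."""
    results = [(d.name, *d.decode(image)) for d in bank]
    return max(results, key=lambda r: r[2])


# A bank covering three assumed contrast settings; the input's true setting
# is treated as unknown and is recovered by the score comparison.
bank = [make_decoder(c) for c in (0.25, 0.5, 1.0)]
winner, transcript, score = run_decoder_bank([0.5, 0.0, 0.5], bank)
print(winner, transcript, score)
```

In this toy case the decoder whose assumed contrast matches the input wins the comparison. The cost grows linearly with the size of the bank, which is the brute-force expense the abstract mentions.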