Hybrid Contextural Text Recognition with String Matching

Authors: R. M. K. Sinha; B. Prasada; G. F. Houle; M. Sabourin
Venue: IEEE Transactions on Pattern Analysis and Machine Intelligence
Year: 1993

Abstract

The hybrid contextural algorithm for reading real-life documents printed in varying fonts of any size is presented. Text is recognized progressively in three passes. The first pass generates character hypotheses, the second generates word hypotheses, and the third verifies the word hypotheses. During the first pass, isolated characters are recognized using a dynamic contour warping classifier. Transient statistical information is collected to accelerate the recognition process and to verify hypotheses in later processing. A transient dictionary consisting of high-confidence nondictionary words is also constructed in this pass. During the second pass, word-level hypotheses are generated using hybrid contextual text processing. Nondictionary words are recognized using a modified Viterbi algorithm, a string matching algorithm utilizing n-grams, special handlers for touching characters, a prefix/suffix handler, and pragmatic handlers for numerals, punctuation, hyphens, and apostrophes. This processing usually generates several word hypotheses. During the third pass, word-level verification occurs.
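
The abstract does not give the details of the n-gram string matching step, so the following is only a generic sketch of how n-grams can be used to propose dictionary words for a misrecognized string. It is not the authors' algorithm. The function names (`ngrams`, `build_index`, `word_hypotheses`), the `#` padding character, the bigram size, the `min_shared` threshold, and the use of Levenshtein distance for ranking are all assumptions made for illustration.

```python
from collections import defaultdict


def ngrams(word, n=2):
    """Return the set of character n-grams of a word, padded with '#'."""
    padded = "#" + word.lower() + "#"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def build_index(dictionary, n=2):
    """Map each n-gram to the set of dictionary words that contain it."""
    index = defaultdict(set)
    for word in dictionary:
        for gram in ngrams(word, n):
            index[gram].add(word)
    return index


def edit_distance(a, b):
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def word_hypotheses(recognized, index, n=2, min_shared=2, max_results=3):
    """Rank dictionary words that share n-grams with the recognized string."""
    counts = defaultdict(int)
    for gram in ngrams(recognized, n):
        for word in index.get(gram, ()):
            counts[word] += 1
    # Keep words with enough shared n-grams (assumed threshold), then rank
    # them by edit distance to the recognized string, ties broken by spelling.
    candidates = [w for w, c in counts.items() if c >= min_shared]
    candidates.sort(key=lambda w: (edit_distance(recognized.lower(), w), w))
    return candidates[:max_results]


# Hypothetical toy dictionary and a string with one misread character.
dictionary = ["recognition", "recognize", "reception", "cognition"]
index = build_index(dictionary)
hypotheses = word_hypotheses("rec0gnition", index)
```

In this toy example, `hypotheses` is a short ranked list with "recognition" first, since it differs from the input by a single substitution. Returning several candidates rather than one mirrors the abstract's remark that the second pass usually produces several word hypotheses, which a later verification step would then have to choose among.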