Offline General Handwritten Word Recognition Using an Approximate BEAM Matching Algorithm

Authors:
John T. Favata
Affiliations:
State Univ. of New York, Buffalo
Venue:
IEEE Transactions on Pattern Analysis and Machine Intelligence
Year:
2001

Abstract

A recognition system for general isolated offline handwritten words using an approximate segment-string matching algorithm is described. The fundamental paradigm employed is a character-based segment-then-recognize/match strategy. Additional user-supplied contextual information in the form of a lexicon guides a graph search to estimate the most likely word image identity. This system is designed to operate robustly in the presence of document noise, poor handwriting, and lexicon errors, so this basic strategy is significantly extended and enhanced. A preprocessing step is initially applied to the image to remove noise artifacts and normalize the handwriting. An oversegmentation approach is taken to improve the likelihood of capturing the individual characters embedded in the word. The goal is to produce a segmentation point set that contains one subset which is the correct segmentation of the word image. This is accomplished by a segmentation module, employing several independent detection rules based on certain key features, which finds the most likely segmentation points of the word. Next, a sliding window algorithm, using a character recognition algorithm with a very good noncharacter rejection response, is used to find the most likely character boundaries and identities. A directed graph is then constructed that contains many possible interpretations of the word image, many of them implausible. Contextual information is used at this point and the lexicon is matched to the graph in a breadth-first manner, under an appropriate metric. The matching algorithm employs a BEAM search algorithm with several heuristics to compensate for the most likely errors contained in the interpretation graph, including missing segments from segmentation failures, misrecognition of the segments, and lexicon errors. The most likely graph path and associated confidence are computed for each lexicon word to produce a final lexicon ranking. These confidences are very reliable and can later be thresholded to decrease total recognition error. Experiments highlighting the characteristics of this algorithm are given.
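
The matching step described in the abstract can be illustrated with a small sketch. This is not the paper's implementation. It assumes a toy interpretation graph whose edges join segmentation points and carry per-character confidences, scores a path by the geometric mean of its character confidences, and handles only one of the listed error types (misrecognized segments) by substituting a floor confidence. The names `Graph`, `match_word`, `rank_lexicon`, `beam_width`, and `floor` are hypothetical and do not come from the paper.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical graph format: (start_point, end_point) -> {character: confidence}.
Graph = Dict[Tuple[int, int], Dict[str, float]]


def match_word(graph: Graph, n_points: int, word: str,
               beam_width: int = 3, floor: float = 0.05) -> float:
    """Return the mean log-confidence of the best surviving path, or -inf."""
    if not word:
        return -math.inf
    # Each beam entry is (accumulated log-confidence, current segmentation point).
    beam: List[Tuple[float, int]] = [(0.0, 0)]
    for ch in word:  # breadth-first: advance every entry by one character
        best_at: Dict[int, float] = {}
        for score, node in beam:
            for (start, end), hyps in graph.items():
                if start != node:
                    continue
                # Assumed heuristic: a missing character hypothesis is treated
                # as a misrecognition and given a small floor confidence.
                conf = max(hyps.get(ch, 0.0), floor)
                cand = score + math.log(conf)
                if cand > best_at.get(end, -math.inf):
                    best_at[end] = cand
        # Prune to the beam_width best partial paths (the approximate part).
        beam = sorted(((s, e) for e, s in best_at.items()), reverse=True)[:beam_width]
        if not beam:
            return -math.inf
    # Only paths that consume the whole word image count as matches.
    finals = [s for s, node in beam if node == n_points - 1]
    return max(finals) / len(word) if finals else -math.inf


def rank_lexicon(graph: Graph, n_points: int,
                 lexicon: List[str], beam_width: int = 3) -> List[Tuple[float, str]]:
    """Rank lexicon words by confidence (geometric mean), best first."""
    scored = [(math.exp(match_word(graph, n_points, w, beam_width)), w)
              for w in lexicon]
    return sorted(scored, reverse=True)


# Toy graph with four segmentation points (0..3); all values are made up.
graph: Graph = {
    (0, 1): {"c": 0.9, "e": 0.3},
    (1, 2): {"a": 0.8, "o": 0.4},
    (2, 3): {"t": 0.85, "l": 0.2},
    (0, 2): {"d": 0.6},   # two segments merged into one character
    (1, 3): {"w": 0.5},
}
ranking = rank_lexicon(graph, 4, ["cot", "cat", "cart"])
for confidence, word in ranking:
    print(f"{word}: {confidence:.3f}")
```

In this toy setup "cat" ranks first, "cot" second, and "cart" receives a confidence of zero because no path through the graph can supply four characters. A threshold applied to the top confidence would then decide whether to accept or reject the result, in the spirit of the thresholding mentioned above.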