Segmentation-driven recognition applied to numerical field extraction from handwritten incoming mail documents

Authors:
Clément Chatelain; Laurent Heutte; Thierry Paquet
Affiliations:
Laboratoire PSI, CNRS FRE 2645, Université de Rouen, Saint Etienne du Rouvray, France (all three authors)
Venue:
DAS'06 Proceedings of the 7th international conference on Document Analysis Systems
Year:
2006

Abstract

In this paper, we present a method for the automatic extraction of numerical fields (ZIP codes, phone numbers, etc.) from handwritten incoming mail documents. The approach is based on segmentation-driven recognition, which aims to locate isolated and touching digits among the textual information. A syntactical analysis is then performed on each line of text to retain only the sequences that respect a particular syntax (number of digits, presence of separators) known to the system. We evaluate the performance of our system by means of the recall-precision trade-off on a database of real incoming mail documents.
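
The syntactical filtering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that recognition has already reduced each text line to one hard label per component (`D` for digit, `S` for separator, `R` for rejected non-numeric content), whereas a real system would more likely work with classifier confidences. The label alphabet, the `FIELD_SYNTAXES` table, and the function `extract_fields` are hypothetical names introduced only for this illustration.

```python
import re

# Hypothetical field syntaxes over component labels (assumptions, not from the paper):
# D = digit, S = separator, R = rejected (non-numeric) component.
FIELD_SYNTAXES = {
    "zip_code": re.compile(r"D{5}"),             # five consecutive digits
    "phone": re.compile(r"DD(?:S?DD){4}"),       # five digit pairs, optional separators
}

# A candidate is a run of digits and separators that starts and ends with a digit.
CANDIDATE = re.compile(r"D(?:[DS]*D)?")


def extract_fields(labels):
    """Return (field_name, start, end) for each candidate run matching a known syntax.

    `labels` is a string with one label per component of a text line;
    `start` and `end` are component indices (end exclusive).
    """
    hits = []
    for run in CANDIDATE.finditer(labels):
        for name, syntax in FIELD_SYNTAXES.items():
            # The whole candidate must respect the syntax, not just a part of it.
            if syntax.fullmatch(run.group()):
                hits.append((name, run.start(), run.end()))
    return hits


if __name__ == "__main__":
    line = "RRDDDDDRRDDSDDSDDSDDSDDRDDD"
    for name, start, end in extract_fields(line):
        print(name, start, end)
```

In this sketch the trailing three-digit run is discarded because it matches no known syntax, which mirrors the role of the filter: sequences are kept only when their digit count and separator pattern fit a field type known to the system.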