Creating document image datasets with ground truths of regions, text lines and characters is a prerequisite for document analysis research. However, ground-truthing large datasets is not only laborious and time-consuming but also error-prone, owing to the difficulty of character segmentation and the large variability of character shape, size and position. This paper describes an effective recognition-based annotation approach for ground-truthing handwritten Chinese documents. Under the Bayesian framework, the alignment of text line images with the text transcript, which is the crucial step of annotation, is formulated as an optimization problem that incorporates the geometric context of characters and a character recognition model. We evaluated the alignment performance on the Chinese handwriting database CASIA-HWDB, which contains nearly four million character samples of 7356 classes and 5091 pages of unconstrained handwritten text. The experimental results demonstrate the superiority of recognition-based text line alignment and the benefit of integrating geometric context. On a test set of 1015 handwritten pages (10,449 text lines), the proposed approach achieved a character-level alignment accuracy of 92.32% when under-segmentation errors are included and 99.04% when they are excluded. The tool based on the proposed approach has been used in practice for labeling handwritten Chinese documents.
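To make the idea of alignment as an optimization problem concrete, the following is a minimal sketch, not the paper's actual formulation. It assumes the text line has already been over-segmented into a left-to-right sequence of primitive segments, that each transcript character covers between one and `max_merge` consecutive segments, and that a caller-supplied `score` function (hypothetical here) returns a log-score combining recognition and geometric evidence. The names `align`, `score`, `max_merge` and `toy_score` are illustrative assumptions.

```python
from typing import Callable, List, Tuple

NEG_INF = float("-inf")


def align(n_segments: int,
          transcript: str,
          score: Callable[[int, int, str], float],
          max_merge: int = 3) -> Tuple[float, List[Tuple[int, int]]]:
    """Align transcript characters to consecutive runs of primitive segments.

    score(start, end, ch) is an assumed log-score for the hypothesis that
    segments [start, end) together form the character ch.
    Returns the best total score and one (start, end) span per character.
    """
    m = len(transcript)
    # best[j][i]: best score for aligning the first j characters
    # to the first i segments; back[j][i]: segments used by character j.
    best = [[NEG_INF] * (n_segments + 1) for _ in range(m + 1)]
    back = [[0] * (n_segments + 1) for _ in range(m + 1)]
    best[0][0] = 0.0

    for j in range(1, m + 1):
        for i in range(1, n_segments + 1):
            for k in range(1, min(max_merge, i) + 1):
                prev = best[j - 1][i - k]
                if prev == NEG_INF:
                    continue
                cand = prev + score(i - k, i, transcript[j - 1])
                if cand > best[j][i]:
                    best[j][i] = cand
                    back[j][i] = k

    if best[m][n_segments] == NEG_INF:
        raise ValueError("no alignment satisfies the merge constraint")

    # Trace back from the final cell to recover each character's span.
    spans: List[Tuple[int, int]] = []
    i = n_segments
    for j in range(m, 0, -1):
        k = back[j][i]
        spans.append((i - k, i))
        i -= k
    spans.reverse()
    return best[m][n_segments], spans


# Toy scoring function (an assumption for illustration only): the "true"
# spans score 0.0 and every other hypothesis is penalized.
TRUE_SPANS = {(0, 2): "A", (2, 3): "B", (3, 5): "C"}


def toy_score(start: int, end: int, ch: str) -> float:
    return 0.0 if TRUE_SPANS.get((start, end)) == ch else -1.0


total, spans = align(5, "ABC", toy_score)
```

Because every character must consume at least one segment, this sketch cannot represent under-segmentation, where a single segment contains more than one character. That limitation is consistent with the abstract reporting separate accuracies with and without under-segmentation errors, although how the paper itself treats such cases is not stated here.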