Handwriting recognition is an emerging subfield of human-computer interaction with many potential industrial applications, e.g., postal automation, bank check processing, and automatic form reading. Training a recognizer, however, requires a substantial number of training examples together with their ground truth, which has to be created by humans. Semi-supervised learning, in which both transcribed and untranscribed text are used for training, is a promising way to reduce this effort significantly and hence cut system development costs. To date, however, there is no straightforward, established method for semi-supervised learning, particularly not for handwriting recognition. In the self-training approach, an initially trained recognition system creates a new training set from unlabeled data by selecting elements from the unlabeled set according to their recognition confidence. A new recognizer is then trained on this set. The success of self-training depends crucially on which data are selected. In this paper, we test and compare different rules for selecting new training data for single word recognition, both with and without additional language information in the form of a dictionary. We demonstrate that it is possible to substantially increase the recognition accuracy of both systems.
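The self-training loop described above can be illustrated with a minimal sketch. It is not the recognizer or the selection rules evaluated in the paper: the fixed confidence threshold, the function names `self_train` and `select_by_threshold`, and the one-dimensional nearest-centroid "recognizer" `train_centroid` are all assumptions chosen only to keep the example self-contained.

```python
from typing import Callable, Dict, List, Tuple

Labeled = Tuple[float, str]      # (feature, label); a toy stand-in for a word image
Prediction = Tuple[str, float]   # (label, confidence in [0, 1])
Recognizer = Callable[[float], Prediction]


def train_centroid(data: List[Labeled]) -> Recognizer:
    """Hypothetical toy recognizer: nearest class centroid on a 1-D feature."""
    sums: Dict[str, Tuple[float, int]] = {}
    for x, y in data:
        s, n = sums.get(y, (0.0, 0))
        sums[y] = (s + x, n + 1)
    centroids = {y: s / n for y, (s, n) in sums.items()}

    def recognize(x: float) -> Prediction:
        ranked = sorted(centroids, key=lambda y: abs(x - centroids[y]))
        best = ranked[0]
        if len(ranked) == 1:
            return best, 1.0
        d_best = abs(x - centroids[best])
        d_next = abs(x - centroids[ranked[1]])
        total = d_best + d_next
        # Assumed confidence measure: relative distance to the runner-up class.
        return best, (d_next / total if total > 0 else 0.5)

    return recognize


def select_by_threshold(preds: List[Prediction], threshold: float) -> List[int]:
    """One possible selection rule (an assumption): keep confident predictions."""
    return [i for i, (_, conf) in enumerate(preds) if conf >= threshold]


def self_train(train: Callable[[List[Labeled]], Recognizer],
               labeled: List[Labeled],
               unlabeled: List[float],
               threshold: float,
               rounds: int = 3) -> Recognizer:
    """Repeatedly pseudo-label the unlabeled data and retrain on the selection."""
    recognize = train(labeled)
    for _ in range(rounds):
        preds = [recognize(x) for x in unlabeled]
        keep = select_by_threshold(preds, threshold)
        pseudo = [(unlabeled[i], preds[i][0]) for i in keep]
        # The original ground truth is always kept; pseudo-labels are rebuilt each round.
        recognize = train(labeled + pseudo)
    return recognize


labeled = [(0.0, "a"), (10.0, "b")]
unlabeled = [1.0, 2.0, 9.0, 4.9, 5.6]
initial = train_centroid(labeled)
recognizer = self_train(train_centroid, labeled, unlabeled, threshold=0.75)
```

In this toy setting only the three samples close to a class centroid pass the threshold, and retraining on them shifts the decision boundary. Swapping `select_by_threshold` for another rule changes which pseudo-labeled samples enter the training set, which is the design choice the abstract identifies as crucial.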