Minimum-risk training for semi-Markov conditional random fields with application to handwritten Chinese/Japanese text recognition

Authors:
Xiang-Dong Zhou;Yan-Ming Zhang;Feng Tian;Hong-An Wang;Cheng-Lin Liu
Venue:
Pattern Recognition
Year:
2014

Abstract

Semi-Markov conditional random fields (semi-CRFs) are usually trained with the maximum a posteriori (MAP) criterion, which adopts the 0/1 cost for measuring the loss of misclassification. In this paper, building on our previous work on handwritten Chinese/Japanese text recognition (HCTR) using semi-CRFs, we propose an alternative parameter learning method that minimizes the risk on the training set, where the misclassification cost is non-uniform and depends on both the hypothesis and the ground truth. Within this framework, three non-uniform cost functions are compared with the conventional 0/1 cost, and training data selection is incorporated to reduce the computational complexity. In online handwriting recognition experiments on the CASIA-OLHWDB and TUAT Kondate databases, we compare the proposed method with several widely used learning criteria, including conditional log-likelihood (CLL), softmax-margin (SMM), minimum classification error (MCE), large-margin MCE (LM-MCE), and max-margin (MM). On the test set (online handwritten texts) of the ICDAR 2011 Chinese handwriting recognition competition, the proposed method outperforms the best system in the competition.
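
The minimum-risk idea described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not specify the three non-uniform cost functions, so character-level edit distance is used here purely as an assumed example of a cost that depends on the hypothesis and the ground truth. A semi-CRF would sum over a segmentation-recognition lattice, whereas this sketch assumes a short explicit list of candidate strings with hypothetical model scores. The names `expected_risk`, `posterior`, `edit_distance`, and `zero_one_cost` are illustrative only.

```python
import math


def edit_distance(hyp, ref):
    """Levenshtein distance between two strings (assumed example cost)."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            curr.append(min(prev[j] + 1,              # drop a character of hyp
                            curr[j - 1] + 1,          # add a character of ref
                            prev[j - 1] + (h != r)))  # substitute or match
        prev = curr
    return prev[-1]


def zero_one_cost(hyp, ref):
    """Conventional 0/1 cost: any wrong string costs 1."""
    return 0.0 if hyp == ref else 1.0


def posterior(scores):
    """Softmax over log-linear scores, shifted by the max for stability."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def expected_risk(hyps, scores, ref, cost):
    """Expected cost of the hypotheses under the model posterior."""
    probs = posterior(scores)
    return sum(p * cost(h, ref) for p, h in zip(probs, hyps))


# Hypothetical candidates with equal scores, so each has posterior 1/3.
hyps = ["abc", "abd", "xyz"]
scores = [0.0, 0.0, 0.0]
ref = "abc"

risk_01 = expected_risk(hyps, scores, ref, zero_one_cost)
risk_ed = expected_risk(hyps, scores, ref, edit_distance)
```

Under the 0/1 cost, the near-miss "abd" and the completely wrong "xyz" are penalized equally, while the edit-distance cost penalizes "xyz" three times as much. A non-uniform cost used in a risk objective can therefore distinguish hypotheses that the 0/1 cost treats as equally wrong.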