A comparison of approaches to on-line handwritten character recognition
A Tutorial on Support Vector Machines for Pattern Recognition. Data Mining and Knowledge Discovery
Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Using the Fisher Kernel Method to Detect Remote Protein Homologies. Proceedings of the Seventh International Conference on Intelligent Systems for Molecular Biology
On the algorithmic implementation of multiclass kernel-based vector machines. The Journal of Machine Learning Research
Large Margin Methods for Structured and Interdependent Output Variables. The Journal of Machine Learning Research
Accelerated training of conditional random fields with stochastic gradient methods. ICML '06 Proceedings of the 23rd International Conference on Machine Learning
Structured Prediction, Dual Extragradient and Bregman Projections. The Journal of Machine Learning Research
Comparisons of sequence labeling algorithms and extensions. Proceedings of the 24th International Conference on Machine Learning
Discriminative Learning of Max-Sum Classifiers. The Journal of Machine Learning Research
Semantic entity detection by integrating CRF and SVM. WAIM'10 Proceedings of the 11th International Conference on Web-Age Information Management
Proceedings of the 12th ACM/IEEE-CS Joint Conference on Digital Libraries
Fast Structured Prediction Using Large Margin Sigmoid Belief Networks. International Journal of Computer Vision
Learning a sequence classifier means learning to predict a sequence of output tags based on a set of input data items. For example, recognizing that a handwritten word is "cat", based on three images of handwritten letters and on general knowledge of English letter combinations, is a sequence classification task. This paper describes a new two-stage approach to learning a sequence classifier that is (i) highly accurate, (ii) scalable, and (iii) easy to use in data mining applications. The two-stage approach combines support vector machines (SVMs) and conditional random fields (CRFs). It is (i) highly accurate because it benefits from the maximum-margin nature of SVMs and from the ability of CRFs to model correlations between neighboring output tags. It is (ii) scalable because the input to each SVM is a small training set, and the input to the CRF has a small number of features, namely the SVM outputs. It is (iii) easy to use because it combines existing published software in a straightforward way. In detailed experiments on the task of recognizing handwritten words, we show that the two-stage approach is more accurate, or faster and more scalable, or both, than other leading methods for learning sequence classifiers, including max-margin Markov networks (M3Ns) and standard CRFs.
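The two-stage idea can be sketched with a toy decoder. Stage one produces per-position class scores (standing in for the per-item SVM outputs); stage two combines those scores with tag-transition weights learned from example words, decoded with the Viterbi algorithm. This is a minimal illustration under simplifying assumptions, not the paper's implementation: the CRF stage is approximated here by smoothed bigram transition log-probabilities, and the alphabet, training words, and scores are made up.

```python
import math
from collections import defaultdict

# Toy three-letter alphabet; a real system would use all output tags.
TAGS = ["a", "c", "t"]

def train_transitions(words, smoothing=0.1):
    """Estimate log P(tag_i | tag_{i-1}) from example words.

    This stands in for the CRF stage's learned pairwise weights.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for w in words:
        for prev, cur in zip(w, w[1:]):
            counts[prev][cur] += 1.0
    trans = {}
    for p in TAGS:
        total = sum(counts[p][c] + smoothing for c in TAGS)
        trans[p] = {c: math.log((counts[p][c] + smoothing) / total)
                    for c in TAGS}
    return trans

def viterbi(emission_logscores, trans):
    """Return the best tag sequence given per-position log-scores
    (the stage-one classifier outputs) and transition log-weights."""
    best = [{t: (emission_logscores[0][t], [t]) for t in TAGS}]
    for i in range(1, len(emission_logscores)):
        layer = {}
        for t in TAGS:
            layer[t] = max(
                (best[i - 1][p][0] + trans[p][t] + emission_logscores[i][t],
                 best[i - 1][p][1] + [t])
                for p in TAGS
            )
        best.append(layer)
    return max(best[-1].values())[1]

if __name__ == "__main__":
    # Stage two: transitions learned from a tiny invented lexicon.
    trans = train_transitions(["cat", "cat", "tac", "act"])
    # Stage one: simulated per-letter scores; position 1 is ambiguous
    # and slightly favors "c" on image evidence alone.
    emissions = [
        {"c": -0.1, "a": -3.0, "t": -3.0},
        {"c": -0.6, "a": -0.9, "t": -3.0},
        {"c": -3.0, "a": -3.0, "t": -0.1},
    ]
    print("".join(viterbi(emissions, trans)))
```

In this toy run, position-by-position argmax over the stage-one scores reads "cct", but the transition weights (letter-combination knowledge) flip the ambiguous middle position to "a", yielding "cat". This mirrors the abstract's point: the second stage corrects the independent per-item classifiers by modeling correlations between neighboring tags.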