The sequence labeling problem is commonly encountered in many natural language and query processing tasks. SVMstruct is a supervised learning algorithm that provides a flexible and effective way to solve this problem. However, a large number of training examples is often required to train SVMstruct, which can be costly for applications that generate long and complex sequence data. This paper proposes an active learning technique that selects the most informative subset of unlabeled sequences for annotation by choosing the sequences with the largest uncertainty in their predictions. A unique aspect of active learning for sequence labeling is that it should take into account the effort spent on labeling each sequence, which depends on the sequence length. A new active learning technique is therefore proposed that uses dynamic programming to identify the best subset of sequences to annotate, taking into account both the uncertainty and the labeling effort. Experimental results show that our SVMstruct active learning technique can significantly reduce the number of sequences to be labeled while outperforming other existing techniques.
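The abstract does not spell out the dynamic program, so the following is only a minimal sketch under stated assumptions: that each unlabeled sequence has a scalar uncertainty score, that labeling effort equals sequence length in tokens, and that selection can be framed as a 0/1 knapsack with a fixed token budget. The function name `select_sequences` and all of its parameters are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def select_sequences(
    uncertainty: Sequence[float],
    lengths: Sequence[int],
    budget: int,
) -> Tuple[float, List[int]]:
    """Pick sequences maximizing total uncertainty within a labeling budget.

    Assumption (not from the paper): effort is the token count of a
    sequence, and the selection is a standard 0/1 knapsack problem.
    Returns (total uncertainty, sorted indices of the chosen sequences).
    """
    n = len(uncertainty)
    # best[i][b]: max total uncertainty using the first i sequences
    # with at most b tokens of labeling effort.
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        u, w = uncertainty[i - 1], lengths[i - 1]
        for b in range(budget + 1):
            best[i][b] = best[i - 1][b]  # skip sequence i-1
            if w <= b:
                cand = best[i - 1][b - w] + u  # annotate sequence i-1
                if cand > best[i][b]:
                    best[i][b] = cand

    # Backtrack to recover which sequences were chosen.
    chosen: List[int] = []
    b = budget
    for i in range(n, 0, -1):
        if best[i][b] != best[i - 1][b]:
            chosen.append(i - 1)
            b -= lengths[i - 1]
    chosen.reverse()
    return best[n][budget], chosen


if __name__ == "__main__":
    scores = [0.9, 0.5, 0.6, 0.3]   # hypothetical uncertainty scores
    tokens = [10, 4, 5, 3]          # hypothetical sequence lengths
    total, picked = select_sequences(scores, tokens, budget=12)
    print(picked, round(total, 2))
```

In this toy input, the single most uncertain sequence uses most of the budget by itself, while three shorter sequences together yield more total uncertainty for the same effort. That trade-off between uncertainty and labeling effort is what the abstract describes; the paper's actual formulation may differ.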