To address the knowledge bottleneck problem, active learning has been widely used for its ability to automatically select the most informative unlabeled examples for human annotation. One of the key enabling techniques of active learning is uncertainty sampling, which uses a single classifier to identify the unlabeled examples on which it is least confident. However, uncertainty sampling is prone to selecting outliers. To solve the outlier problem, this paper presents two techniques: sampling by uncertainty and density (SUD) and density-based re-ranking. Both techniques prefer examples that are not only the most informative under the uncertainty criterion but also the most representative under the density criterion. Experimental results of active learning for word sense disambiguation and text classification tasks on six real-world evaluation data sets demonstrate the effectiveness of the proposed methods.
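The combination of the two criteria can be sketched in code. The following is a minimal illustration, not the paper's exact formulation: it assumes predictive entropy as the uncertainty measure, the average cosine similarity of an example to its k nearest pool neighbors as the density measure, and a simple product of the two as the SUD ranking score. All function names and the choice of k are illustrative.

```python
import numpy as np

def entropy_uncertainty(probs):
    """Predictive entropy of class probabilities: higher means the
    classifier is less confident about the example."""
    p = np.clip(probs, 1e-12, 1.0)
    return -np.sum(p * np.log(p), axis=1)

def knn_density(X, k=5):
    """Average cosine similarity of each pool example to its k most
    similar neighbors; outliers receive a low density score."""
    unit = X / np.linalg.norm(X, axis=1, keepdims=True)
    sims = unit @ unit.T
    np.fill_diagonal(sims, -np.inf)          # exclude self-similarity
    topk = np.sort(sims, axis=1)[:, -k:]     # k nearest neighbors
    return topk.mean(axis=1)

def sud_select(probs, X, k=5, n_select=1):
    """Rank unlabeled pool examples by uncertainty * density and
    return the indices of the top n_select candidates to annotate."""
    score = entropy_uncertainty(probs) * knn_density(X, k)
    return np.argsort(score)[::-1][:n_select]
```

Under this scoring, an example that is maximally uncertain but lies far from the rest of the pool is down-weighted by its low density, which is exactly the outlier behavior the abstract describes.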