Labeling text data is time-consuming but essential for automatic text classification. In particular, manually assigning multiple labels to each document may become impractical when a very large amount of data is needed to train multi-label text classifiers. To minimize the human labeling effort, we propose a novel multi-label active learning approach that reduces the amount of labeled data required without sacrificing classification accuracy. Traditional active learning algorithms can only handle single-label problems, in which each data point is restricted to one label. Our approach takes the multi-label information into account and selects the unlabeled data that lead to the largest reduction of the expected model loss. Specifically, the model loss is approximated by the size of the version space, and the reduction rate of the size of the version space is optimized with Support Vector Machines (SVMs). An effective label prediction method is designed to predict possible labels for each unlabeled data point, and the expected loss for multi-label data is approximated by summing the losses over all labels according to the most confident result of label prediction. Experiments on several publicly available real-world data sets demonstrate that our approach can obtain promising classification results with far less labeled data than state-of-the-art methods.
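The selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not give the exact formulas, so several pieces are assumptions: the per-label SVM decision values are taken as already computed (for example by one binary SVM per label), the label prediction step is simplified to taking the sign of each decision value, and the per-label loss reduction is approximated by the margin-based quantity (1 - y f(x)) / 2, a common version-space heuristic. The function names and the `pool` data are hypothetical.

```python
from typing import Dict, List, Tuple


def predict_labels(decision_values: List[float]) -> List[int]:
    """Assumed stand-in for the label prediction step: the sign of each
    per-label SVM decision value gives the most confident label (+1 or -1)."""
    return [1 if value >= 0 else -1 for value in decision_values]


def expected_loss_reduction(decision_values: List[float]) -> float:
    """Sum an assumed per-label loss-reduction term over all labels.

    Each term is (1 - y * f(x)) / 2, with the margin y * f(x) clamped to
    [-1, 1], so confidently classified labels contribute close to 0 and
    labels near the decision boundary contribute close to 0.5.
    """
    labels = predict_labels(decision_values)
    total = 0.0
    for label, value in zip(labels, decision_values):
        margin = max(-1.0, min(1.0, label * value))
        total += (1.0 - margin) / 2.0
    return total


def select_query(pool: Dict[str, List[float]]) -> Tuple[str, float]:
    """Return the unlabeled document with the largest summed score."""
    scores = {doc_id: expected_loss_reduction(v) for doc_id, v in pool.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical decision values for three unlabeled documents and three labels.
pool = {
    "doc_a": [2.0, -1.5, -3.0],   # confidently classified on every label
    "doc_b": [0.1, -0.2, 0.05],   # close to the boundary on every label
    "doc_c": [0.9, -0.4, -1.2],   # uncertain on one label only
}

print(select_query(pool)[0])  # doc_b
```

In this sketch the document that is uncertain on the most labels is queried first, which reflects the idea of summing losses over all labels rather than scoring each label in isolation. A fuller version would retrain the per-label SVMs after each query and replace the sign-based label prediction with a learned predictor.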