Batch mode active learning based multi-view text classification

Authors:
Xue Zhang;Dong-yan Zhao;Li-wei Chen;Wang-hua Min
Affiliations:
ICST of Peking University, Beijing, China;ICST of Peking University, Beijing, China and Key Laboratory of Computational Linguistics, Peking University, Ministry of Education, China;ICST of Peking University, Beijing, China;ICST of Peking University, Beijing, China
Venue:
FSKD'09 Proceedings of the 6th international conference on Fuzzy systems and knowledge discovery - Volume 7
Year:
2009

Abstract

The goal of active learning is to select the most informative examples for manual labeling, reducing the effort required to acquire labeled examples; this matters greatly for large-scale text classification. However, most previous studies of active learning have focused on selecting a single unlabeled example at a time, which can be inefficient because the model must be retrained after every newly labeled example. In this paper, we propose a simple, novel batch mode active learning (BMAL) method based on farthest-first traversal that selects several informative examples for labeling simultaneously in each iteration. Furthermore, we combine BMAL with a multi-view framework to improve its execution efficiency. The k-nearest-neighbor (kNN) model is used as the baseline classifier for its simplicity and efficiency. Extensive experiments on a standard dataset show that our algorithm is more effective than both its single mode counterpart and the baseline classifier.
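
The abstract does not spell out how farthest-first traversal picks a batch, so the following is only a minimal sketch of the generic technique, not the authors' method. It assumes documents are dense feature vectors compared by Euclidean distance, that already labeled examples serve as the initial centers, and that the first pool item is taken when nothing is labeled yet. The function name `farthest_first_batch` and its parameters are hypothetical.

```python
import math


def farthest_first_batch(pool, labeled, batch_size):
    """Return indices into `pool` chosen by farthest-first traversal.

    Each pick is the pool point whose distance to its nearest center
    (labeled points plus earlier picks) is largest, which spreads the
    batch across the feature space instead of clustering it.
    Assumption: Euclidean distance on dense vectors (illustrative only).
    """
    if batch_size <= 0 or not pool:
        return []

    # Distance from every pool point to its nearest labeled point.
    # With no labeled data, all points start at infinity, so the
    # first pool item is picked first (an arbitrary seed choice).
    if labeled:
        nearest = [min(math.dist(p, c) for c in labeled) for p in pool]
    else:
        nearest = [math.inf] * len(pool)

    chosen = []
    for _ in range(min(batch_size, len(pool))):
        idx = max(range(len(pool)), key=lambda i: nearest[i])
        chosen.append(idx)
        # The new pick becomes a center: shrink distances accordingly.
        for i, p in enumerate(pool):
            nearest[i] = min(nearest[i], math.dist(p, pool[idx]))
        # Mark the pick so it can never be selected again.
        nearest[idx] = -1.0
    return chosen


if __name__ == "__main__":
    pool = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (0.5, 0.0), (5.0, 0.0)]
    labeled = [(0.0, 0.0)]
    print(farthest_first_batch(pool, labeled, 3))
```

In a batch mode loop, the returned indices would be sent for manual labeling together, and the classifier would be retrained once per batch rather than once per example. A practical selector would likely restrict `pool` to examples the current model finds uncertain before applying the traversal; that filtering step is also an assumption and is omitted here.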