Batch mode active learning based multi-view text classification

  • Authors:
  • Xue Zhang; Dong-yan Zhao; Li-wei Chen; Wang-hua Min

  • Affiliations:
  • ICST of Peking University, Beijing, China; ICST of Peking University, Beijing, China and Key Laboratory of Computational Linguistics, Peking University, Ministry of Education, China; ICST of Peking University, Beijing, China; ICST of Peking University, Beijing, China

  • Venue:
  • FSKD'09 Proceedings of the 6th international conference on Fuzzy systems and knowledge discovery - Volume 7
  • Year:
  • 2009

Abstract

The goal of active learning is to select the most informative examples for manual labeling, thereby reducing the effort of acquiring labeled examples, which is very important for large-scale text classification. However, most previous studies in active learning have focused on selecting a single unlabeled example at a time, which can be inefficient because the model has to be retrained after every newly labeled example. In this paper, we propose a simple, novel batch mode active learning (BMAL) method based on farthest-first traversal that selects a number of informative examples for labeling in each iteration. Furthermore, we combine BMAL with a multi-view framework in order to improve its execution efficiency. The k-nearest-neighbor (kNN) model is used as the baseline classifier for its simplicity and efficiency. Extensive experiments on a standard dataset show that our algorithm is more effective than both its single-mode counterpart and the baseline classifier.
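The abstract does not spell out how farthest-first traversal picks a batch, so the following is only a minimal sketch of the generic technique, not the authors' BMAL method. It assumes documents are dense feature vectors compared with Euclidean distance, and that each new pick is the pool item farthest from everything already labeled or already picked. The names `farthest_first_batch`, `pool`, and `labeled` are hypothetical, and the informativeness filtering and multi-view combination described in the paper are not modeled here.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def farthest_first_batch(pool, labeled, batch_size, dist=euclidean):
    """Pick up to batch_size indices into pool by farthest-first traversal.

    Assumption (not from the paper): each pick maximizes the distance to
    the nearest point that is already labeled or already picked.
    """
    chosen = []
    if not pool or batch_size <= 0:
        return chosen

    if labeled:
        # Distance from each pool item to its nearest labeled example.
        min_dist = [min(dist(p, q) for q in labeled) for p in pool]
    else:
        # No labeled data yet: seed the traversal with the first pool item.
        chosen.append(0)
        min_dist = [dist(p, pool[0]) for p in pool]

    taken = set(chosen)
    target = min(batch_size, len(pool))
    while len(chosen) < target:
        # Farthest remaining item from everything selected so far.
        best = max(
            (i for i in range(len(pool)) if i not in taken),
            key=lambda i: min_dist[i],
        )
        chosen.append(best)
        taken.add(best)
        # The new pick may now be the nearest reference for other items.
        for i, p in enumerate(pool):
            min_dist[i] = min(min_dist[i], dist(p, pool[best]))
    return chosen


# Toy 2-D "documents"; real text features would be high-dimensional.
pool = [(0, 0), (1, 0), (10, 0), (0, 8), (9, 9)]
labeled = [(0, 0)]
print(farthest_first_batch(pool, labeled, 3))  # prints [4, 2, 3]
```

Because each pick is pushed away from the previous ones, the batch tends to spread over the pool instead of clustering around one region, which is the usual motivation for requesting several labels before a single retraining step.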