Batch-Mode Active Learning with Semi-supervised Cluster Tree for Text Classification

  • Authors:
  • Zhaocai Sun; Yunming Ye; Xiaofeng Zhang; Zhexue Huang; Shudong Chen; Zhi Liu

  • Venue:
  • WI-IAT '12 Proceedings of the 2012 IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technology - Volume 01
  • Year:
  • 2012

Abstract

In web mining, there are situations in which only a small amount of data is labeled, which poses difficulties for traditional web page classification algorithms. Active learning schemes have therefore been proposed to sample the most representative unlabeled data, which are then annotated by an external oracle. Most existing active learning methods are based on a serial-mode query strategy, in which samples are queried one at a time, and this makes the active learning process inefficient and unstable. In this paper, we propose a novel text-oriented active semi-supervised classification model, called active SSC. Compared with other active learning approaches, our model is comprehensible, which makes it easy to design a batch-mode query strategy. Experimental results on public text data show that our method is an effective and stable active learning approach.
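
The abstract does not spell out the batch-mode query strategy, so the following is only a minimal sketch of the general idea and not the authors' active SSC algorithm. It assumes that a clustering of the unlabeled pool and a per-sample uncertainty score are already available; the function name `select_batch` and both inputs are hypothetical. Instead of asking the oracle about one sample at a time, it picks a whole batch and spreads the picks across clusters so that the queried samples are not near-duplicates.

```python
from collections import defaultdict

def select_batch(uncertainty, cluster_of, batch_size):
    """Pick up to batch_size pool indices, spreading picks across clusters.

    uncertainty: list of floats, higher means the classifier is less sure
                 (assumed to be computed elsewhere).
    cluster_of:  list of cluster ids, same length as uncertainty
                 (assumed to come from any clustering of the pool).
    """
    # Group pool indices by cluster, most uncertain first within each cluster.
    by_cluster = defaultdict(list)
    for idx, cid in enumerate(cluster_of):
        by_cluster[cid].append(idx)
    for members in by_cluster.values():
        members.sort(key=lambda i: uncertainty[i], reverse=True)

    batch = []
    round_no = 0
    while len(batch) < batch_size:
        # Candidates this round: the round_no-th most uncertain item of each cluster.
        candidates = [m[round_no] for m in by_cluster.values() if len(m) > round_no]
        if not candidates:
            break  # the pool is exhausted
        candidates.sort(key=lambda i: uncertainty[i], reverse=True)
        batch.extend(candidates[: batch_size - len(batch)])
        round_no += 1
    return batch

# Toy pool: six unlabeled documents in three clusters.
uncertainty = [0.9, 0.8, 0.1, 0.7, 0.2, 0.6]
cluster_of = [0, 0, 0, 1, 1, 2]
batch = select_batch(uncertainty, cluster_of, batch_size=3)
print(batch)  # [0, 3, 5]
```

A purely uncertainty-ranked batch on the same toy pool would take indices 0, 1 and 3, two of which come from the same cluster; the per-cluster rounds trade a little uncertainty for coverage. The whole batch is then sent to the oracle in one request, and the classifier is retrained once per batch rather than once per label.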