A novel reliable negative method based on clustering for learning from positive and unlabeled examples

  • Authors:
  • Bangzuo Zhang;Wanli Zuo

  • Affiliations:
  • College of Computer Science and Technology, Jilin University, ChangChun, China and College of Computer, Northeast Normal University, ChangChun, China;College of Computer Science and Technology, Jilin University, ChangChun, China

  • Venue:
  • AIRS'08 Proceedings of the 4th Asia information retrieval conference on Information retrieval technology
  • Year:
  • 2008

Abstract

This paper investigates a new approach for training text classifiers when only a small set of positive examples is available together with a large set of unlabeled examples. The key feature of this problem is that there are no negative examples for learning. Several recently reported techniques build a classifier in two steps. In this paper, we introduce a novel method for the first step, which clusters the unlabeled and positive examples to identify reliable negative documents, and then runs an SVM iteratively. We perform a comprehensive evaluation against two other methods, and show experimentally that the proposed method is efficient and effective.
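The first step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which clustering algorithm is used or how clusters are turned into reliable negatives, so this sketch assumes a plain k-means with a deterministic initialization and assumes that an unlabeled document is a reliable negative when its cluster contains no positive example. The names `kmeans`, `reliable_negatives`, and the toy 2-D vectors are hypothetical; real documents would be high-dimensional term vectors. The second step (running an SVM iteratively) is not shown.

```python
def dist2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def kmeans(points, k, iters=20):
    """Plain k-means; returns one cluster index per point.

    Assumption: centroids are initialised from evenly spaced input points
    so that the sketch is deterministic. Requires k <= len(points).
    """
    step = max(1, len(points) // k)
    centroids = [points[i * step] for i in range(k)]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        assign = [min(range(k), key=lambda c: dist2(p, centroids[c]))
                  for p in points]
        # Recompute each centroid; empty clusters keep their old centroid.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = mean(members)
    return assign


def reliable_negatives(positives, unlabeled, k):
    """Cluster positives and unlabeled together, then keep the unlabeled
    examples that fall in clusters containing no positive example.

    Assumption: this selection rule is illustrative only.
    """
    points = list(positives) + list(unlabeled)
    assign = kmeans(points, k)
    n_pos = len(positives)
    clusters_with_positives = set(assign[:n_pos])
    return [u for u, a in zip(unlabeled, assign[n_pos:])
            if a not in clusters_with_positives]


if __name__ == "__main__":
    # Toy data: positives sit near the origin, some unlabeled points far away.
    positives = [(0, 0), (0, 1), (1, 0)]
    unlabeled = [(0.5, 0.5), (1, 1), (9, 9), (9, 10), (10, 9), (10, 10)]
    print(reliable_negatives(positives, unlabeled, k=2))
```

In a two-step scheme of this kind, the returned examples would serve as the initial negative set, with the positives, for training the classifier of the second step.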