Semi-Supervised Text Classification Using Positive and Unlabeled Data

Authors:
Shuang Yu; Xueyuan Zhou; Chunping Li
Affiliations:
Department of Computer Science and Technology, Tsinghua University, Beijing, China; School of Software, Tsinghua University, Beijing, China; School of Software, Tsinghua University, Beijing, China
Venue:
Proceedings of the 2006 conference on Advances in Intelligent IT: Active Media Technology 2006
Year:
2006

Abstract

Text classification using positive and unlabeled data refers to the problem of building a text classifier from positive documents (P) of one class and unlabeled documents (U) that may come from many other classes, so U contains both positive and negative documents. Several existing methods for the PU-Learning problem build a classifier in a two-step process. Generally speaking, these methods do not perform well when P is small. In this paper, we propose an improved method aimed at solving the PU-Learning problem with a small P. The method combines graph-based semi-supervised learning with the two-step method. Experiments indicate that the improved method performs well when the size of P is small.
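
The abstract does not spell out either step, so the following is only a minimal sketch of how a two-step scheme might be paired with graph-based propagation; it is not the authors' algorithm. It assumes a centroid-similarity heuristic for picking reliable negatives from U (step 1) and a label-spreading style iteration over a cosine-similarity graph (step 2). The function names, the parameters `n_neg`, `alpha`, and `n_iter`, and the toy term-count vectors are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0 or nv == 0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)


def pu_graph_sketch(X, pos_idx, n_neg=2, alpha=0.9, n_iter=50):
    """Hypothetical two-step PU sketch on nonnegative term vectors X.

    Returns (scores, neg_idx): one score per document (higher means more
    positive) and the indices chosen as reliable negatives.
    """
    n = len(X)
    pos = sorted(set(pos_idx))
    unlabeled = [i for i in range(n) if i not in pos]

    # Step 1 (assumed heuristic): the unlabeled documents least similar to
    # the centroid of P are treated as reliable negatives.
    dim = len(X[0])
    centroid = [sum(X[i][k] for i in pos) / len(pos) for k in range(dim)]
    neg_idx = sorted(unlabeled, key=lambda i: cosine(X[i], centroid))[:n_neg]

    # Step 2 (assumed): spread labels over a fully connected similarity graph.
    W = [[cosine(X[i], X[j]) if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    deg = [sum(row) or 1.0 for row in W]  # guard against isolated nodes
    # Symmetric normalization S = D^(-1/2) W D^(-1/2).
    S = [[W[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
         for i in range(n)]

    # Seed labels: +1 for P, -1 for reliable negatives, 0 for the rest of U.
    Y = [0.0] * n
    for i in pos:
        Y[i] = 1.0
    for i in neg_idx:
        Y[i] = -1.0

    # Iterate F <- alpha * S F + (1 - alpha) * Y.
    F = Y[:]
    for _ in range(n_iter):
        F = [alpha * sum(S[i][j] * F[j] for j in range(n)) + (1 - alpha) * Y[i]
             for i in range(n)]
    return F, neg_idx


# Toy corpus: documents 0-2 share one vocabulary, documents 3-5 another.
docs = [[2, 1, 0, 0], [1, 2, 0, 0], [3, 1, 0, 0],
        [0, 0, 1, 2], [0, 0, 2, 1], [0, 0, 1, 1]]
scores, reliable_negatives = pu_graph_sketch(docs, pos_idx=[0])
```

Under these assumptions, an unlabeled document would be called positive when its score is above zero. The graph step lets a single labeled positive influence every document connected to it, which is one plausible reason such a combination could help when P is small.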