Semi-supervised Collaborative Text Classification

Authors:
Rong Jin;Ming Wu;Rahul Sukthankar
Affiliations:
Michigan State University, East Lansing MI 48823, USA;Michigan State University, East Lansing MI 48823, USA;Intel Research Pittsburgh and Carnegie Mellon University, USA
Venue:
ECML '07 Proceedings of the 18th European Conference on Machine Learning
Year:
2007

Citing 8
Cited 0

Automated learning of decision rules for text categorization
ACM Transactions on Information Systems (TOIS)

Text Categorization with Support Vector Machines: Learning with Many Relevant Features
ECML '98 Proceedings of the 10th European Conference on Machine Learning

Transductive Inference for Text Classification using Support Vector Machines
ICML '99 Proceedings of the Sixteenth International Conference on Machine Learning

Optimizing search engines using clickthrough data
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining

Implicit link analysis for small web search
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval

A novel log-based relevance feedback technique in content-based image retrieval
Proceedings of the 12th annual ACM international conference on Multimedia

Graph-based text classification: learn from your neighbors
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval

Constructing informative prior distributions from domain knowledge in text classification
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval


Abstract

Most text categorization methods require the text content of documents, which is often difficult to obtain. We consider "Collaborative Text Categorization", where each document is represented by the feedback from a large number of users. Our study focuses on the semi-supervised case, in which one key challenge is that a significant number of users have not rated any labeled document. To address this problem, we examine several semi-supervised learning methods. Our empirical study shows that collaborative text categorization is more effective than content-based text categorization, and that manifold regularization is more effective than other state-of-the-art semi-supervised learning methods.
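
To make the setting in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy document-by-user rating matrix, a cosine-similarity graph over documents, and a simplified, transductive form of graph-Laplacian regularization: squared loss on the labeled documents plus a smoothness penalty over graph edges, solved by Gauss-Seidel sweeps. The names `ratings`, `labels`, `gamma`, and `graph_regularized_scores` are hypothetical and chosen only for illustration. Users 1 and 3 in the toy data have rated no labeled document, yet their ratings still connect unlabeled documents to labeled ones through the graph.

```python
import math


def cosine(u, v):
    """Cosine similarity between two rating vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def graph_regularized_scores(ratings, labels, gamma=1.0, sweeps=200):
    """Approximately minimize
        sum over labeled i of (f_i - y_i)^2
        + gamma * sum over pairs i < j of w_ij * (f_i - f_j)^2
    by Gauss-Seidel sweeps. labels[i] is +1, -1, or None (unlabeled)."""
    n = len(ratings)
    # Document-document graph: nonnegative cosine similarity, no self-loops.
    w = [[max(0.0, cosine(ratings[i], ratings[j])) if i != j else 0.0
          for j in range(n)] for i in range(n)]
    f = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            has_label = 1.0 if labels[i] is not None else 0.0
            target = float(labels[i]) if labels[i] is not None else 0.0
            denom = has_label + gamma * sum(w[i])
            if denom > 0:  # isolated unlabeled documents keep score 0
                neighbor_sum = sum(w[i][j] * f[j] for j in range(n))
                f[i] = (has_label * target + gamma * neighbor_sum) / denom
    return f


# Toy data (assumed): rows are documents, columns are users, 1 = positive feedback.
ratings = [
    [1, 0, 0, 0],  # doc 0, labeled +1
    [1, 1, 0, 0],  # doc 1, unlabeled
    [0, 1, 0, 0],  # doc 2, unlabeled, rated only by user 1
    [0, 0, 1, 0],  # doc 3, labeled -1
    [0, 0, 1, 1],  # doc 4, unlabeled
    [0, 0, 0, 1],  # doc 5, unlabeled, rated only by user 3
]
labels = [1, None, None, -1, None, None]

scores = graph_regularized_scores(ratings, labels)
predicted = [1 if s > 0 else -1 for s in scores]
print(predicted)
```

In this sketch the label of document 0 reaches document 2 only through document 1, which shares one user with each of them. A full manifold regularization method would also learn a function over the rating features with an additional norm penalty, which this transductive simplification omits.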