Concept based text classification using labeled and unlabeled data

  • Authors:
  • Ping Gu;Qingsheng Zhu;Xiping He

  • Affiliations:
  • Dept. of Computer Science, Chongqing University, Chongqing, China;Dept. of Computer Science, Chongqing University, Chongqing, China;Dept. of Computer Science, Chongqing University, Chongqing, China

  • Venue:
  • ADMA'06: Proceedings of the Second International Conference on Advanced Data Mining and Applications
  • Year:
  • 2006

Abstract

Recent work has shown that integrating conceptual features extracted from background knowledge improves text clustering and classification. In this paper, we address the problem of text classification with both labeled and unlabeled data. We propose a Latent Bayes Ensemble model based on word-concept mapping and a transductive boosting method. With the knowledge extracted from ontologies, we aim to improve classification accuracy even with large amounts of unlabeled documents. We conducted several experiments on two well-known corpora and compared the results with those of Naïve Bayes and TSVM classifiers.