Integrating background knowledge into RBF networks for text classification

  • Authors: Eric P. Jiang
  • Affiliations: University of San Diego, San Diego, California
  • Venue: AIRS'08 Proceedings of the 4th Asia information retrieval conference on Information retrieval technology
  • Year: 2008

Abstract

Text classification is the task of assigning a natural language document to one or more predefined categories based on its content. In this paper, we present an automatic text classification model based on Radial Basis Function (RBF) networks. It utilizes valuable discriminative information in the training data and incorporates background knowledge into model learning. This approach can be particularly advantageous for applications where labeled training data are in short supply. The proposed model has been applied to spam email classification, and experiments on benchmark spam corpora have shown that the model is effective in learning to classify documents based on content and represents a competitive alternative to well-known text classifiers such as naïve Bayes and SVM.
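
For readers unfamiliar with RBF networks, the following is a minimal generic sketch of an RBF classifier over term-count vectors. It is not the model proposed in the paper: the one-center-per-class choice, the shared Gaussian width, the least-squares output layer, the function names, and the toy data are all assumptions made for illustration, and the incorporation of background knowledge is not covered.

```python
import numpy as np

def rbf_features(X, centers, width):
    # Squared Euclidean distance from every row of X to every center.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    # Gaussian basis function activations (hidden layer).
    return np.exp(-d2 / (2.0 * width ** 2))

def fit_rbf(X, y, width=1.0):
    classes = np.unique(y)
    # Assumption: one center per class, placed at the class mean.
    centers = np.stack([X[y == c].mean(axis=0) for c in classes])
    H = rbf_features(X, centers, width)
    # One-hot targets; output weights solved by linear least squares.
    T = (y[:, None] == classes[None, :]).astype(float)
    W, *_ = np.linalg.lstsq(H, T, rcond=None)
    return classes, centers, W

def predict_rbf(model, X, width=1.0):
    classes, centers, W = model
    scores = rbf_features(X, centers, width) @ W
    return classes[scores.argmax(axis=1)]

# Hypothetical toy data: term counts for ["free", "win", "meeting", "report"].
X_train = np.array([[3, 2, 0, 0],
                    [2, 3, 0, 1],
                    [0, 0, 2, 3],
                    [0, 1, 3, 2]], dtype=float)
y_train = np.array([1, 1, 0, 0])  # 1 = spam, 0 = legitimate

model = fit_rbf(X_train, y_train, width=2.0)
X_test = np.array([[4, 1, 0, 0],
                   [0, 0, 1, 4]], dtype=float)
pred = predict_rbf(model, X_test, width=2.0)
print(pred)
```

In this sketch the first test message, dominated by "free" and "win", falls close to the spam center, while the second falls close to the legitimate center. A practical system would typically use more centers, weighted term features, and a tuned width.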