Kernel-Based Text Classification on Statistical Manifold

  • Authors:
  • Shibin Zhou;Shidong Feng;Yushu Liu

  • Affiliations:
  • School of Computer Science and Technology, Beijing Institute of Technology, Beijing, P.R. China 100081;School of Computer Science and Technology, Beijing Institute of Technology, Beijing, P.R. China 100081;School of Computer Science and Technology, Beijing Institute of Technology, Beijing, P.R. China 100081

  • Venue:
  • ISNN '08 Proceedings of the 5th international symposium on Neural Networks: Advances in Neural Networks
  • Year:
  • 2008

Abstract

In the text categorization literature, researchers have developed a variety of useful kernel methods. However, most kernel-based text categorization methods share one key characteristic: they embed text data in Euclidean space. In this paper, we focus on representing text vectors as points on a Riemannian manifold and use kernels to integrate discriminative and generative models. We then present a diffusion kernel based on the Dirichlet Compound Multinomial manifold (DCM manifold), a space built on the Dirichlet Compound Multinomial model that combines inverse document frequency and information gain. Experimental results on various real-world text datasets show that the kernel based on the DCM manifold is better suited to text categorization than kernels based on Euclidean space, and that our kernel method achieves much better accuracy than some current state-of-the-art methods.
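To make the contrast with Euclidean embedding concrete, the sketch below treats each document as a point on the multinomial simplex and compares documents by geodesic distance under the Fisher information metric. It is a simplified stand-in, not the DCM-manifold kernel from the paper: the abstract does not give that kernel's form, so the example uses the plain multinomial model and the well-known leading-order approximation exp(-d^2 / (4t)) with the normalizing constant dropped. The function names and the diffusion time `t` are illustrative assumptions.

```python
import math


def l1_normalize(counts):
    """Map raw term counts to a point on the multinomial simplex."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("document must contain at least one term")
    return [c / total for c in counts]


def geodesic_distance(p, q):
    """Fisher geodesic distance between two multinomial parameter vectors.

    Uses the square-root map onto the sphere:
    d(p, q) = 2 * arccos(sum_i sqrt(p_i * q_i)).
    """
    affinity = sum(math.sqrt(a * b) for a, b in zip(p, q))
    # Clamp against floating-point drift before calling acos.
    affinity = min(1.0, max(0.0, affinity))
    return 2.0 * math.acos(affinity)


def diffusion_kernel(counts_x, counts_y, t=0.5):
    """Leading-order diffusion kernel approximation on the simplex.

    Assumption: plain multinomial geometry, constant factor omitted.
    This is NOT the paper's DCM kernel.
    """
    d = geodesic_distance(l1_normalize(counts_x), l1_normalize(counts_y))
    return math.exp(-d * d / (4.0 * t))


# Toy term-count vectors over a hypothetical four-word vocabulary.
doc_a = [3, 1, 0, 0]
doc_b = [2, 2, 0, 0]
doc_c = [0, 0, 1, 3]

k_ab = diffusion_kernel(doc_a, doc_b)  # documents with shared vocabulary
k_ac = diffusion_kernel(doc_a, doc_c)  # documents with no shared terms
```

Documents that share no terms sit at the maximum geodesic distance of pi, so their kernel value is the smallest this approximation can produce, whereas a Euclidean distance between the same count vectors would also depend on document length.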