Robust discriminant analysis of latent semantic feature for text categorization

  • Authors:
  • Jiani Hu; Weihong Deng; Jun Guo

  • Affiliations:
  • Beijing University of Posts and Telecommunications, Beijing, China;Beijing University of Posts and Telecommunications, Beijing, China;Beijing University of Posts and Telecommunications, Beijing, China

  • Venue:
  • FSKD'06 Proceedings of the Third International Conference on Fuzzy Systems and Knowledge Discovery
  • Year:
  • 2006

Abstract

This paper proposes a Discriminative Semantic Feature (DSF) method for text categorization based on the vector space model. DSF works in two stages. It first reduces the dimension of the document vector space with Latent Semantic Indexing (LSI). It then applies a Robust linear Discriminant analysis Model (RDM), which improves on classical linear discriminant analysis (LDA) through an energy-adaptive regularization criterion, to extract discriminative semantic features with enhanced discrimination power. As a result, DSF not only uncovers latent semantic structure but also captures discriminative features. DSF is compared experimentally with several state-of-the-art dimension reduction schemes: LSI, orthogonal centroid, two-stage LSI+LDA, LDA/QR, and LDA/GSVD. On the Reuters-21578 text collection, the proposed method performs better than the other algorithms.
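
The two-stage pipeline described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' RDM: the abstract does not specify the energy-adaptive regularization criterion, so the sketch swaps in plain shrinkage of the within-class scatter toward a scaled identity as a stand-in. The function names `fit_lsi` and `fit_regularized_lda`, the parameter `alpha`, and the synthetic term-count data are all assumptions made for illustration.

```python
import numpy as np


def fit_lsi(X, k):
    """Stage 1 (LSI): return a (n_terms, k) projection built from the
    top-k right singular vectors of the document-term matrix X."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:k].T


def fit_regularized_lda(Z, y, alpha=0.1):
    """Stage 2: return a (d, n_classes - 1) discriminant projection.

    ASSUMPTION: shrinkage toward a scaled identity stands in for the
    paper's energy-adaptive regularization, which the abstract does
    not define.
    """
    classes = np.unique(y)
    d = Z.shape[1]
    overall_mean = Z.mean(axis=0)
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in classes:
        Zc = Z[y == c]
        mc = Zc.mean(axis=0)
        centered = Zc - mc
        Sw += centered.T @ centered
        diff = (mc - overall_mean)[:, None]
        Sb += len(Zc) * (diff @ diff.T)

    # Regularize Sw so it is well conditioned and invertible.
    Sw_reg = (1.0 - alpha) * Sw + alpha * (np.trace(Sw) / d) * np.eye(d)

    # Whiten with Sw_reg, then diagonalize Sb in the whitened space.
    w, V = np.linalg.eigh(Sw_reg)
    whiten = V / np.sqrt(w)  # scales column j of V by 1 / sqrt(w[j])
    evals, evecs = np.linalg.eigh(whiten.T @ Sb @ whiten)
    top = np.argsort(evals)[::-1][: len(classes) - 1]
    return whiten @ evecs[:, top]


# Hypothetical toy corpus: 3 classes, each favoring its own block of 10 terms.
rng = np.random.default_rng(0)
n_per_class, n_terms = 20, 30
y = np.repeat(np.arange(3), n_per_class)
X = rng.poisson(1.0, size=(y.size, n_terms)).astype(float)
for c in range(3):
    X[y == c, c * 10:(c + 1) * 10] += 5.0

P = fit_lsi(X, k=10)                           # terms -> latent semantic space
W = fit_regularized_lda(X @ P, y, alpha=0.1)   # latent space -> discriminant space
features = X @ P @ W
print(features.shape)  # (60, 2)
```

With three classes the discriminant stage yields at most two dimensions, so each document ends up as a 2-D feature vector that a simple classifier (for example, nearest centroid) could consume. The value of `alpha` here is arbitrary; in practice it would be chosen by validation.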