Robust discriminant analysis of latent semantic feature for text categorization

Authors:
Jiani Hu;Weihong Deng;Jun Guo
Affiliations:
Beijing University of Posts and Telecommunications, Beijing, China;Beijing University of Posts and Telecommunications, Beijing, China;Beijing University of Posts and Telecommunications, Beijing, China
Venue:
FSKD'06 Proceedings of the Third international conference on Fuzzy Systems and Knowledge Discovery
Year:
2006


Abstract

This paper proposes a Discriminative Semantic Feature (DSF) method for text categorization based on the vector space model. The DSF method has two stages: it first reduces the dimension of the document vector space by Latent Semantic Indexing (LSI), and then applies a Robust linear Discriminant analysis Model (RDM), which improves classical LDA with an energy-adaptive regularization criterion, to extract discriminative semantic features with enhanced discrimination power. As a result, the DSF method not only uncovers latent semantic structure but also captures discriminative features. Comparative experiments are performed on several state-of-the-art dimension reduction schemes: the proposed DSF, LSI, orthogonal centroid, two-stage LSI+LDA, LDA/QR, and LDA/GSVD. Experiments on the Reuters-21578 text collection show that the proposed method performs better than the other algorithms.
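
The two-stage structure described in the abstract (LSI followed by a regularized discriminant projection) can be illustrated with a minimal NumPy sketch. This is not the authors' RDM: the abstract does not specify the energy-adaptive regularization criterion, so the code below substitutes a generic ridge-style shrinkage of the within-class scatter matrix. The function names `lsi_project` and `regularized_lda`, the parameter `alpha`, and the toy term counts are all assumptions made for illustration.

```python
import numpy as np


def lsi_project(X, k):
    """Stage 1 (LSI): return a (n_terms, k) basis from a truncated SVD.

    X is a documents-by-terms matrix; the basis is the top-k right
    singular vectors.
    """
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:k].T


def regularized_lda(Z, y, alpha=0.1):
    """Stage 2: LDA on LSI features with a generic shrinkage regularizer.

    ASSUMPTION: the shrinkage term alpha * mean_eigenvalue * I stands in
    for the paper's energy-adaptive criterion, which the abstract does
    not define.
    """
    y = np.asarray(y)
    classes = np.unique(y)
    d = Z.shape[1]
    overall_mean = Z.mean(axis=0)
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in classes:
        Zc = Z[y == c]
        mc = Zc.mean(axis=0)
        Sw += (Zc - mc).T @ (Zc - mc)
        diff = (mc - overall_mean)[:, None]
        Sb += len(Zc) * (diff @ diff.T)
    # Shrink Sw toward a scaled identity so it is safely invertible.
    Sw_reg = Sw + alpha * (np.trace(Sw) / d) * np.eye(d)
    # Solve Sb w = lambda Sw_reg w via Cholesky whitening (symmetric problem).
    L = np.linalg.cholesky(Sw_reg)
    A = np.linalg.solve(L, Sb)
    M = np.linalg.solve(L, A.T).T
    M = (M + M.T) / 2.0  # remove round-off asymmetry
    _, V = np.linalg.eigh(M)  # eigenvalues in ascending order
    V = V[:, ::-1][:, : len(classes) - 1]  # keep the top (n_classes - 1)
    return np.linalg.solve(L.T, V)  # map back to the LSI feature space


# Toy documents-by-terms counts (hypothetical): two classes, three documents each.
X = np.array([
    [3, 2, 0, 0, 1],
    [2, 3, 1, 0, 0],
    [3, 3, 0, 1, 0],
    [0, 0, 3, 2, 1],
    [0, 1, 2, 3, 0],
    [1, 0, 3, 3, 0],
], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])

P = lsi_project(X, k=3)       # stage 1: latent semantic basis
Z = X @ P                     # documents in the LSI space
W = regularized_lda(Z, y)     # stage 2: discriminant directions
F = Z @ W                     # final discriminative features
print(F.shape)
```

With two classes the sketch yields a single discriminant feature per document. In practice the input would be a term-weighted matrix rather than raw counts, and both `k` and `alpha` would need to be tuned on held-out data.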