Algorithms for clustering data
Introduction to statistical pattern recognition (2nd ed.)
Concept decompositions for large sparse text data using clustering
Machine Learning
SIAM Journal on Matrix Analysis and Applications
Pattern Classification (2nd Edition)
Efficient Nonlinear Dimension Reduction for Clustered Data Using Kernel Functions
ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
A new optimization criterion for discriminant analysis is presented. The new criterion extends the optimization criteria of the classical linear discriminant analysis (LDA) by introducing the pseudo-inverse when the scatter matrices are singular. It is applicable regardless of the relative sizes of the data dimension and sample size, overcoming a limitation of the classical LDA. Recently, a new algorithm called LDA/GSVD for structure-preserving dimension reduction has been introduced, which extends the classical LDA to very high-dimensional undersampled problems by using the generalized singular value decomposition (GSVD). The solution from the LDA/GSVD algorithm is a special case of the solution for our generalized criterion in this paper, which is also based on the GSVD. We also present an approximation of our GSVD-based solution, which reduces computational complexity by finding sub-clusters of each cluster and using their centroids to capture the structure of each cluster. This reduced problem yields much smaller matrices to which the GSVD can be applied efficiently. Experiments on text data, with up to 7000 dimensions, show that the approximation algorithm produces results that are close to those produced by the exact algorithm.
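The pseudo-inverse idea in the abstract can be sketched in a few lines: replace the inverse of the within-class scatter matrix with its pseudo-inverse so the criterion stays defined when that matrix is singular. The sketch below is an illustrative NumPy implementation of this generalized eigenvalue formulation, not the paper's LDA/GSVD algorithm; the function name `lda_pinv` and its interface are assumptions.

```python
import numpy as np

def lda_pinv(X, y, k):
    """Top-k discriminant directions via a pseudo-inverse variant of LDA.

    Uses pinv(Sw) in place of inv(Sw), so the criterion remains defined
    when the within-class scatter Sw is singular (e.g. when the data
    dimension exceeds the number of samples). Illustrative sketch only.
    """
    classes = np.unique(y)
    mu = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mu).reshape(-1, 1)
        Sb += len(Xc) * (diff @ diff.T)
    # Eigenvectors of pinv(Sw) @ Sb give the discriminant directions;
    # sort by eigenvalue magnitude and keep the leading k.
    evals, evecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(-evals.real)
    return evecs[:, order[:k]].real

# Usage: two well-separated classes in 5 dimensions, reduced to 1.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 5)), rng.normal(3, 1, (20, 5))])
y = np.array([0] * 20 + [1] * 20)
G = lda_pinv(X, y, 1)      # (5, 1) projection matrix
proj = X @ G               # 1-D projection separating the classes
```

The abstract's approximation would then apply the same construction to a much smaller problem, with each cluster summarized by the centroids of its sub-clusters rather than by all of its points.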