Document clustering using nonnegative matrix factorization

Authors:
Farial Shahnaz;Michael W. Berry;V. Paul Pauca;Robert J. Plemmons
Affiliations:
Department of Computer Science, University of Tennessee, Knoxville, TN;Department of Computer Science, University of Tennessee, Knoxville, TN;Department of Computer Science, Wake Forest University, Winston-Salem, NC;Department of Computer Science, Wake Forest University, Winston-Salem, NC
Venue:
Information Processing and Management: an International Journal
Year:
2006

Citing 5

Understanding search engines: mathematical modeling and text retrieval
Matrices, Vector Spaces, and Information Retrieval. SIAM Review
Data Mining: Introductory and Advanced Topics
Document clustering based on non-negative matrix factorization. Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
Emergence of Phase- and Shift-Invariant Features by Decomposition of Natural Images into Independent Feature Subspaces. Neural Computation

Cited 13

Email Surveillance Using Non-negative Matrix Factorization. Computational & Mathematical Organization Theory
Nonnegative matrix factorization with constrained second-order optimization. Signal Processing
Inference and evaluation of the multinomial mixture model for text clustering. Information Processing and Management: an International Journal
SVD based initialization: A head start for nonnegative matrix factorization. Pattern Recognition
Non-negative matrix factorization with α-divergence. Pattern Recognition Letters
Nonnegative matrix factorization with quadratic programming. Neurocomputing
Gene tree labeling using nonnegative matrix factorization on biomedical literature. Computational Intelligence and Neuroscience - Advances in Nonnegative Matrix and Tensor Factorization
Structural Identifiability in Low-Rank Matrix Factorization. COCOON '08 Proceedings of the 14th annual international conference on Computing and Combinatorics
Fast nonnegative matrix factorization algorithms using projected gradient approaches for large-scale problems. Computational Intelligence and Neuroscience - Advances in Nonnegative Matrix and Tensor Factorization
A new visual search interface for web browsing. Proceedings of the Second ACM International Conference on Web Search and Data Mining
Nonnegative factor analysis for text document clustering. SMO'09 Proceedings of the 9th WSEAS international conference on Simulation, modelling and optimization
Tumor clustering using nonnegative matrix factorization with gene selection. IEEE Transactions on Information Technology in Biomedicine - Special section on biomedical informatics
Nonnegative Matrix Factorization on Orthogonal Subspace. Pattern Recognition Letters

Abstract

A methodology for automatically identifying and clustering semantic features or topics in a heterogeneous text collection is presented. Textual data is encoded using a low-rank nonnegative matrix factorization algorithm that retains the natural nonnegativity of the data, thereby eliminating the subtractive basis vector and encoding calculations required by other techniques for semantic feature abstraction, such as principal component analysis. Existing techniques for nonnegative matrix factorization are reviewed and a new hybrid technique for nonnegative matrix factorization is proposed. Performance evaluations of the proposed method are conducted on a few benchmark text collections used in standard topic detection studies.
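
To make the idea in the abstract concrete, the following is a minimal sketch of clustering documents with a low-rank nonnegative factorization. It approximates a nonnegative term-by-document matrix A by a product W H with W and H nonnegative, then assigns each document to the feature with the largest weight in its column of H. The sketch uses the standard multiplicative update rules for the Frobenius-norm objective; it is not the hybrid technique the abstract proposes, whose details are not given here. The function name nmf_multiplicative, the toy matrix, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np


def nmf_multiplicative(A, k, n_iter=500, eps=1e-9, seed=0):
    """Approximate a nonnegative matrix A (terms x documents) as W @ H.

    Uses standard multiplicative updates for the squared Frobenius error.
    This is an illustrative stand-in, not the paper's hybrid method.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    # Strictly positive random starting point (an assumption of this sketch).
    W = rng.random((m, k)) + eps
    H = rng.random((k, n)) + eps
    for _ in range(n_iter):
        # Each update multiplies by a nonnegative ratio, so W and H stay >= 0.
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Hypothetical toy data: 6 terms (rows) by 4 documents (columns).
# Documents 0-1 use only the first three terms, documents 2-3 the last three.
A = np.array([
    [2, 3, 0, 0],
    [1, 2, 0, 0],
    [3, 3, 0, 0],
    [0, 0, 2, 1],
    [0, 0, 3, 2],
    [0, 0, 1, 1],
], dtype=float)

W, H = nmf_multiplicative(A, k=2)

# Cluster label of each document: index of its dominant feature in H.
labels = H.argmax(axis=0)
print("document labels:", labels)
print("relative error:", np.linalg.norm(A - W @ H) / np.linalg.norm(A))
```

Because both factors are nonnegative, each column of W can be read as an additive group of weighted terms and each column of H as the mix of those groups in one document. Which integer label a group receives depends on the random starting point, so only the grouping of documents is meaningful, not the label values themselves.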