Document clustering using nonnegative matrix factorization

Authors:
Farial Shahnaz;Michael W. Berry;V. Paul Pauca;Robert J. Plemmons
Affiliations:
Department of Computer Science, University of Tennessee, Knoxville, TN 37996-3450, USA;Department of Computer Science, University of Tennessee, Knoxville, TN 37996-3450, USA;Department of Computer Science, Wake Forest University, Winston-Salem, NC 27109, USA;Department of Computer Science, Wake Forest University, Winston-Salem, NC 27109, USA
Venue:
Information Processing and Management: an International Journal
Year:
2006

Abstract

A methodology for automatically identifying and clustering semantic features or topics in a heterogeneous text collection is presented. Textual data is encoded using a low-rank nonnegative matrix factorization algorithm that retains the natural nonnegativity of the data, thereby eliminating the subtractive basis vector and encoding calculations required by other techniques, such as principal component analysis, for semantic feature abstraction. Existing techniques for nonnegative matrix factorization are reviewed, and a new hybrid technique for nonnegative matrix factorization is proposed. The proposed method is evaluated on a few benchmark text collections used in standard topic detection studies.
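
To make the encoding step concrete, the following is a minimal sketch of low-rank nonnegative matrix factorization applied to a toy term-by-document matrix. It uses the standard multiplicative update rule for the Frobenius-norm objective, which is one of the existing techniques in the literature and is not the hybrid technique proposed in the paper. The function names, the toy matrix, the iteration count, and the rule of assigning each document to its largest encoding coefficient are all assumptions made for illustration.

```python
import numpy as np


def nmf_multiplicative(A, k, n_iter=200, eps=1e-9, seed=0):
    """Approximate a nonnegative matrix A (terms x documents) as W @ H.

    W (terms x k) holds nonnegative basis vectors (semantic features) and
    H (k x documents) holds nonnegative encodings. This is the standard
    multiplicative update scheme, used here only as an illustration.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    # Strictly positive random start so the updates never divide by zero.
    W = rng.random((m, k)) + eps
    H = rng.random((k, n)) + eps
    for _ in range(n_iter):
        # Elementwise updates keep W and H nonnegative at every step,
        # so no subtractive combinations of basis vectors ever appear.
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H


def cluster_documents(H):
    """Assign each document (column of H) to its dominant feature (assumed rule)."""
    return H.argmax(axis=0)


# Hypothetical toy data: 5 terms (rows) x 4 documents (columns).
# Documents 0-1 share one vocabulary, documents 2-3 share another.
A = np.array(
    [
        [3, 2, 0, 0],
        [2, 3, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 4, 3],
        [0, 0, 3, 4],
    ],
    dtype=float,
)

W, H = nmf_multiplicative(A, k=2)
labels = cluster_documents(H)
print("cluster labels:", labels)
print("relative error:", np.linalg.norm(A - W @ H) / np.linalg.norm(A))
```

Because every entry of W and H stays nonnegative, each document is described as an additive mixture of features, which is the property the abstract contrasts with principal component analysis. The cluster indices themselves are arbitrary; only the grouping of documents is meaningful.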