To derive high-quality information from text, the field of text mining has advanced swiftly from simple document clustering to co-clustering with words and categories. However, document co-clustering without any prior knowledge or background information is a challenging problem. In this paper, we propose a Semi-Supervised Non-negative Matrix Factorization (SS-NMF) framework for document co-clustering. Our method computes new word-document and document-category matrices by incorporating user-provided constraints through simultaneous distance metric learning and modality selection. Using an iterative algorithm, we perform tri-factorization of the new matrices to infer the document, category, and word clusters. Theoretically, we show the convergence and correctness of SS-NMF co-clustering and its advantages over existing approaches. Through extensive experiments conducted on publicly available data sets, we demonstrate the superior performance of SS-NMF for document co-clustering.
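To make the tri-factorization step concrete, the following is a minimal sketch of a generic, unsupervised non-negative matrix tri-factorization with standard multiplicative updates. It is not the SS-NMF algorithm itself: it omits the user-provided constraints, the distance metric learning, and the modality selection, and it factors a single word-document matrix rather than the paired word-document and document-category matrices. The function name `nmtf`, its parameters, and the toy matrix are assumptions introduced only for illustration.

```python
import numpy as np


def nmtf(X, k_rows, k_cols, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative matrix X as F @ S @ G.T.

    Illustrative sketch only (hypothetical helper, not the paper's method).
    F (rows x k_rows) gives row-cluster memberships, G (cols x k_cols) gives
    column-cluster memberships, and S links the two sets of clusters.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    # Strictly positive random starting point.
    F = rng.random((n, k_rows)) + eps
    S = rng.random((k_rows, k_cols)) + eps
    G = rng.random((m, k_cols)) + eps
    losses = []
    for _ in range(n_iter):
        # Multiplicative updates for the squared Frobenius reconstruction error;
        # eps guards against division by zero.
        F *= (X @ G @ S.T) / (F @ S @ G.T @ G @ S.T + eps)
        G *= (X.T @ F @ S) / (G @ S.T @ F.T @ F @ S + eps)
        S *= (F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps)
        losses.append(float(np.linalg.norm(X - F @ S @ G.T) ** 2))
    return F, S, G, losses


# Toy word-document count matrix (5 words x 4 documents), made up for this example.
X = np.array(
    [
        [5, 4, 0, 0],
        [4, 5, 0, 0],
        [0, 0, 3, 4],
        [0, 0, 4, 3],
        [0, 1, 5, 4],
    ],
    dtype=float,
)

F, S, G, losses = nmtf(X, k_rows=2, k_cols=2)
word_clusters = F.argmax(axis=1)  # hard word-cluster label per row
doc_clusters = G.argmax(axis=1)   # hard document-cluster label per column
print("word clusters:", word_clusters)
print("document clusters:", doc_clusters)
print("reconstruction error: %.4f -> %.4f" % (losses[0], losses[-1]))
```

Reading cluster labels off the largest entry in each row of `F` and `G` is one common convention, chosen here for simplicity. A semi-supervised variant such as the one described above would first transform the input matrices using the constraints before factorizing them.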