Constrained co-clustering with non-negative matrix factorisation

Authors:
Amit Salunke; Xumin Liu; Manjeet Rege
Affiliations:
Department of Computer Science, Rochester Institute of Technology, Rochester, NY, USA; Department of Computer Science, Rochester Institute of Technology, Rochester, NY, USA; Department of Computer Science, Rochester Institute of Technology, Rochester, NY, USA
Venue:
International Journal of Business Intelligence and Data Mining
Year:
2012

Abstract

Co-clustering is the problem of deriving sub-matrices of a data matrix by simultaneously clustering its rows (data instances) and columns (features). Although co-clustering is very effective at discovering useful knowledge, many co-clustering algorithms are completely unsupervised. Integrating domain knowledge can guide the co-clustering process and greatly enhance overall performance. We propose a semi-supervised non-negative matrix factorisation (SS-NMF) based framework that integrates domain knowledge in the form of must-link and cannot-link constraints. Specifically, we augment the data matrix by integrating the constraints using metric learning and then perform NMF to obtain the co-clustering. Under the proposed framework, we present two approaches to integrating domain knowledge: a distance metric learning approach and an information-theoretic metric learning approach. Through experiments on real-world web service data and publicly available text datasets, we demonstrate the performance of the proposed SS-NMF based approach for data co-clustering.
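
The two-stage pipeline described in the abstract (augment the data matrix using the constraints, then factorise it with NMF) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not give the metric-learning formulation, so the hypothetical helper `diagonal_metric_weights` swaps in a simple per-feature reweighting heuristic in its place. The `nmf` function uses the standard multiplicative updates for the Frobenius-norm objective, which may differ from the factorisation used in the paper. All function and parameter names are illustrative.

```python
import numpy as np


def diagonal_metric_weights(X, must_link=(), cannot_link=()):
    """Heuristic per-feature weights (an assumed stand-in for metric learning).

    Features that separate cannot-link pairs are up-weighted; features that
    separate must-link pairs are down-weighted.
    """
    n_features = X.shape[1]
    ml = np.zeros(n_features)
    cl = np.zeros(n_features)
    for i, j in must_link:
        ml += (X[i] - X[j]) ** 2
    for i, j in cannot_link:
        cl += (X[i] - X[j]) ** 2
    w = (cl + 1.0) / (ml + 1.0)
    return w / w.mean()  # normalise so the average weight is 1


def nmf(X, k, n_iter=200, seed=0, eps=1e-9):
    """Plain NMF, X ~ W @ H, via multiplicative updates (Frobenius loss)."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


def constrained_cocluster(X, k, must_link=(), cannot_link=()):
    """Reweight the features using the constraints, then co-cluster with NMF."""
    w = diagonal_metric_weights(X, must_link, cannot_link)
    X_aug = X * np.sqrt(w)  # scaling by positive weights keeps X non-negative
    W, H = nmf(X_aug, k)
    row_labels = W.argmax(axis=1)  # cluster of each data instance
    col_labels = H.argmax(axis=0)  # cluster of each feature
    return row_labels, col_labels


# Toy usage: six instances, six features, two planted blocks.
X = np.kron(np.eye(2), np.ones((3, 3)))
rows, cols = constrained_cocluster(
    X, k=2, must_link=[(0, 1)], cannot_link=[(0, 3)]
)
```

Reading the row clusters from `W` and the column clusters from `H` is what makes this a co-clustering rather than a one-sided clustering. A diagonal reweighting is used here only because it keeps the augmented matrix non-negative, which NMF requires.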