Flexible and robust co-regularized multi-domain graph clustering

Authors:
Wei Cheng;Xiang Zhang;Zhishan Guo;Yubao Wu;Patrick F. Sullivan;Wei Wang
Affiliations:
University of North Carolina at Chapel Hill, Carrboro, N. Carolina, USA;Case Western Reserve University, Cleveland, USA;University of North Carolina at Chapel Hill, Chapel Hill, N. Carolina, USA;Case Western Reserve University, Cleveland, USA;University of North Carolina at Chapel Hill, Chapel Hill, N. Carolina, USA;University of California at Los Angeles, Los Angeles, USA
Venue:
Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2013

Abstract

Multi-view graph clustering aims to enhance clustering performance by integrating heterogeneous information collected in different domains. Each domain provides a different view of the data instances. Leveraging cross-domain information has been demonstrated to be an effective way to achieve better clustering results. Despite this success, existing multi-view graph clustering methods usually assume that different views are available for the same set of instances, so that instances in different domains can be treated as having a strict one-to-one relationship. In many real-life applications, however, a data instance in one domain may correspond to multiple instances in another domain. Moreover, relationships between instances in different domains may be associated with weights based on prior (partial) knowledge. In this paper, we propose a flexible and robust framework, CGC (Co-regularized Graph Clustering), based on non-negative matrix factorization (NMF), to tackle these challenges. CGC has several advantages over existing methods. First, it supports many-to-many cross-domain instance relationships. Second, it incorporates weights on cross-domain relationships. Third, it allows partial cross-domain mapping, so that graphs in different domains may have different sizes. Finally, it reports the extent to which each cross-domain instance relationship violates the in-domain clustering structure, and thus enables users to re-evaluate the consistency of the relationship. Extensive experimental results on UCI benchmark data sets, newsgroup data sets and biological interaction networks demonstrate the effectiveness of our approach.
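
To make the idea of co-regularizing NMF-based clusterings across two graphs of different sizes concrete, here is a minimal sketch. It is not the CGC algorithm itself: the abstract gives neither the objective nor the update rules, so everything below is an assumption for illustration. That covers the symmetric factorization loss ||A - H H^T||^2 for each domain, the cross-domain penalty ||S H1 - H2||^2 restricted to mapped instances, the projected gradient updates, and all names (`co_regularized_nmf`, `S`, `lam`, `step`, `violation`).

```python
import numpy as np


def residual(S, H1, H2):
    """Cross-domain disagreement, zeroed for domain-2 instances with no mapping."""
    mapped = (S.sum(axis=1) > 0)[:, None]  # partial mapping: skip unmapped rows
    return (S @ H1 - H2) * mapped


def objective(A1, A2, S, H1, H2, lam):
    """Assumed loss: two symmetric NMF fits plus a weighted cross-domain penalty."""
    fit1 = np.linalg.norm(A1 - H1 @ H1.T) ** 2
    fit2 = np.linalg.norm(A2 - H2 @ H2.T) ** 2
    cross = np.linalg.norm(residual(S, H1, H2)) ** 2
    return fit1 + fit2 + lam * cross


def co_regularized_nmf(A1, A2, S, k, lam=1.0, step=1e-3, iters=2000, seed=0):
    """Projected gradient descent on the assumed objective (hypothetical sketch)."""
    rng = np.random.default_rng(seed)
    H1 = rng.random((A1.shape[0], k))
    H2 = rng.random((A2.shape[0], k))
    for _ in range(iters):
        R = residual(S, H1, H2)
        # Gradients of the assumed loss; A1 and A2 are symmetric.
        g1 = -4 * (A1 - H1 @ H1.T) @ H1 + 2 * lam * S.T @ R
        g2 = -4 * (A2 - H2 @ H2.T) @ H2 - 2 * lam * R
        # Project back onto the non-negative orthant.
        H1 = np.maximum(H1 - step * g1, 0)
        H2 = np.maximum(H2 - step * g2, 0)
    return H1, H2


# Toy data: domain 1 has 4 nodes, domain 2 has 6 nodes, two clusters each.
A1 = np.kron(np.eye(2), np.ones((2, 2)))
A2 = np.kron(np.eye(2), np.ones((3, 3)))

# S[i, j] is the weight linking domain-2 instance i to domain-1 instance j.
# Rows 0 and 3 are one-to-many; row 5 is unmapped (partial mapping).
S = np.array([
    [0.5, 0.5, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.7, 0.3],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 0.0],
])

H1_0, H2_0 = co_regularized_nmf(A1, A2, S, k=2, iters=0)  # random start only
H1, H2 = co_regularized_nmf(A1, A2, S, k=2)

init_obj = objective(A1, A2, S, H1_0, H2_0, lam=1.0)
final_obj = objective(A1, A2, S, H1, H2, lam=1.0)

# Hypothetical per-instance inconsistency score for the cross-domain mapping.
violation = np.linalg.norm(residual(S, H1, H2), axis=1)
labels1, labels2 = H1.argmax(axis=1), H2.argmax(axis=1)
```

In this sketch, cluster labels are read off as the largest entry in each row of `H1` and `H2`, and `violation` is one possible stand-in for the relationship-consistency feedback the abstract mentions. A larger `lam` pulls the two clusterings toward agreement at the expense of in-domain fit.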