Motivation: Protein–Protein Interaction (PPI) networks are believed to be important sources of information related to biological processes and complex metabolic functions of the cell. The presence of biologically relevant functional modules in these networks has been theorized by many researchers. However, the application of traditional clustering algorithms for extracting these modules has not been successful, largely due to the presence of noisy false positive interactions as well as specific topological challenges in the network.

Results: In this article, we propose an ensemble clustering framework to address this problem. For base clustering, we introduce two topology-based distance metrics to counteract the effects of noise. We develop a PCA-based consensus clustering technique, designed to reduce the dimensionality of the consensus problem and yield informative clusters. We also develop a soft consensus clustering variant to assign multifaceted proteins to multiple functional groups. We conduct an empirical evaluation of different consensus techniques using topology-based, information-theoretic and domain-specific validation metrics and show that our approaches can provide significant benefits over other state-of-the-art approaches. Our analysis of the consensus clusters obtained demonstrates that ensemble clustering can (a) produce improved biologically significant functional groupings; and (b) facilitate soft clustering by discovering multiple functional associations for proteins.

Contact: srini@cse.ohio-state.edu

Supplementary information: Supplementary data are available at Bioinformatics online.
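
To make the idea of consensus over an ensemble of base clusterings concrete, the following is a minimal sketch only. It does not reproduce the PCA-based consensus technique or the topology-based distance metrics described above, whose details are not given here. Instead it substitutes a simpler, well-known stand-in: a co-association matrix (the fraction of base clusterings that place two proteins together), a hard consensus taken as connected components above a threshold, and a soft variant that lets a protein join every cluster it is sufficiently associated with. The function names, the toy labelings, and the values of `threshold` and `min_affinity` are all assumptions chosen for illustration.

```python
def co_association(labelings):
    """Fraction of base clusterings in which each pair of items shares a cluster."""
    n = len(labelings[0])
    counts = [[0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(labelings) for c in row] for row in counts]


def consensus_clusters(co, threshold=0.5):
    """Hard consensus: connected components of the graph that links
    every pair whose co-association exceeds `threshold` (assumed value)."""
    n = len(co)
    seen, clusters = set(), []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            i = stack.pop()
            component.append(i)
            for j in range(n):
                if j not in seen and co[i][j] > threshold:
                    seen.add(j)
                    stack.append(j)
        clusters.append(sorted(component))
    return clusters


def soft_memberships(co, clusters, min_affinity=0.3):
    """Soft variant: an item belongs to every cluster whose members it
    co-occurs with at least `min_affinity` of the time on average (assumed value)."""
    memberships = {}
    for i in range(len(co)):
        memberships[i] = [
            k for k, members in enumerate(clusters)
            if sum(co[i][j] for j in members) / len(members) >= min_affinity
        ]
    return memberships


# Toy ensemble: three hypothetical base clusterings of six proteins (indices 0..5).
# Protein 2 is grouped with {0, 1} twice and with {3, 4, 5} once.
labelings = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]

co = co_association(labelings)
clusters = consensus_clusters(co)
soft = soft_memberships(co, clusters)
print(clusters)
print(soft)
```

In this toy run, protein 2 ends up in a single hard cluster but is associated with both clusters under the soft variant, which mirrors the motivation for assigning multifaceted proteins to more than one functional group.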