Protein complex prediction via cost-based clustering
Bioinformatics
Iterative Cluster Analysis of Protein Interaction Data
Bioinformatics
BIBE '05 Proceedings of the Fifth IEEE Symposium on Bioinformatics and Bioengineering
Computational Biology and Chemistry
Cellular functionality is typically envisioned as hierarchically organized, so in this paper we propose a framework for identifying modules (or clusters) within protein-protein interaction (PPI) networks. We present a formal definition of a module in a PPI network, based on the within-module and between-module edges of subgraphs and on the degree distribution. Using this definition, we introduce a quantitative measure for evaluating a partition of a PPI network. Because functional modules are hierarchical in nature, we develop a hierarchical agglomerative clustering algorithm based on the new measure to detect protein complexes within PPI networks. We validate the biological significance of the predicted complexes against gold-standard sets of protein complexes, and we perform a comprehensive comparison between our method and four other representative methods. The results show that our algorithm finds more protein complexes of high biological significance and achieves a significant improvement over the compared methods. Furthermore, the complexes predicted by our method, whether dense or sparse, match well with known biological characteristics.
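The general shape of agglomerative clustering driven by a partition-quality measure can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the paper's measure, so the sketch substitutes the standard Newman modularity as a stand-in score, and the function names (`modularity`, `agglomerate`) and the toy edge list are hypothetical. It should not be read as the authors' algorithm.

```python
from itertools import combinations


def modularity(edges, clusters):
    """Newman modularity of a partition (stand-in score, NOT the paper's measure)."""
    m = len(edges)
    label = {v: i for i, c in enumerate(clusters) for v in c}
    inside = [0] * len(clusters)  # edges with both endpoints in cluster i
    degree = [0] * len(clusters)  # summed node degree of cluster i
    for u, v in edges:
        degree[label[u]] += 1
        degree[label[v]] += 1
        if label[u] == label[v]:
            inside[label[u]] += 1
    return sum(inside[i] / m - (degree[i] / (2 * m)) ** 2
               for i in range(len(clusters)))


def agglomerate(edges):
    """Start from singletons; repeatedly apply the merge that raises the score most."""
    nodes = sorted({v for e in edges for v in e})
    clusters = [frozenset([v]) for v in nodes]
    best = modularity(edges, clusters)
    while len(clusters) > 1:
        candidates = []
        for a, b in combinations(clusters, 2):
            merged = [c for c in clusters if c not in (a, b)] + [a | b]
            candidates.append((modularity(edges, merged), merged))
        score, merged = max(candidates, key=lambda t: t[0])
        if score <= best:  # no merge improves the partition: stop
            break
        best, clusters = score, merged
    return clusters, best


# Hypothetical toy network: two triangles joined by a single bridge edge.
edges = [("A", "B"), ("B", "C"), ("A", "C"),
         ("D", "E"), ("E", "F"), ("D", "F"),
         ("C", "D")]
clusters, score = agglomerate(edges)
```

On this toy graph the procedure stops with the two triangles as separate clusters, since merging them across the bridge would lower the score. Recomputing the score for every candidate merge is quadratic per step and is done here only for clarity; a practical implementation would update the score incrementally.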