Research Article: A degree-distribution based hierarchical agglomerative clustering algorithm for protein complexes identification

  • Authors:
  • Liang Yu;Lin Gao;Kui Li;Yi Zhao;David K. Y. Chiu

  • Affiliations:
  • School of Computer Science and Technology Xidian University, Xi'an 710071, China;School of Computer Science and Technology Xidian University, Xi'an 710071, China;School of Computer Science and Technology Xidian University, Xi'an 710071, China;Bioinformatics Group, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China;School of Computer Science, University of Guelph, Guelph, Canada

  • Venue:
  • Computational Biology and Chemistry
  • Year:
  • 2011

Abstract

Because cellular functionality is typically envisioned as having a hierarchical structure, we propose a framework for identifying modules (or clusters) within protein-protein interaction (PPI) networks. We present a formal definition of a module in a PPI network, based on the within-module and between-module edges of subgraphs and on the degree distribution. Using this definition, we introduce a quantitative measure for evaluating a partition of a PPI network. Because functional modules are hierarchical in nature, we develop a hierarchical agglomerative clustering algorithm based on the new measure to detect complexes within PPI networks. We validate the biological significance of the predicted complexes against gold-standard sets of protein complexes, and we compare our method comprehensively with four other representative methods. The results show that our algorithm finds more protein complexes with high biological significance, a significant improvement over the compared methods. Furthermore, the complexes predicted by our method, whether dense or sparse, match well with known biological characteristics.
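
The abstract does not give the module definition or the partition measure, so the following is only a minimal sketch of the general idea: agglomerative clustering of a network driven by a partition-quality score. It assumes Newman's modularity as a stand-in score, which also contrasts within-cluster edges with a degree-based expectation but is not the measure proposed in the paper. The function names `modularity` and `agglomerate` and the toy edge list are hypothetical.

```python
from itertools import combinations


def modularity(edges, clusters):
    """Newman modularity of a partition (a stand-in, NOT the paper's measure)."""
    m = len(edges)
    label = {node: i for i, c in enumerate(clusters) for node in c}
    inside = [0] * len(clusters)  # number of within-cluster edges
    degree = [0] * len(clusters)  # summed node degrees per cluster
    for u, v in edges:
        degree[label[u]] += 1
        degree[label[v]] += 1
        if label[u] == label[v]:
            inside[label[u]] += 1
    return sum(
        inside[i] / m - (degree[i] / (2 * m)) ** 2 for i in range(len(clusters))
    )


def agglomerate(edges):
    """Start from singletons; repeatedly apply the merge that raises the score most."""
    nodes = sorted({n for e in edges for n in e})
    clusters = [frozenset([n]) for n in nodes]
    best = modularity(edges, clusters)
    while len(clusters) > 1:
        candidates = []
        for a, b in combinations(clusters, 2):
            merged = [c for c in clusters if c not in (a, b)] + [a | b]
            candidates.append((modularity(edges, merged), merged))
        score, merged = max(candidates, key=lambda t: t[0])
        if score <= best:  # no merge improves the partition: stop
            break
        best, clusters = score, merged
    return clusters, best


# Toy network (hypothetical): two triangles joined by one bridging edge.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"),
             ("d", "e"), ("e", "f"), ("d", "f"),
             ("c", "d")]
toy_clusters, toy_score = agglomerate(toy_edges)
```

On the toy network the sketch stops once the two triangles are separate clusters, since merging them across the single bridging edge would lower the score. Rescoring every candidate merge from scratch keeps the code short but is far too slow for real PPI networks; a practical implementation would update the score incrementally.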