Markov clustering of protein interaction networks with improved balance and scalability

  • Authors:
  • Venu Satuluri;Srinivasan Parthasarathy;Duygu Ucar

  • Affiliations:
  • The Ohio State University;The Ohio State University;University of Iowa

  • Venue:
  • Proceedings of the First ACM International Conference on Bioinformatics and Computational Biology
  • Year:
  • 2010

Abstract

Markov Clustering (MCL) is a popular algorithm for clustering networks in bioinformatics, such as protein-protein interaction networks and protein similarity networks. An important requirement when clustering protein networks is to minimize the number of large clusters, since it is generally understood that protein complexes tend not to have more than 15-30 nodes. Similarly, it is important not to output too many singleton clusters, since they provide little useful information. In this paper, we show how MCL may be modified to better respect these two requirements while still taking the link structure of the graph into account. We design our algorithm on top of Regularized MCL (R-MCL) [16], a previously proposed modification of MCL. Our proposed variation computes a new regularization matrix at each iteration that penalizes large cluster sizes, with the size of the penalty tunable through a balance parameter. The algorithm also fits naturally into a multi-level framework that allows large improvements in speed. We perform experiments on three real protein interaction networks and show significant improvements over MCL in quality, balance, and execution speed.
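
The abstract gives no formulas, so the following is only a minimal Python sketch of the general idea, not the authors' implementation. It assumes a column-stochastic flow matrix, an R-MCL-style update (multiply the current flow by a regularization matrix, then inflate), and one plausible form of the size penalty: each neighbor is down-weighted by the "mass" of the attractors it currently flows to, raised to the balance parameter. The function names, the propensity formula, and the default parameter values are all assumptions made for illustration.

```python
def normalize_columns(m):
    """Scale each column to sum to 1 (column-stochastic); zero columns stay zero."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] > 0 else 0.0 for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain square matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, r):
    """Raise every entry to the power r, then renormalize the columns."""
    return normalize_columns([[x ** r for x in row] for row in m])


def regularization_matrix(m, m_g, balance):
    """Assumed penalty: down-weight neighbors that flow to heavy attractors.

    mass[i] is the total flow arriving at node i (row sum of m);
    propensity[k] is the flow-weighted mass of the attractors of node k.
    With balance == 0 this reduces to the canonical matrix m_g.
    """
    n = len(m)
    mass = [sum(m[i]) for i in range(n)]
    propensity = [sum(m[i][k] * mass[i] for i in range(n)) for k in range(n)]
    weighted = [[m_g[k][j] / (propensity[k] ** balance) for j in range(n)]
                for k in range(n)]
    return normalize_columns(weighted)


def balanced_rmcl(adj, inflation=2.0, balance=0.5, iterations=30):
    """Return, for each node, the index of the attractor it flows to most."""
    n = len(adj)
    # Add self-loops, as is customary for MCL-style algorithms.
    with_loops = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
                  for i in range(n)]
    m_g = normalize_columns(with_loops)
    m = [row[:] for row in m_g]
    for _ in range(iterations):
        m_r = regularization_matrix(m, m_g, balance)
        m = inflate(matmul(m, m_r), inflation)
    # Column j holds the flow out of node j; its largest entry is the attractor.
    return [max(range(n), key=lambda i: m[i][j]) for j in range(n)]


if __name__ == "__main__":
    # Two triangles joined by the single edge 2-3 (hypothetical toy graph).
    edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
    adj = [[0.0] * 6 for _ in range(6)]
    for u, v in edges:
        adj[u][v] = adj[v][u] = 1.0
    print(balanced_rmcl(adj))
```

In this sketch, setting `balance` to 0 removes the penalty and leaves a plain regularized update, while larger values push flow away from attractors that have already gathered a lot of mass. The multi-level coarsening mentioned in the abstract is not shown.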