Ortholog clustering on a multipartite graph

Authors:
Akshay Vashist; Casimir Kulikowski; Ilya Muchnik
Affiliations:
Department of Computer Science; Department of Computer Science; Department of Computer Science
Venue:
WABI'05 Proceedings of the 5th International conference on Algorithms in Bioinformatics
Year:
2005

Abstract

We present a method for automatically extracting groups of orthologous genes from a large set of genomes, based on a new clustering method on a weighted multipartite graph. The method assigns a score to an arbitrary subset of genes from multiple genomes to assess the orthologous relationships among the genes in the subset. The score is computed from the sequence similarities between the member genes and the phylogenetic relationships between the corresponding genomes. An ortholog cluster is the subset with the highest score, so ortholog clustering is formulated as a combinatorial optimization problem. The algorithm for finding an ortholog cluster runs in O(|E| + |V| log |V|) time, where V and E are the sets of vertices and edges of the graph, respectively. If the similarity scores are discretized into a constant number of bins, the running time improves to O(|E| + |V|). The method was applied to seven complete eukaryotic genomes for which manually curated ortholog clusters, KOGs (eukaryotic ortholog clusters, http://www.ncbi.nlm.nih.gov/COG/new/), have been constructed. A comparison shows that our clusters are well correlated with the manually curated clusters. Finally, we demonstrate how gene order information can be incorporated into the proposed method to improve ortholog detection.
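
The abstract does not define the score function, so the following is only an illustrative sketch of how a highest-scoring subset could be found on a weighted multipartite graph. It assumes (and this is an assumption, not the authors' stated formulation) that each gene's linkage to a subset is the sum of its similarities to subset members from other genomes, and that the score of a subset is the smallest linkage among its members. Under that assumption, repeatedly removing the weakest gene and remembering the best intermediate subset yields a maximizer. The phylogenetic weighting mentioned in the abstract is omitted, the name `best_cluster` and the toy data are hypothetical, and Python's binary heap gives O(|E| log |V|) rather than the O(|E| + |V| log |V|) bound quoted above.

```python
import heapq
from collections import defaultdict


def best_cluster(genome_of, edges):
    """Greedy peeling for an ASSUMED min-linkage score (not the paper's exact score).

    genome_of: dict mapping gene -> genome label
    edges: iterable of (gene_a, gene_b, similarity)
    Returns (score, members) for the largest subset with maximal score.
    """
    # Multipartite graph: keep only edges between genes of different genomes.
    adj = defaultdict(dict)
    for a, b, w in edges:
        if genome_of[a] != genome_of[b]:
            adj[a][b] = w
            adj[b][a] = w

    # Linkage of a gene = total similarity to genes still in the subset.
    linkage = {g: sum(adj[g].values()) for g in genome_of}
    heap = [(value, g) for g, value in linkage.items()]
    heapq.heapify(heap)

    alive = set(genome_of)
    removal_order = []
    best_score, best_step = float("-inf"), 0

    while heap:
        value, g = heapq.heappop(heap)
        if g not in alive or value != linkage[g]:
            continue  # stale heap entry left over from an earlier update
        # `value` is the score of the current subset (its minimum linkage).
        if value > best_score:
            best_score, best_step = value, len(removal_order)
        alive.remove(g)
        removal_order.append(g)
        for neighbor, w in adj[g].items():
            if neighbor in alive:
                linkage[neighbor] -= w
                heapq.heappush(heap, (linkage[neighbor], neighbor))

    # The best subset is everything not yet removed at the best step.
    return best_score, set(removal_order[best_step:])


# Toy data (hypothetical): three genomes A, B, C and five genes.
genome_of = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C"}
edges = [
    ("a1", "b1", 5), ("a1", "c1", 5), ("b1", "c1", 5),  # mutually similar triple
    ("a2", "b1", 1), ("b2", "c1", 2),                   # weak attachments
]
score, members = best_cluster(genome_of, edges)
```

On this toy input the weakly attached genes `a2` and `b2` are peeled off first, leaving the mutually similar triple. If similarities were discretized into a constant number of integer bins, the heap could in principle be replaced by an array of buckets indexed by linkage value, which is one common way to reach linear-time behavior; whether this matches the authors' construction is not stated in the abstract.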