A Unified Approach for Reconstructing Ancient Gene Clusters

Authors:
Jens Stoye;Roland Wittler
Affiliations:
Bielefeld University, Bielefeld;Bielefeld University, Bielefeld
Venue:
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Year:
2009

Abstract

The order of genes in genomes provides extensive information. In comparative genomics, differences or similarities of gene orders are determined to predict functional relations of genes or phylogenetic relations of genomes. For this purpose, various combinatorial models can be used to identify gene clusters—groups of genes that are colocated in a set of genomes. We introduce a unified approach to model gene clusters and define the problem of labeling the inner nodes of a given phylogenetic tree with sets of gene clusters. Our optimization criterion in this context combines two properties: parsimony, i.e., the number of gains and losses of gene clusters has to be minimal, and consistency, i.e., for each ancestral node, there must exist at least one potential gene order that contains all the reconstructed clusters. We present and evaluate an exact algorithm to solve this problem. Despite its exponential worst-case time complexity, our method is suitable even for large-scale data. We show the effectiveness and efficiency on both simulated and real data.
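
The parsimony criterion described in the abstract can be illustrated with a minimal sketch. It is not the authors' exact algorithm: it applies classical Fitch parsimony to the presence or absence of each cluster independently, and it ignores the consistency criterion entirely (it never checks that a gene order containing all clusters of a node exists). The tree encoding, the function names `fitch_presence` and `label_inner_nodes`, and the toy data are all assumptions made for illustration.

```python
from typing import Dict, FrozenSet, Optional, Set, Tuple

# Hypothetical encoding: inner node -> (left child, right child); leaves are absent as keys.
Tree = Dict[str, Tuple[str, str]]
Cluster = FrozenSet[int]


def fitch_presence(tree: Tree, root: str, present_at_leaf: Dict[str, bool]):
    """Fitch parsimony for one cluster; returns (state per node, number of gains/losses)."""
    candidates: Dict[str, Set[bool]] = {}

    def up(node: str) -> None:
        # Bottom-up pass: intersect child candidate sets, fall back to their union.
        if node not in tree:
            candidates[node] = {present_at_leaf[node]}
            return
        left, right = tree[node]
        up(left)
        up(right)
        common = candidates[left] & candidates[right]
        candidates[node] = common if common else candidates[left] | candidates[right]

    state: Dict[str, bool] = {}
    changes = 0

    def down(node: str, parent_state: Optional[bool]) -> None:
        # Top-down pass: keep the parent's state when possible.
        nonlocal changes
        if parent_state in candidates[node]:
            state[node] = parent_state
        else:
            # At the root a tie is broken towards absence (False < True).
            state[node] = min(candidates[node])
        if parent_state is not None and state[node] != parent_state:
            changes += 1
        if node in tree:
            for child in tree[node]:
                down(child, state[node])

    up(root)
    down(root, None)
    return state, changes


def label_inner_nodes(tree: Tree, root: str, leaf_clusters: Dict[str, Set[Cluster]]):
    """Label every node with a cluster set, treating clusters independently."""
    labels: Dict[str, Set[Cluster]] = {node: set() for node in list(tree) + list(leaf_clusters)}
    total_changes = 0
    for cluster in set().union(*leaf_clusters.values()):
        present = {leaf: cluster in cs for leaf, cs in leaf_clusters.items()}
        state, changes = fitch_presence(tree, root, present)
        total_changes += changes
        for node, is_present in state.items():
            if is_present:
                labels[node].add(cluster)
    return labels, total_changes


# Toy example (made-up data): three genomes A, B, C and two clusters.
c1, c2 = frozenset({1, 2}), frozenset({2, 3})
tree = {"root": ("x", "C"), "x": ("A", "B")}
leaves = {"A": {c1, c2}, "B": {c1}, "C": {c2}}
labels, total_changes = label_inner_nodes(tree, "root", leaves)
```

In this toy input, each cluster needs one gain or loss, so the sketch reports two changes in total. A full treatment along the lines of the abstract would additionally have to reject or repair labelings for which no single gene order contains all clusters assigned to a node.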