Approximating the true evolutionary distance between two genomes

Authors:
Krister M. Swenson;Mark Marron;Joel V. Earnest-Deyoung;Bernard M. E. Moret
Affiliations:
EPFL, Lausanne, Switzerland;University of New Mexico, Albuquerque, New Mexico;University of New Mexico, Albuquerque, New Mexico;EPFL and the Swiss Institute of Bioinformatics, Lausanne, Switzerland
Venue:
Journal of Experimental Algorithmics (JEA)
Year:
2008

Citing 6

Transforming cabbage into turnip: polynomial algorithm for sorting signed permutations by reversals
STOC '95 Proceedings of the twenty-seventh annual ACM symposium on Theory of computing

Sorting by reversals is difficult
RECOMB '97 Proceedings of the first annual international conference on Computational molecular biology

Formulations and hardness of multiple sorting by reversals
RECOMB '99 Proceedings of the third annual international conference on Computational molecular biology

Steps toward accurate reconstructions of phylogenies from gene-order data
Journal of Computer and System Sciences - Computational biology 2002

Genomic distances under deletions and insertions
Theoretical Computer Science - Special papers from: COCOON 2003

Assignment of Orthologous Genes via Genome Rearrangement
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)

Cited 2

Finding All Sorting Tandem Duplication Random Loss Operations
CPM '09 Proceedings of the 20th Annual Symposium on Combinatorial Pattern Matching

Finding all sorting tandem duplication random loss operations
Journal of Discrete Algorithms

Abstract

As more and more genomes are sequenced, evolutionary biologists are becoming increasingly interested in evolution at the level of whole genomes, in scenarios in which the genome evolves through insertions, duplications, deletions, and movements of genes along its chromosomes. In the mathematical model pioneered by Sankoff and others, a unichromosomal genome is represented by a signed permutation of a multiset of genes; Hannenhalli and Pevzner showed that the edit distance between two signed permutations of the same set can be computed in polynomial time when all operations are inversions. El-Mabrouk extended that result to allow deletions and a limited form of insertions (which forbids duplications); in turn we extended it to compute a nearly optimal edit sequence between an arbitrary genome and the identity permutation. In this paper we generalize our approach to compute distances between two arbitrary genomes, but focus on approximating the true evolutionary distance rather than the edit distance. We present experimental results showing that our algorithm produces excellent estimates of the true evolutionary distance up to a (high) threshold of saturation; indeed, the distances thus produced are good enough to enable the simple neighbor-joining procedure to reconstruct our test trees with high accuracy.
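
To make the model in the abstract concrete, the following is a minimal sketch of a signed permutation and a single inversion, which reverses a segment and flips the sign of every gene in it. The function names `invert` and `breakpoints` are illustrative assumptions and do not come from the paper. The breakpoint count is only a simple, well-known indicator of how far a genome is from the identity; it is not the authors' distance estimator.

```python
from typing import List


def invert(genome: List[int], i: int, j: int) -> List[int]:
    """Return a copy of genome with positions i..j (inclusive) inverted.

    An inversion reverses the order of the segment and flips each sign.
    """
    if not 0 <= i <= j < len(genome):
        raise ValueError("need 0 <= i <= j < len(genome)")
    segment = [-g for g in reversed(genome[i:j + 1])]
    return genome[:i] + segment + genome[j + 1:]


def breakpoints(genome: List[int]) -> int:
    """Count adjacencies that differ from the identity 1, 2, ..., n.

    The genome is framed with 0 and n + 1; a pair (a, b) is conserved
    exactly when b - a == 1, which also covers pairs such as (-4, -3).
    """
    n = len(genome)
    framed = [0] + genome + [n + 1]
    return sum(1 for a, b in zip(framed, framed[1:]) if b - a != 1)


identity = [1, 2, 3, 4, 5]
evolved = invert(identity, 1, 3)    # [1, -4, -3, -2, 5]
restored = invert(evolved, 1, 3)    # applying the same inversion undoes it

print(evolved, breakpoints(evolved))
print(restored, breakpoints(restored))
```

Here one inversion creates two breakpoints, and applying the same inversion again restores the identity. Handling insertions, deletions, and duplications, as the paper does, would require a richer representation than a plain list of distinct signed integers.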