We present a comparative genomics approach for inferring ancestral genome organization and evolutionary scenarios, based on a model that accounts for content-modifying operations. More precisely, we focus on comparing two ordered gene sequences with duplicated genes that have evolved from a common ancestor through duplications and losses; our model belongs to the class of "Block Edit" models. From a combinatorial point of view, the main consequence is that the problem can be formulated as an alignment problem. In contrast to symmetrical metrics such as the inversion distance, however, duplications and losses are asymmetrical operations, each applying to only one of the two aligned sequences. Consequently, an ancestral genome can be inferred directly from a duplication-loss scenario attached to a given alignment. Although alignments are a priori simpler to handle than rearrangements, we show that a direct approach based on dynamic programming leads, at best, to an efficient heuristic. We present an exact pseudo-Boolean linear programming algorithm to search for the optimal alignment along with an optimal scenario of duplications and losses. Although the algorithm is exponential in the worst case, we show low running times on real datasets as well as on synthetic data. We apply our algorithm in a phylogenetic context to the evolution of stable RNA (tRNA and rRNA) gene content and organization in Bacillus genomes. Our results lead to various biological insights, such as rates of ribosomal RNA proliferation among lineages, their role in altering tRNA gene content, and evidence of tRNA class conversion.
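To make concrete the claim that an ancestor can be read off a labeled alignment, here is a minimal Python sketch. It is an illustration under stated assumptions, not the algorithm of the paper: the column encoding, the label names ("match", "loss", "dup"), and the function `infer_ancestor` are all hypothetical. The assumed reading is that a matched gene was present in the ancestor, a gene labeled as a loss was present in the ancestor and lost in the other genome, and a gene labeled as a duplication arose after divergence and is therefore absent from the ancestor.

```python
from typing import List, Optional, Tuple

# One alignment column: (gene in X or None, gene in Y or None, label).
# The encoding and the labels "match", "loss", "dup" are hypothetical,
# chosen only for this illustration.
Column = Tuple[Optional[str], Optional[str], str]


def infer_ancestor(columns: List[Column]) -> List[str]:
    """Read an ancestral gene order off a labeled alignment."""
    ancestor: List[str] = []
    for x, y, label in columns:
        if label == "match":
            # Both genomes carry the same gene, so the ancestor did too.
            if x is None or x != y:
                raise ValueError(f"invalid match column: {(x, y)}")
            ancestor.append(x)
        elif label in ("loss", "dup"):
            # A gapped column: exactly one of the two genomes has a gene.
            if (x is None) == (y is None):
                raise ValueError(f"invalid gapped column: {(x, y)}")
            if label == "loss":
                # The gene was in the ancestor and lost in the other genome.
                ancestor.append(x if x is not None else y)
            # label == "dup": the copy arose after divergence; skip it.
        else:
            raise ValueError(f"unknown label: {label}")
    return ancestor


# Toy alignment of X = a b c a b against Y = a c b.
example: List[Column] = [
    ("a", "a", "match"),
    ("b", None, "loss"),   # b was lost in Y
    ("c", "c", "match"),
    ("a", None, "dup"),    # second a in X is a duplicate
    ("b", "b", "match"),
]

print(infer_ancestor(example))
```

In this toy case the inferred ancestor is a b c b: X is recovered from it by duplicating a, and Y by losing the first b. The sketch ignores scenario cost and the optimization over alignments, which are the hard parts addressed by the pseudo-Boolean linear programming formulation.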