Most comparative genomics studies begin with the annotation of chromosomes as ordered sequences of genes. Unfortunately, different genetic mapping techniques usually yield different maps with unequal gene content, often containing sets of unordered neighboring genes. Combining such maps therefore yields only a partial order. However, once a total order O is known for a given genome, it can serve as a reference for ordering the genes of a closely related species characterized by a partial order P. In this paper, the problem is to find a linearization of P that is as close as possible to O in terms of the breakpoint distance. We first prove that this problem is NP-complete. We then give a dynamic programming algorithm whose running time is exponential for general partial orders but polynomial when the partial order is derived from a bounded number of genetic maps. A time-efficient greedy heuristic is then given for the general case, with a performance higher than 90% on simulated data. Applications to the analysis of grass genomes are presented.
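To make the problem statement concrete, the following is a minimal brute-force sketch in Python, not the paper's dynamic programming algorithm or its greedy heuristic. It rests on simplifying assumptions that are not taken from the abstract: genes are unsigned, P is given as a list of precedence pairs, P and O share the same gene set, and a breakpoint is counted whenever two genes adjacent in the linearization are not consecutive (in that direction) in O. The function names `breakpoints`, `is_linear_extension`, and `best_linearization` are hypothetical.

```python
from itertools import permutations


def breakpoints(order, reference):
    """Count adjacent pairs of `order` that are not consecutive in `reference`.

    Assumption: unsigned genes and directed adjacencies, with no end caps.
    """
    successor = {a: b for a, b in zip(reference, reference[1:])}
    return sum(1 for a, b in zip(order, order[1:]) if successor.get(a) != b)


def is_linear_extension(order, precedences):
    """Check that `order` respects every pair (a, b), meaning a comes before b."""
    position = {gene: i for i, gene in enumerate(order)}
    return all(position[a] < position[b] for a, b in precedences)


def best_linearization(genes, precedences, reference):
    """Enumerate all linear extensions of P and keep one with fewest breakpoints.

    Exhaustive search: factorial time, usable only on tiny examples.
    Returns (breakpoint_count, linearization), or None if P has no linear extension.
    """
    best = None
    for candidate in permutations(genes):
        if not is_linear_extension(candidate, precedences):
            continue
        score = breakpoints(candidate, reference)
        if best is None or score < best[0]:
            best = (score, list(candidate))
    return best


if __name__ == "__main__":
    O = [1, 2, 3, 4, 5]            # reference total order
    P = [(1, 3), (3, 5), (2, 4)]   # partial order as precedence pairs
    print(best_linearization(O, P, O))
```

In this toy instance the reference order is itself a linear extension of P, so the search finds a linearization with no breakpoints. The exhaustive enumeration only illustrates the objective being minimized; it grows factorially with the number of genes, which is why the abstract turns to dynamic programming and a greedy heuristic.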