Whole genome duplications, multi-break rearrangements, and genome halving problem

  • Authors:
  • Max A. Alekseyev;Pavel A. Pevzner

  • Affiliations:
  • -;University of California at San Diego

  • Venue:
  • SODA '07 Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms
  • Year:
  • 2007

Abstract

The Genome Halving Problem, motivated by whole genome duplication events in molecular evolution, was solved by El-Mabrouk and Sankoff. The El-Mabrouk-Sankoff algorithm is rather complex, inspiring a quest for a simpler solution. An alternative approach to the Genome Halving Problem, based on the notion of the contracted breakpoint graph, was recently proposed in [2]. This new technique reveals that while the El-Mabrouk-Sankoff result is correct in most cases, it does not hold for unichromosomal genomes. This raises the problem of correcting the El-Mabrouk-Sankoff analysis and devising an algorithm that deals adequately with all genomes. In this paper we efficiently classify all genomes into two classes and show that while the El-Mabrouk-Sankoff theorem holds for the first class, it is incorrect for the second class. The crux of our analysis is a new combinatorial invariant defined on duplicated permutations. Using this invariant, we obtain a full proof of the Genome Halving theorem and a polynomial-time algorithm for the Genome Halving Problem (for unichromosomal genomes). We also give the first short proof of the original El-Mabrouk-Sankoff result for multichromosomal genomes. Finally, we discuss a generalization of the Genome Halving Problem to a broader set of rearrangement operations (including transpositions) and propose an efficient algorithm for solving this problem.
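
To make the objects in the abstract concrete, the following is a minimal sketch of the input side of the problem only. It is not the authors' algorithm and does not use the contracted breakpoint graph or the invariant. It assumes a hypothetical encoding in which a gene is a nonzero signed integer (the sign giving orientation), a chromosome is a list of genes, and a genome is a list of chromosomes. The names `double_circular` and `is_duplicated` are illustrative and do not come from the paper.

```python
from collections import Counter


def double_circular(chromosome):
    """Model a whole genome duplication of one circular chromosome:
    the chromosome followed by a second copy of itself (assumed encoding)."""
    return list(chromosome) + list(chromosome)


def is_duplicated(genome):
    """Return True if every gene, ignoring its orientation sign,
    occurs exactly twice across all chromosomes of the genome."""
    counts = Counter(abs(gene) for chrom in genome for gene in chrom)
    return bool(counts) and all(c == 2 for c in counts.values())


# A perfect duplicated unichromosomal genome built from 1 -2 3.
perfect = [double_circular([1, -2, 3])]

# A rearranged duplicated genome: still two copies of each gene,
# but no longer two identical halves.
rearranged = [[1, -2, 3, -1], [2, -3]]
```

Under these assumptions, both `perfect` and `rearranged` pass `is_duplicated`, while only `perfect` has the doubled form. Genome halving asks for a doubled genome of that kind that is closest, in rearrangement distance, to a given rearranged duplicated genome.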