Weighted genomic distance can hardly impose a bound on the proportion of transpositions

  • Authors:
  • Shuai Jiang;Max A. Alekseyev

  • Affiliations:
  • Department of Computer Science and Engineering, University of South Carolina, Columbia, SC;Department of Computer Science and Engineering, University of South Carolina, Columbia, SC

  • Venue:
  • RECOMB'11 Proceedings of the 15th Annual International Conference on Research in Computational Molecular Biology
  • Year:
  • 2011

Abstract

Genomic distance between two genomes, i.e., the smallest number of genome rearrangements required to transform one genome into the other, is often used as a measure of evolutionary closeness of the genomes in comparative genomics studies. However, in models that include rearrangements of significantly different "power", such as reversals (which are "weak" and the most frequent rearrangements) and transpositions (which are more "powerful" but rare), the genomic distance typically corresponds to a transformation with a large proportion of transpositions, which is not biologically adequate. Weighted genomic distance is a traditional approach to bounding the proportion of transpositions by assigning them a relative weight α > 1. A number of previous studies addressed the problem of computing weighted genomic distance with α ≤ 2. Employing the model of multi-break rearrangements on circular genomes, which captures both reversals (modelled as 2-breaks) and transpositions (modelled as 3-breaks), we prove that for α ∈ (1, 2), a minimum-weight transformation may consist entirely of transpositions, implying that the corresponding weighted genomic distance does not actually achieve its purpose of bounding the proportion of transpositions. We further prove that for α ∈ (1, 2), the minimum-weight transformations do not depend on a particular choice of α from this interval. We give a complete characterization of such transformations and show that they coincide with the transformations that at the same time have the shortest length and make the smallest number of breakages in the genomes. Our results also provide a theoretical foundation for the empirical observation that for α
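
The weighting scheme described in the abstract can be illustrated with a small sketch. It assumes, as the abstract states, that each 2-break (reversal) has weight 1 and each 3-break (transposition) has weight α. The function names and the comparison of one 3-break against two 2-breaks are illustrative assumptions, not taken from the paper; the comparison is only meant to show why a weight α < 2 may fail to discourage transpositions.

```python
def transformation_weight(num_2breaks, num_3breaks, alpha):
    """Weight of a transformation: 2-breaks cost 1 each, 3-breaks cost alpha each.

    Hypothetical helper for illustration; alpha > 1 follows the abstract.
    """
    if alpha <= 1:
        raise ValueError("the abstract assumes a relative weight alpha > 1")
    return num_2breaks + alpha * num_3breaks


def cheaper_option(alpha):
    """Compare one 3-break against two 2-breaks under the given weight.

    Assumption (not stated in the abstract): the effect of a single
    3-break can also be achieved by two consecutive 2-breaks.
    """
    one_3break = transformation_weight(0, 1, alpha)
    two_2breaks = transformation_weight(2, 0, alpha)
    if one_3break < two_2breaks:
        return "3-break"
    if one_3break > two_2breaks:
        return "2-breaks"
    return "tie"


if __name__ == "__main__":
    for alpha in (1.5, 2, 3):
        print(alpha, cheaper_option(alpha))
```

Under these assumptions, any α in (1, 2) makes the single 3-break the cheaper choice, which is consistent with the abstract's claim that a minimum-weight transformation may consist entirely of transpositions in that range.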