Computing the Summed Adjacency Disruption Number between Two Genomes with Duplicate Genes Using Pseudo-Boolean Optimization

  • Authors: João Delgado, Inês Lynce, Vasco Manquinho

  • Affiliation (all authors): IST/INESC-ID, Technical University of Lisbon, Portugal

  • Venue: RECOMB-CG '09 Proceedings of the International Workshop on Comparative Genomics
  • Year: 2009

Abstract

The increasing number of fully sequenced genomes has led to the study of genome rearrangements. Several approaches have been proposed to solve this problem, all of them being either too complex to be solved efficiently or too simple to be applied to genomes of complex organisms. The latest challenge has been to overcome the problem of having genomes with duplicate genes. This led to the definition of matching models and similarity measures. The idea is to find a matching between genes in two genomes, in order to disambiguate the data of duplicate genes and calculate a similarity measure. The problem becomes that of finding a matching that best preserves the order of genes in two genomes, where gene order is evaluated by a chosen similarity measure. This paper presents a new pseudo-Boolean encoding for computing the exact summed adjacency disruption number for two genomes with duplicate genes. Experimental results on a γ-Proteobacteria data set illustrate the approach.
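
To make the idea of "a matching that best preserves gene order" concrete, here is a minimal brute-force sketch in Python. It is not the pseudo-Boolean encoding of the paper. The abstract does not define the measure, so the sketch assumes the definition commonly used in the literature: for two duplicate-free genomes over the same genes, sum the positional distance in one genome of every pair of genes adjacent in the other, in both directions, ignoring gene signs. It also assumes an exemplar-style matching (one occurrence kept per gene family, genes not shared by both genomes dropped). The function names `sad` and `exemplar_min_sad` are hypothetical.

```python
from itertools import product


def sad(g1, g2):
    """Summed adjacency disruption of two duplicate-free genomes.

    Assumed definition: for each pair of genes adjacent in one genome,
    add the distance between their positions in the other genome, and
    do this in both directions. Gene signs are ignored.
    """
    pos1 = {gene: i for i, gene in enumerate(g1)}
    pos2 = {gene: i for i, gene in enumerate(g2)}
    # Adjacencies of g2 measured in g1, then adjacencies of g1 measured in g2.
    d12 = sum(abs(pos1[a] - pos1[b]) for a, b in zip(g2, g2[1:]))
    d21 = sum(abs(pos2[a] - pos2[b]) for a, b in zip(g1, g1[1:]))
    return d12 + d21


def exemplar_min_sad(g1, g2):
    """Minimum SAD over all exemplar choices (exponential; tiny inputs only)."""
    families = sorted(set(g1) & set(g2))  # assumption: unshared genes are dropped
    occ1 = [[i for i, g in enumerate(g1) if g == f] for f in families]
    occ2 = [[i for i, g in enumerate(g2) if g == f] for f in families]
    best = None
    for keep1 in product(*occ1):  # one kept position per family in g1
        e1 = [g1[i] for i in sorted(keep1)]
        for keep2 in product(*occ2):  # one kept position per family in g2
            e2 = [g2[i] for i in sorted(keep2)]
            score = sad(e1, e2)
            if best is None or score < best:
                best = score
    return best


if __name__ == "__main__":
    print(sad([1, 2, 3, 4], [1, 3, 2, 4]))
    print(exemplar_min_sad(["a", "b", "a", "c"], ["a", "b", "c"]))
```

The enumeration grows exponentially with the number of duplicated genes, which is the motivation for handing the choice of matching to an optimization solver instead.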