A new combinatorial approach to sequence comparison

  • Authors:
  • S. Mantaci;A. Restivo;G. Rosone;M. Sciortino

  • Affiliations:
  • Dipartimento di Matematica ed Applicazioni, University of Palermo, Palermo, Italy;Dipartimento di Matematica ed Applicazioni, University of Palermo, Palermo, Italy;Dipartimento di Matematica ed Applicazioni, University of Palermo, Palermo, Italy;Dipartimento di Matematica ed Applicazioni, University of Palermo, Palermo, Italy

  • Venue:
  • ICTCS'05 Proceedings of the 9th Italian conference on Theoretical Computer Science
  • Year:
  • 2005

Abstract

In this paper we introduce a new alignment-free method for comparing sequences that is combinatorial by nature and uses neither a compressor nor any information-theoretic notion. The method is based on an extension of the Burrows-Wheeler Transform, a transformation widely used in data compression. The extended transformation takes as input a multiset of sequences and produces as output a single string obtained by a suitable rearrangement of the characters of all the input sequences. Using this transformation, we define a measure for comparing sequences that takes into account how the characters coming from different input sequences are mixed in the output string. The method is tested on a real data set for the whole mitochondrial genome phylogeny problem. The goal of this paper, however, is to introduce a new and general methodology for the automatic categorization of sequences.
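
The abstract does not spell out the transformation or the measure, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes the extended transformation sorts every rotation of every input sequence by its infinite periodic repetition and outputs the last character of each sorted rotation. The names `extended_bwt` and `mixing_score` are hypothetical, and the score (a count of adjacent output characters that come from the same input sequence) is an illustrative stand-in for the paper's measure, not a reproduction of it.

```python
def extended_bwt(seqs):
    """Sketch of a BWT extended to a multiset of sequences.

    Assumption: rotations are ordered by their infinite periodic
    repetition. Comparing prefixes of length 2 * (longest input)
    is enough to decide that order whenever the repetitions differ.
    Returns the output string and, per output character, the index
    of the input sequence it came from.
    """
    if not seqs or any(len(s) == 0 for s in seqs):
        raise ValueError("expected a non-empty list of non-empty strings")
    horizon = 2 * max(len(s) for s in seqs)
    rows = []
    for idx, s in enumerate(seqs):
        for i in range(len(s)):
            rot = s[i:] + s[:i]                      # i-th rotation of s
            reps = horizon // len(rot) + 1
            key = (rot * reps)[:horizon]             # prefix of rot repeated forever
            rows.append((key, idx, rot[-1]))
    # Ties (identical repetitions) are broken by input index, an arbitrary choice.
    rows.sort(key=lambda r: (r[0], r[1]))
    output = "".join(last for _, _, last in rows)
    origins = [idx for _, idx, _ in rows]
    return output, origins


def mixing_score(origins):
    """Hypothetical mixing score: number of adjacent positions whose
    characters come from the same input sequence. Lower values mean
    the inputs are more thoroughly interleaved in the output."""
    return sum(1 for a, b in zip(origins, origins[1:]) if a == b)


if __name__ == "__main__":
    out, origins = extended_bwt(["ab", "ab"])
    print(out, origins, mixing_score(origins))
    out, origins = extended_bwt(["aa", "bb"])
    print(out, origins, mixing_score(origins))
```

Under these assumptions, two identical inputs interleave perfectly in the output and score 0, while inputs that share no characters stay in separate blocks and score higher. For a single input sequence, the sketch reduces to the usual rotation-based Burrows-Wheeler Transform.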