In this paper we introduce a new alignment-free method for comparing sequences that is combinatorial in nature and uses neither a compressor nor any information-theoretic notion. The method is based on an extension of the Burrows-Wheeler Transform, a transformation widely used in data compression. The extended transformation takes as input a multiset of sequences and produces as output a string obtained by a suitable rearrangement of the characters of all the input sequences. Using this transformation, we define a measure for comparing sequences that takes into account how the characters coming from different input sequences are mixed in the output string. The method is tested on a real data set for the whole mitochondrial genome phylogeny problem. Beyond this application, the goal of the paper is to introduce a new and general methodology for the automatic categorization of sequences.
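The abstract does not spell out the construction, so the following is only a minimal sketch of the idea under stated assumptions. It assumes the extended transformation sorts all cyclic rotations of all input sequences by comparing their infinite repetitions (approximated here by a prefix of length twice the longest input) and outputs the last character of each rotation. The names `ebwt` and `mixing_score`, the tie-breaking rule, and the particular score (the number of adjacent output positions that come from the same input sequence) are illustrative choices and not necessarily the measure defined in the paper.

```python
from math import ceil


def ebwt(seqs):
    """Sketch of an extended Burrows-Wheeler transform on non-empty strings.

    Returns the output string and, for each output character, the index
    of the input sequence it came from.
    """
    # Assumption: comparing prefixes of length 2 * (longest input) of the
    # infinitely repeated rotations is enough to order them.
    limit = 2 * max(len(s) for s in seqs)
    rows = []
    for src, s in enumerate(seqs):
        for i in range(len(s)):
            rot = s[i:] + s[:i]  # cyclic rotation starting at position i
            key = (rot * ceil(limit / len(rot)))[:limit]
            rows.append((key, src, rot[-1]))
    # Ties (rotations with equal keys) are broken by source index;
    # this tie-break is an illustrative choice.
    rows.sort(key=lambda r: (r[0], r[1]))
    output = "".join(r[2] for r in rows)
    labels = [r[1] for r in rows]
    return output, labels


def mixing_score(labels):
    """Count adjacent positions that share a source sequence.

    A low score means the sequences are well interleaved in the output;
    a high score means they stay in separate blocks. Hypothetical measure.
    """
    return sum(1 for a, b in zip(labels, labels[1:]) if a == b)


out, labels = ebwt(["aab", "abb"])
print(out, labels, mixing_score(labels))
```

In this sketch two identical sequences interleave perfectly and score 0, while sequences that share little context tend to form long single-source runs and score higher.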