Theoretical Computer Science
We present variable length local decoding, a method that augments the alphabet of a sequence or a set of sequences. Roughly speaking, the approach distinguishes several types of symbols (nucleotides) according to their contexts in the sequences. These contexts have variable lengths and are defined from a prefix code. We first give an original algorithm that computes the decoding in linear time and linear memory. Next, we apply the approach to alignment-free sequence comparison and give a heuristic for selecting context lengths relevant to this task. The comparison itself is based on the composition of the sequences' variable length local decodings in "augmented" symbols. The results of this comparison are illustrated on a biological alignment.
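
The general idea of comparing sequences by their composition in context-tagged symbols can be illustrated with a small Python sketch. This is a deliberately simplified stand-in, not the linear-time algorithm of the paper: each position is tagged with the codeword of a prefix code that starts at that position, and two sequences are compared by the Euclidean distance between their tag frequencies. The code `PREFIX_CODE`, the function names, and the choice of distance are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt

# Hypothetical prefix code over {A, C, G, T}: no word is a prefix of another.
# Contexts starting with T have length 2, the others length 1.
PREFIX_CODE = {"A", "C", "G", "TA", "TC", "TG", "TT"}


def augment(seq, code=PREFIX_CODE):
    """Tag each position with the unique codeword that starts there.

    Positions near the end where no complete codeword fits are skipped.
    """
    max_len = max(len(word) for word in code)
    tags = []
    for i in range(len(seq)):
        for k in range(1, max_len + 1):
            word = seq[i:i + k]
            if word in code:
                tags.append(word)
                break  # a prefix code allows at most one match per position
    return tags


def composition(seq, code=PREFIX_CODE):
    """Relative frequency of each augmented symbol in the sequence."""
    tags = augment(seq, code)
    total = len(tags)
    if total == 0:
        return {}
    return {tag: count / total for tag, count in Counter(tags).items()}


def distance(seq_a, seq_b, code=PREFIX_CODE):
    """Euclidean distance between two composition vectors (an assumed metric)."""
    comp_a, comp_b = composition(seq_a, code), composition(seq_b, code)
    keys = set(comp_a) | set(comp_b)
    return sqrt(sum((comp_a.get(k, 0.0) - comp_b.get(k, 0.0)) ** 2 for k in keys))
```

For example, `augment("ATTA")` yields `["A", "TT", "TA", "A"]`: the two occurrences of `T` receive different tags because their contexts differ, which is the sense in which the alphabet is augmented here. No alignment is needed, since `distance` only looks at tag frequencies.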