Interest in DNA coding has grown with the availability of extensive genomic databases. Although two bits are sufficient to encode each of the four DNA bases, efficient lossless compression methods are still needed, both because DNA sequences are large and because standard compression algorithms do not perform well on them. As a result, several DNA-specific coding methods have been proposed. Most of these methods rely on search procedures for finding exact or approximate repeats. Low-order finite-context models have been used only as secondary, fallback mechanisms. In this paper, we show that finite-context models can also serve as the main DNA encoding method. We propose a coding method based on two finite-context models that compete to encode the data on a block-by-block basis. The experimental results confirm the effectiveness of the proposed method.
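To make the idea of two competing finite-context models concrete, the following is a minimal sketch and not the authors' implementation. It only estimates the ideal code length in bits (the sum of -log2 of the predicted probabilities) instead of running a real entropy coder. The model orders (2 and 8), the block size of 16, the additive estimator with alpha = 1, and the single side-information bit per block are all illustrative assumptions, and the names `FiniteContextModel` and `estimate_bits` are hypothetical.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"


class FiniteContextModel:
    """Order-k model: counts of each base following each length-k context."""

    def __init__(self, order, alpha=1.0):
        self.order = order    # context length k (assumed value, not from the paper)
        self.alpha = alpha    # additive smoothing constant (assumption)
        self.counts = defaultdict(lambda: [0, 0, 0, 0])

    def prob(self, context, symbol):
        # Smoothed estimate of P(symbol | context) from the counts so far.
        c = self.counts.get(context, (0, 0, 0, 0))
        return (c[ALPHABET.index(symbol)] + self.alpha) / (sum(c) + 4 * self.alpha)

    def update(self, context, symbol):
        self.counts[context][ALPHABET.index(symbol)] += 1


def estimate_bits(seq, orders=(2, 8), block_size=16):
    """Return (estimated total bits, index of the model chosen for each block)."""
    models = [FiniteContextModel(k) for k in orders]
    total_bits = 0.0
    choices = []
    for start in range(0, len(seq), block_size):
        costs = [0.0] * len(models)
        for i in range(start, min(start + block_size, len(seq))):
            for m, model in enumerate(models):
                # Context is the (up to) `order` symbols preceding position i.
                ctx = seq[max(0, i - model.order):i]
                costs[m] -= math.log2(model.prob(ctx, seq[i]))
                # Every model is updated with every symbol, whichever one wins,
                # so a decoder could keep the same statistics.
                model.update(ctx, seq[i])
        best = min(range(len(models)), key=costs.__getitem__)
        choices.append(best)
        # Assumed side information: one bit per block to signal the winner.
        total_bits += costs[best] + 1.0
    return total_bits, choices


if __name__ == "__main__":
    sequence = "ACGT" * 64
    bits, chosen = estimate_bits(sequence)
    print(f"{bits / len(sequence):.3f} bits per base over {len(chosen)} blocks")
```

Because both models see every symbol, the choice of winner affects only which probabilities are charged for a block, not what the models learn. A practical coder would drive an arithmetic coder with the winning model's probabilities instead of summing logarithms.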