Variable Order Finite-Context Models in DNA Sequence Coding

  • Authors:
  • Daniel A. Martins;António J. Neves;Armando J. Pinho

  • Affiliations:
  • Signal Processing Lab, DETI / IEETA, University of Aveiro, Aveiro, Portugal 3810---193;Signal Processing Lab, DETI / IEETA, University of Aveiro, Aveiro, Portugal 3810---193;Signal Processing Lab, DETI / IEETA, University of Aveiro, Aveiro, Portugal 3810---193

  • Venue:
  • IbPRIA '09 Proceedings of the 4th Iberian Conference on Pattern Recognition and Image Analysis
  • Year:
  • 2009

Abstract

DNA sequences are essential to biological research and are often shared among researchers and stored digitally for future use. As these sequences grow in volume, so does the need to encode them compactly, saving space for more sequences. Moreover, a better coding method corresponds to a better model of the sequence, which can yield new insights into DNA structure. In this paper, we present an algorithm capable of improving the encoding results of algorithms that depend on low-order finite-context models to encode DNA sequences. To do so, we implemented a variable order finite-context model, supported by a predictive function. The proposed algorithm allows three finite-context models to be used at once without requiring side information in the encoded sequence. Currently, the proposed method shows small improvements in the encoding results when compared with finite-context models of the same order. However, we also present results showing that there is room for further improvement in the use of variable order finite-context models for DNA sequence coding.
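To illustrate the general idea of choosing among several finite-context models without side information, the following is a minimal sketch, not the authors' algorithm. The abstract does not describe the predictive function, so this sketch assumes a simple one: for each symbol, use the model whose decayed code length over the already-coded symbols is smallest. A decoder can repeat the same choice because it depends only on past data. The names `FiniteContextModel` and `estimate_bits`, the orders (2, 4, 8), the decay factor, and the smoothing parameter `alpha` are all hypothetical. The function returns the ideal code length in bits rather than driving an actual arithmetic coder.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"


class FiniteContextModel:
    """Order-k model: counts of each symbol seen after each k-symbol context."""

    def __init__(self, order, alpha=1.0):
        self.order = order
        self.alpha = alpha  # additive smoothing constant (assumed value)
        self.counts = defaultdict(lambda: [0, 0, 0, 0])

    def _context(self, history):
        # The last `order` symbols; shorter at the very start of the sequence.
        return history[-self.order:] if self.order > 0 else ""

    def prob(self, history, symbol):
        c = self.counts[self._context(history)]
        return (c[ALPHABET.index(symbol)] + self.alpha) / (sum(c) + 4 * self.alpha)

    def update(self, history, symbol):
        self.counts[self._context(history)][ALPHABET.index(symbol)] += 1


def estimate_bits(sequence, orders=(2, 4, 8), decay=0.9):
    """Ideal code length (bits) when the model is picked from past performance."""
    models = [FiniteContextModel(k) for k in orders]
    recent_cost = [0.0] * len(models)  # decayed code length of each model so far
    total_bits = 0.0
    for i, symbol in enumerate(sequence):
        history = sequence[max(0, i - max(orders)):i]
        # Assumed predictive function: pick the model that was cheapest recently.
        # It uses only past symbols, so no side information is needed.
        best = min(range(len(models)), key=lambda j: recent_cost[j])
        total_bits -= math.log2(models[best].prob(history, symbol))
        # Every model is scored and updated, whether or not it was selected.
        for j, model in enumerate(models):
            cost = -math.log2(model.prob(history, symbol))
            recent_cost[j] = decay * recent_cost[j] + cost
            model.update(history, symbol)
    return total_bits
```

Calling `estimate_bits(seq) / len(seq)` gives an estimate in bits per base, which can be compared with the 2 bits per base of a fixed uniform code.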