Using a VOM model for reconstructing potential coding regions in EST sequences

  • Authors:
  • Armin Shmilovici;Irad Ben-Gal

  • Affiliations:
  • Department of Information Systems Engineering, Ben-Gurion University, Beer-Sheva, Israel;Department of Industrial Engineering, Tel-Aviv University, Tel-Aviv, Israel 69978

  • Venue:
  • Computational Statistics
  • Year:
  • 2007

Abstract

This paper presents a method for annotating coding and noncoding DNA regions using variable-order Markov (VOM) models. A main advantage of VOM models is that their order may vary for different sequences, depending on the sequences' statistics. As a result, VOM models are more flexible with respect to model parameterization and can be trained on relatively short sequences and on low-quality datasets, such as expressed sequence tags (ESTs). The paper also presents a modified VOM model for detecting and correcting the insertion and deletion sequencing errors that are commonly found in ESTs. In a series of experiments, the proposed method is found to be robust to random errors in these sequences.
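To make the idea of a context-dependent order concrete, here is a minimal sketch of a generic variable-order Markov predictor. It is not the authors' model: the function names (`train_vom`, `next_symbol_probs`, `log_likelihood`), the `max_order` and `min_count` parameters, the back-off rule, and the add-one smoothing are all assumptions chosen for illustration, and the input is assumed to contain only the symbols A, C, G, and T. The sketch does not attempt the insertion/deletion correction step described in the abstract.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"  # assumption: sequences contain only these four symbols


def train_vom(sequences, max_order=3):
    """Count next-symbol occurrences for every context of length 0..max_order."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for i, symbol in enumerate(seq):
            for k in range(0, min(max_order, i) + 1):
                context = seq[i - k:i]  # the k symbols preceding position i
                counts[context][symbol] += 1
    return counts


def next_symbol_probs(counts, history, max_order=3, min_count=2):
    """Use the longest context seen at least min_count times; otherwise back off.

    The effective order therefore varies from position to position,
    depending on how much training data supports each context.
    """
    for k in range(min(max_order, len(history)), -1, -1):
        context = history[len(history) - k:]
        ctx_counts = counts.get(context, {})
        total = sum(ctx_counts.values())
        if total >= min_count or k == 0:
            # Add-one smoothing so unseen symbols keep a nonzero probability.
            return {
                s: (ctx_counts.get(s, 0) + 1) / (total + len(ALPHABET))
                for s in ALPHABET
            }


def log_likelihood(counts, seq, max_order=3, min_count=2):
    """Sum of log2 probabilities of each symbol given its preceding context."""
    total = 0.0
    for i, symbol in enumerate(seq):
        probs = next_symbol_probs(counts, seq[:i], max_order, min_count)
        total += math.log2(probs[symbol])
    return total


if __name__ == "__main__":
    # Toy training data, purely illustrative (not real EST sequences).
    model = train_vom(["ATGATGATGATG"])
    print(next_symbol_probs(model, "AT"))
    print(log_likelihood(model, "ATGATG"), log_likelihood(model, "CCCCCC"))
```

In a setting like the one the abstract describes, one such model could be trained on coding sequences and another on noncoding sequences, with a region assigned to whichever model gives it the higher log-likelihood. That two-model comparison is likewise an assumption here, not a description of the paper's procedure.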