Fast and Adaptive Variable Order Markov Chain Construction

  • Authors:
  • Marcel H. Schulz; David Weese; Tobias Rausch; Andreas Döring; Knut Reinert; Martin Vingron

  • Affiliations:
  • Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany 14195 and International Max Planck Research School for Computational Biology and Scienti ...; Department of Computer Science, Free University of Berlin, Berlin, Germany 14195; Department of Computer Science, Free University of Berlin, Berlin, Germany 14195 and International Max Planck Research School for Computational Biology and Scientific Computing,; Department of Computer Science, Free University of Berlin, Berlin, Germany 14195; Department of Computer Science, Free University of Berlin, Berlin, Germany 14195; Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany 14195

  • Venue:
  • WABI '08 Proceedings of the 8th international workshop on Algorithms in Bioinformatics
  • Year:
  • 2008

Abstract

Variable order Markov chains (VOMCs) are a flexible class of models that extend the well-known Markov chains. They have been applied to a variety of problems in computational biology, such as protein family classification. A linear time and space construction algorithm was published in 2000 by Apostolico and Bejerano. However, neither a report of its actual running time nor an implementation of it has been published since. In this paper we use the lazy suffix tree and the enhanced suffix array to improve upon the algorithm of Apostolico and Bejerano. We introduce a new software tool that is orders of magnitude faster than current tools for building VOMCs and is suitable for large-scale sequence analysis.
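
To make the idea of a variable order Markov chain concrete, the following is a minimal Python sketch. It is not the construction of Apostolico and Bejerano, and it is not the suffix-tree or suffix-array approach described in the paper: it uses plain dictionaries and is not linear in time or space. The function names `train_vomc` and `next_symbol_probability`, the parameters `max_order` and `min_count`, and the simple count threshold used for pruning are all assumptions made for illustration; real VOMC learners typically use statistical pruning criteria.

```python
from collections import defaultdict


def train_vomc(text, max_order=3, min_count=2):
    """Count next-symbol frequencies for every context of length 0..max_order.

    Illustrative assumption: a context is kept only if it was observed at
    least `min_count` times. The empty context is always kept as a fallback.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for i, symbol in enumerate(text):
        for k in range(max_order + 1):
            if i - k < 0:
                break  # not enough preceding symbols for a context of length k
            context = text[i - k:i]
            counts[context][symbol] += 1
    return {
        ctx: dict(nxt)
        for ctx, nxt in counts.items()
        if ctx == "" or sum(nxt.values()) >= min_count
    }


def next_symbol_probability(model, history, symbol, max_order=3):
    """Estimate P(symbol | history) from the longest retained context.

    The context length varies per position: the longest suffix of `history`
    that survived pruning is used, which is what makes the order "variable".
    """
    for k in range(min(max_order, len(history)), -1, -1):
        context = history[len(history) - k:]
        if context in model:
            nxt = model[context]
            return nxt.get(symbol, 0) / sum(nxt.values())
    return 0.0


if __name__ == "__main__":
    model = train_vomc("abracadabra", max_order=2, min_count=2)
    # "ab" was seen twice, so the order-2 context is used.
    print(next_symbol_probability(model, "ab", "r", max_order=2))
    # "ca" was seen only once, so the model falls back to the context "a".
    print(next_symbol_probability(model, "ca", "b", max_order=2))
```

In this sketch the fallback from a rare long context to a shorter, better-supported one is what distinguishes the model from a fixed-order Markov chain, which would always condition on exactly `max_order` preceding symbols.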