Speeding up Bayesian HMM by the four Russians method

  • Authors:
  • Md Pavel Mahmud; Alexander Schliep

  • Affiliations:
  • Department of Computer Science, Rutgers University, New Jersey; Department of Computer Science, Rutgers University, New Jersey and BioMaPS Institute for Quantitative Biology

  • Venue:
  • WABI'11 Proceedings of the 11th international conference on Algorithms in bioinformatics
  • Year:
  • 2011

Abstract

Bayesian computations with Hidden Markov Models (HMMs) are often avoided in practice. Instead, because they take less time to compute, point estimates, either maximum likelihood (ML) or maximum a posteriori (MAP), are obtained, and observation sequences are segmented based on the Viterbi path, even though the resulting lack of accuracy and the dependence on the starting points of the local optimization are well known. We propose a method to speed up Bayesian computations that addresses this problem for regular and time-dependent HMMs with discrete observations. In particular, we show that by exploiting sequence repetitions, using the four Russians method, and the conditional dependency structure, it is possible to achieve a Θ(log T) speed-up, where T is the length of the observation sequence. Our experimental results on the identification of segments of homogeneous nucleic acid composition, known as the DNA segmentation problem, show that the speed-up is also observed in practice.
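
To make the idea of exploiting sequence repetitions concrete, the following is a minimal sketch and not the authors' implementation. It applies a four-Russians-style block table to the ordinary forward likelihood of a discrete HMM, whereas the paper targets Bayesian computations. The names `step_matrix`, `forward_naive`, `forward_blocked` and the block length `k` are hypothetical, the table is filled lazily rather than precomputed for every possible block, and no rescaling is done, so the sketch is only suitable for short sequences.

```python
def step_matrix(A, B, symbol):
    """One forward step that emits `symbol`: M[i][j] = A[i][j] * B[j][symbol]."""
    n = len(A)
    return [[A[i][j] * B[j][symbol] for j in range(n)] for i in range(n)]


def matmul(X, Y):
    """Plain matrix product of two lists of lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def vecmat(v, M):
    """Row vector times matrix."""
    return [sum(v[i] * M[i][j] for i in range(len(v)))
            for j in range(len(M[0]))]


def forward_naive(pi, A, B, obs):
    """Standard forward algorithm: one vector-matrix product per observation."""
    alpha = [pi[i] * B[i][obs[0]] for i in range(len(pi))]
    for o in obs[1:]:
        alpha = vecmat(alpha, step_matrix(A, B, o))
    return sum(alpha)


def forward_blocked(pi, A, B, obs, k):
    """Forward algorithm over blocks of k symbols.

    Each distinct block gets its k step matrices multiplied together once;
    every later occurrence of that block reuses the cached product.
    """
    table = {}  # block (tuple of k symbols) -> product of its step matrices
    alpha = [pi[i] * B[i][obs[0]] for i in range(len(pi))]
    t = 1
    while t + k <= len(obs):
        block = tuple(obs[t:t + k])
        if block not in table:
            M = step_matrix(A, B, block[0])
            for s in block[1:]:
                M = matmul(M, step_matrix(A, B, s))
            table[block] = M
        alpha = vecmat(alpha, table[block])  # one product instead of k
        t += k
    for o in obs[t:]:  # leftover symbols that do not fill a whole block
        alpha = vecmat(alpha, step_matrix(A, B, o))
    return sum(alpha)
```

With a two-state model and a repetitive sequence such as `[0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1]`, calling `forward_blocked(pi, A, B, obs, 3)` should agree with `forward_naive(pi, A, B, obs)` up to floating-point error, while computing the product for the repeated block only once.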