Speeding up Bayesian HMM by the four Russians method

Authors:
Md Pavel Mahmud;Alexander Schliep
Affiliations:
Department of Computer Science, Rutgers University, New Jersey;Department of Computer Science, Rutgers University, New Jersey and BioMaPS Institute for Quantitative Biology
Venue:
WABI'11 Proceedings of the 11th international conference on Algorithms in bioinformatics
Year:
2011

Abstract

Bayesian computations with Hidden Markov Models (HMMs) are often avoided in practice. Instead, because of their shorter running time, point estimates, either maximum likelihood (ML) or maximum a posteriori (MAP), are obtained and observation sequences are segmented based on the Viterbi path, even though the lack of accuracy and the dependency on the starting points of the local optimization are well known. We propose a method to speed up Bayesian computations that addresses this problem for regular and time-dependent HMMs with discrete observations. In particular, we show that by exploiting sequence repetitions, the four Russians method, and the conditional dependency structure, it is possible to achieve a Θ(log T) speed-up, where T is the length of the observation sequence. Our experimental results on the identification of segments of homogeneous nucleic acid composition, known as the DNA segmentation problem, show that the speed-up is also observed in practice.
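
The block-precomputation idea behind a four-Russians-style speed-up can be illustrated on the plain forward recursion. The following is a minimal sketch under stated assumptions, not the Bayesian procedure of the paper: it tabulates the product of per-symbol step matrices for every possible block of k observations, then consumes the sequence one block at a time, so the number of vector-matrix steps drops from about T to about T/k. With k chosen proportional to log T the table stays small relative to T. All function names, the block size k, and the toy two-state model in the usage note are hypothetical, and numerical scaling against underflow is omitted.

```python
from itertools import product


def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def vec_mat(v, M):
    """Row vector times square matrix."""
    n = len(v)
    return [sum(v[i] * M[i][j] for i in range(n)) for j in range(n)]


def step_matrix(trans, emit, symbol):
    """M[i][j] = P(state j | state i) * P(symbol | state j)."""
    n = len(trans)
    return [[trans[i][j] * emit[j][symbol] for j in range(n)] for i in range(n)]


def precompute_blocks(trans, emit, alphabet_size, k):
    """Tabulate the step-matrix product for each of the alphabet_size**k blocks."""
    steps = [step_matrix(trans, emit, s) for s in range(alphabet_size)]
    table = {}
    for block in product(range(alphabet_size), repeat=k):
        M = steps[block[0]]
        for s in block[1:]:
            M = mat_mul(M, steps[s])
        table[block] = M
    return steps, table


def likelihood_naive(init, trans, emit, obs):
    """Standard forward algorithm: one vector-matrix step per observation."""
    alpha = [init[i] * emit[i][obs[0]] for i in range(len(init))]
    for s in obs[1:]:
        alpha = vec_mat(alpha, step_matrix(trans, emit, s))
    return sum(alpha)


def likelihood_blocked(init, trans, emit, obs, alphabet_size, k):
    """Forward algorithm that advances k observations per table lookup."""
    steps, table = precompute_blocks(trans, emit, alphabet_size, k)
    alpha = [init[i] * emit[i][obs[0]] for i in range(len(init))]
    t = 1
    while t + k <= len(obs):
        alpha = vec_mat(alpha, table[tuple(obs[t:t + k])])
        t += k
    for s in obs[t:]:  # leftover observations shorter than one block
        alpha = vec_mat(alpha, steps[s])
    return sum(alpha)
```

As a usage example, for a toy two-state model over the DNA alphabet (symbols encoded as 0 to 3), `likelihood_blocked(init, trans, emit, obs, 4, 3)` should agree with `likelihood_naive(init, trans, emit, obs)` up to floating-point rounding, because matrix multiplication is associative. The design trade-off is the table size: it grows as the alphabet size raised to the power k, which is why k has to stay logarithmic in the sequence length.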