Searching for Multiple Words in a Markov Sequence

  • Authors:
  • Yonil Park; John L. Spouge

  • Affiliations:
  • National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA;National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA

  • Venue:
  • INFORMS Journal on Computing
  • Year:
  • 2004

Abstract

The theory of the discrete-time Markovian arrival process (DMAP) can be applied to some statistical problems encountered when searching for multiple words in a Markov sequence. Such word searches are often emphasized in studies of the human genome. There are several advantages to the DMAP approach we present. Most notably, its derivations are transparent, and they readily unify disparate results about the exact distributions of overlapping and nonoverlapping word counts. We also present several examples and applications of our theory, including a numerical study using a random DNA dataset from the human genome.
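The distinction between overlapping and nonoverlapping word counts mentioned in the abstract can be illustrated on a fixed sequence. The following is a minimal sketch, not the authors' DMAP derivation: it counts occurrences in one given string rather than computing an exact distribution over a Markov sequence. The function names `count_overlapping` and `count_nonoverlapping` are hypothetical, the words are assumed to be nonempty, and the nonoverlapping count assumes the common renewal convention in which the search restarts immediately after each counted occurrence.

```python
def count_overlapping(seq, words):
    """Count every occurrence of any word; occurrences may share letters."""
    return sum(
        1
        for start in range(len(seq))
        for w in words
        if seq.startswith(w, start)
    )


def count_nonoverlapping(seq, words):
    """Count occurrences left to right, restarting after each counted word.

    Assumed convention: an occurrence is counted only if it begins at or
    after the end of the previously counted occurrence.
    """
    count = 0
    restart = 0  # index just past the last counted occurrence
    for end in range(1, len(seq) + 1):
        for w in words:
            start = end - len(w)
            if start >= restart and seq.startswith(w, start):
                count += 1
                restart = end
                break  # at most one counted occurrence per end position
    return count


if __name__ == "__main__":
    dna = "ACACAC"
    patterns = ["ACA", "CAC"]
    print(count_overlapping(dna, patterns))
    print(count_nonoverlapping(dna, patterns))
```

In the sequence `ACACAC` with the words `ACA` and `CAC`, every position from 0 to 3 starts an occurrence, whereas the renewal convention counts only the occurrence ending at position 3 and the one ending at position 6.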