Probabilistic Arithmetic Automata and Their Application to Pattern Matching Statistics

  • Authors:
  • Tobias Marschall; Sven Rahmann

  • Affiliations:
  • Bioinformatics for High-Throughput Technologies at the Chair of Algorithm Engineering, Computer Science Department, TU Dortmund, Dortmund, Germany D-44221 (both authors)

  • Venue:
  • CPM '08: Proceedings of the 19th Annual Symposium on Combinatorial Pattern Matching
  • Year:
  • 2008

Abstract

We present probabilistic arithmetic automata (PAAs), which can be used to model chains of operations whose operands depend on chance. We provide two different algorithms to exactly calculate the distribution of the results obtained by such probabilistic calculations. Although we introduce PAAs and the corresponding algorithms in a generic manner, our main concern is their application to pattern matching statistics; that is, we study the distribution of the number of occurrences of a pattern under a given text model. Such calculations play an important role in computational biology because they allow the statistical significance of pattern occurrences to be assessed. To assess the practicability of our method, we apply it to the Prosite database of amino acid motifs and to the Jaspar database of transcription factor binding sites. Regarding the latter, we additionally show that our framework allows binding affinities predicted from a physical model to be taken into account.
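
As a rough illustration of the kind of calculation the abstract describes, the following Python sketch computes the exact distribution of the number of occurrences of a single string pattern in a random text. It is not the authors' PAA construction or either of their two algorithms. It is a minimal dynamic program under simplifying assumptions: the text model is i.i.d. (each character is drawn independently), the pattern is a plain string rather than a Prosite or Jaspar motif, and overlapping occurrences are counted. The names `build_automaton` and `occurrence_distribution` are hypothetical.

```python
from collections import defaultdict


def build_automaton(pattern, alphabet):
    """Deterministic matching automaton (assumed helper, not from the paper).

    State k means: the longest suffix of the text read so far that is also
    a prefix of the pattern has length k. State len(pattern) is accepting.
    """
    m = len(pattern)
    delta = [dict() for _ in range(m + 1)]
    for state in range(m + 1):
        for c in alphabet:
            s = pattern[:state] + c
            k = min(m, len(s))
            # Shrink k until pattern[:k] is a suffix of s (k = 0 always works).
            while k > 0 and s[-k:] != pattern[:k]:
                k -= 1
            delta[state][c] = k
    return delta


def occurrence_distribution(pattern, char_probs, n):
    """Return {count: probability} for a random i.i.d. text of length n."""
    m = len(pattern)
    delta = build_automaton(pattern, list(char_probs))
    # Joint distribution over (automaton state, occurrences counted so far).
    dist = {(0, 0): 1.0}
    for _ in range(n):
        nxt = defaultdict(float)
        for (state, count), p in dist.items():
            for c, pc in char_probs.items():
                target = delta[state][c]
                # Entering the accepting state adds one occurrence.
                hit = 1 if target == m else 0
                nxt[(target, count + hit)] += p * pc
        dist = nxt
    # Marginalize out the automaton state.
    result = defaultdict(float)
    for (_, count), p in dist.items():
        result[count] += p
    return dict(result)


if __name__ == "__main__":
    d = occurrence_distribution("AA", {"A": 0.5, "B": 0.5}, 3)
    for count in sorted(d):
        print(count, d[count])
```

The state of the dynamic program pairs an automaton state with a running count, and each step adds 0 or 1 to that count depending on chance. This is one simple case of a chain of operations with random operands; richer text models or motif classes would need a different automaton and are not covered by this sketch.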