Propositionalisation of Profile Hidden Markov Models for Biological Sequence Analysis

  • Authors:
  • Stefan Mutter;Bernhard Pfahringer;Geoffrey Holmes

  • Affiliations:
  • Department of Computer Science, The University of Waikato, Hamilton, New Zealand;Department of Computer Science, The University of Waikato, Hamilton, New Zealand;Department of Computer Science, The University of Waikato, Hamilton, New Zealand

  • Venue:
  • AI '08 Proceedings of the 21st Australasian Joint Conference on Artificial Intelligence: Advances in Artificial Intelligence
  • Year:
  • 2008

Abstract

Hidden Markov Models (HMMs) are widely used generative models for analysing sequence data. Profile Hidden Markov Models (PHMMs) are a variant used in bioinformatics to represent, for example, protein families. In this paper we introduce a simple propositionalisation method for PHMMs. The method allows PHMMs to be used discriminatively in a classification task. Kernel approaches have previously been proposed to generate a discriminative description for an HMM, but they require the explicit definition of a similarity measure for HMMs. Propositionalisation needs no such measure and allows the use of any propositional learner, including kernel-based approaches. We show empirically that propositionalisation leads to higher accuracies than PHMMs on benchmark datasets.
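The general idea of propositionalisation can be illustrated with a minimal sketch: each variable-length sequence is mapped to a fixed-length feature vector that any propositional learner could consume. The abstract does not say which features the paper extracts, so everything below is an assumption made for illustration: the features are simply the log-likelihoods of the sequence under each trained model, the models are plain discrete HMMs rather than true profile HMMs (no match, insert, or delete states), and the names `forward_loglik`, `propositionalise`, `model_a`, and `model_b` are hypothetical.

```python
import math


def forward_loglik(seq, start, trans, emit):
    """Log-likelihood of a non-empty seq under a discrete HMM (scaled forward algorithm)."""
    n_states = len(start)
    # Joint probability of each state and the first symbol.
    alpha = [start[s] * emit[s][seq[0]] for s in range(n_states)]
    loglik = 0.0
    for t in range(len(seq)):
        if t > 0:
            # Propagate through the transitions, then weight by the emission.
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n_states)) * emit[s][seq[t]]
                for s in range(n_states)
            ]
        # Rescale to avoid underflow; the log of the scale factors sums to log P(seq).
        scale = sum(alpha)
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return loglik


def propositionalise(seq, models):
    """Map a variable-length sequence to one fixed-length feature vector.

    Assumption: one feature per model, namely the sequence log-likelihood.
    """
    return [forward_loglik(seq, *model) for model in models]


# Two toy models over the alphabet {A, B}: (start, transition, emission).
model_a = (
    [0.5, 0.5],
    [[0.9, 0.1], [0.1, 0.9]],
    [{"A": 0.9, "B": 0.1}, {"A": 0.2, "B": 0.8}],
)
model_b = (
    [0.5, 0.5],
    [[0.5, 0.5], [0.5, 0.5]],
    [{"A": 0.5, "B": 0.5}, {"A": 0.5, "B": 0.5}],
)

# Sequences of different lengths yield vectors of the same length.
features_short = propositionalise("ABBA", [model_a, model_b])
features_long = propositionalise("AABBBBAA", [model_a, model_b])
```

The resulting rows (`features_short`, `features_long`) have one column per model regardless of sequence length, which is what lets a standard attribute-value learner be trained on them without defining a similarity measure between HMMs.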