Sequence Modeling with Mixtures of Conditional Maximum Entropy Distributions

  • Authors: Dmitry Pavlov
  • Affiliations: -
  • Venue: ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
  • Year: 2003

Abstract

We present a novel approach to modeling sequences using mixtures of conditional maximum entropy (maxent) distributions. Our method generalizes the mixture of first-order Markov models by including "long-term" dependencies in the model components. The "long-term" dependencies are represented by probabilistic triggers or rules, frequently used in the natural language processing (NLP) domain (such as "A occurred k positions back" ⇒ "the current symbol is B" with probability P). The maxent framework is then used to create a coherent global probabilistic model from all selected triggers. In this paper, we enhance this formalism by using probabilistic mixtures with maxent models as components, thus representing hidden or unobserved effects in the data. We demonstrate how our mixture of conditional maxent models can be learned from data using the generalized EM algorithm, which scales linearly in the dimensions of the data and the number of mixture components. We present empirical results on simulated and real-world data sets and demonstrate that the proposed approach enables us to create better-quality models than mixtures of first-order Markov models, and to resist the overfitting and curse of dimensionality that would inevitably present themselves for higher-order Markov models.
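
To make the abstract's ingredients concrete, the sketch below (in Python with NumPy) gives one possible reading of the method, not the paper's implementation: each mixture component is a conditional maxent model P_k(y | h) ∝ exp(Σ_i λ_ki f_i(h, y)) over binary trigger features of the form "symbol a occurred d positions back", and training alternates an exact E-step with a single gradient M-step per component, which is what makes the EM "generalized". The alphabet size, maximum lag, feature encoding, and learning rate are all illustrative assumptions.

    # Minimal sketch (assumptions flagged): mixture of conditional maxent
    # models over trigger features, trained with generalized EM.
    import numpy as np

    V = 4          # alphabet size (assumed)
    D = 3          # maximum trigger lag (assumed)
    K = 2          # number of mixture components (assumed)
    F = V * D      # one binary feature per (symbol, lag) pair

    def features(seq, t):
        """Trigger features at position t: f[a*D + d-1] = 1
        iff symbol a occurred d positions back."""
        f = np.zeros(F)
        for d in range(1, D + 1):
            if t - d >= 0:
                f[seq[t - d] * D + (d - 1)] = 1.0
        return f

    def log_component(seq, W):
        """Log-likelihood of one sequence under one maxent component.
        W has shape (V, F): one weight vector per output symbol."""
        ll = 0.0
        for t in range(1, len(seq)):
            scores = W @ features(seq, t)            # unnormalized log-probs
            ll += scores[seq[t]] - np.logaddexp.reduce(scores)
        return ll

    def gem_step(seqs, pi, Ws, lr=0.1):
        """One generalized-EM step: exact E-step, single-gradient M-step."""
        # E-step: responsibility of each component for each sequence
        logp = np.array([[np.log(pi[k]) + log_component(s, Ws[k])
                          for k in range(K)] for s in seqs])
        logp -= np.logaddexp.reduce(logp, axis=1, keepdims=True)
        R = np.exp(logp)                             # shape (n_seqs, K)
        # M-step (partial): mixing weights exactly; each component by one
        # gradient step on the expected complete-data log-likelihood
        pi = R.mean(axis=0)
        for k in range(K):
            grad = np.zeros_like(Ws[k])
            for s, r in zip(seqs, R[:, k]):
                for t in range(1, len(s)):
                    f = features(s, t)
                    scores = Ws[k] @ f
                    p = np.exp(scores - np.logaddexp.reduce(scores))
                    grad[s[t]] += r * f              # observed feature counts
                    grad -= r * np.outer(p, f)       # expected feature counts
            Ws[k] += lr * grad / len(seqs)
        return pi, Ws

    rng = np.random.default_rng(0)
    seqs = [rng.integers(0, V, size=20) for _ in range(10)]   # toy data
    pi = np.full(K, 1.0 / K)
    Ws = [rng.normal(0.0, 0.01, (V, F)) for _ in range(K)]
    for _ in range(25):
        pi, Ws = gem_step(seqs, pi, Ws)

Each GEM iteration touches every sequence position once per component, so its cost is linear in the amount of data and in K, matching the scaling claim in the abstract.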