Position Models and Language Modeling

  • Authors:
  • Arnaud Zdziobeck; Franck Thollard

  • Affiliations:
  • Laboratoire Hubert Curien, UMR CNRS 5516, Université de Lyon, Université Jean Monnet, Saint-Étienne (both authors)

  • Venue:
  • SSPR & SPR '08 Proceedings of the 2008 Joint IAPR International Workshop on Structural, Syntactic, and Statistical Pattern Recognition
  • Year:
  • 2008

Abstract

In statistical language modeling, the classic model is the n-gram. However, this model cannot capture long-term dependencies, i.e., dependencies spanning more than n words. An alternative is the probabilistic automaton. Unfortunately, preliminary experiments with this model on language modeling show that it is not yet competitive, partly because it attempts to model dependencies that are too long. We propose to improve the use of this model by restricting the dependency length to a more reasonable value. Experiments show a 45% reduction in perplexity on the Wall Street Journal language modeling task.
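As background for the limitation the abstract describes, an n-gram model conditions each word only on the previous n-1 words, so any dependency longer than n is invisible to it. The following is a minimal illustrative sketch of an add-alpha-smoothed n-gram model with perplexity evaluation; it is not the authors' implementation, and all function names and the smoothing choice are assumptions for illustration.

```python
from collections import defaultdict
import math

def train_ngram(corpus, n):
    """Count n-gram and (n-1)-gram context occurrences over a tokenized corpus."""
    ngrams = defaultdict(int)
    contexts = defaultdict(int)
    for sentence in corpus:
        tokens = ["<s>"] * (n - 1) + sentence + ["</s>"]
        for i in range(len(tokens) - n + 1):
            ngrams[tuple(tokens[i:i + n])] += 1
            contexts[tuple(tokens[i:i + n - 1])] += 1
    return ngrams, contexts

def prob(ngrams, contexts, context, word, vocab_size, alpha=1.0):
    """Add-alpha smoothed conditional probability P(word | context)."""
    num = ngrams[context + (word,)] + alpha
    den = contexts[context] + alpha * vocab_size
    return num / den

def perplexity(ngrams, contexts, corpus, n, vocab_size):
    """Perplexity: exp of the average negative log-probability per token."""
    log_sum, count = 0.0, 0
    for sentence in corpus:
        tokens = ["<s>"] * (n - 1) + sentence + ["</s>"]
        for i in range(n - 1, len(tokens)):
            ctx = tuple(tokens[i - n + 1:i])
            log_sum += -math.log(prob(ngrams, contexts, ctx, tokens[i], vocab_size))
            count += 1
    return math.exp(log_sum / count)
```

Note how `prob` only ever sees the last n-1 tokens: whatever appeared earlier in the sentence cannot affect the prediction. A probabilistic automaton, by contrast, can carry arbitrarily old information in its state, which is why bounding its effective dependency length, as the paper proposes, becomes the tuning knob.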