A dynamic language model based on individual word domains

  • Authors:
  • E. I. Sicilia-Garcia;Ji Ming;F. J. Smith

  • Affiliations:
  • Queen's University of Belfast, Belfast, Northern Ireland;Queen's University of Belfast, Belfast, Northern Ireland;Queen's University of Belfast, Belfast, Northern Ireland

  • Venue:
  • COLING '00 Proceedings of the 18th conference on Computational linguistics - Volume 2
  • Year:
  • 2000

Abstract

We present a new statistical language model based on a combination of individual word language models. Each word model is built from an individual corpus which is formed by extracting those subsets of the entire training corpus which contain that significant word. We also present a novel way of combining language models called the "union model", based on a logical union of intersections, and use this to combine the language models obtained for the significant words from a cache. The initial results with the new model provide a 20% reduction in language model perplexity over the standard 3-gram approach.
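The idea of a per-word corpus can be illustrated with a minimal sketch. The abstract does not say how the training corpus is divided into subsets or how the n-gram models are smoothed, so everything here is an assumption: each "subset" is taken to be one tokenized sentence, an add-one smoothed unigram model stands in for the 3-gram models, and the names `word_domain`, `unigram_model` and `perplexity` are hypothetical. The union model and the cache are not shown, since the abstract does not define them.

```python
import math
from collections import Counter


def word_domain(corpus, word):
    """Return the subsets (here: token lists) of the corpus that contain `word`."""
    return [doc for doc in corpus if word in doc]


def unigram_model(docs, vocab):
    """Add-one smoothed unigram probabilities over `vocab` (a stand-in for a 3-gram model)."""
    counts = Counter(tok for doc in docs for tok in doc)
    total = sum(counts[w] for w in vocab) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}


def perplexity(model, tokens):
    """Perplexity of a token sequence under a unigram model."""
    log_prob = sum(math.log(model[t]) for t in tokens)
    return math.exp(-log_prob / len(tokens))


# Toy training corpus (invented for illustration only).
corpus = [
    "the bank raised interest rates".split(),
    "the river bank was muddy".split(),
    "interest rates fell again".split(),
]
vocab = sorted({tok for doc in corpus for tok in doc})

# Global model from the whole corpus versus a model built only from
# the subsets that contain the significant word "rates".
global_model = unigram_model(corpus, vocab)
rates_model = unigram_model(word_domain(corpus, "rates"), vocab)

test_tokens = "interest rates raised".split()
print(perplexity(global_model, test_tokens))
print(perplexity(rates_model, test_tokens))
```

On this toy data the model built from the "rates" domain assigns a lower perplexity to the in-domain test text than the global model does. That is the effect the word-domain approach relies on, although the toy numbers say nothing about the 20% reduction reported in the abstract.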