A three level cache-based adaptive Chinese language model

  • Authors:
  • Junlin Zhang;Le Sun;Weimin Qu;Lin Du;Yufang Sun

  • Affiliations:
  • Institute of Software, Chinese Academy of Sciences, Beijing, P.R. China;Institute of Software, Chinese Academy of Sciences, Beijing, P.R. China;Institute of Software, Chinese Academy of Sciences, Beijing, P.R. China;Institute of Software, Chinese Academy of Sciences, Beijing, P.R. China;Institute of Software, Chinese Academy of Sciences, Beijing, P.R. China

  • Venue:
  • IJCNLP'04 Proceedings of the First international joint conference on Natural Language Processing
  • Year:
  • 2004

Abstract

Although n-grams have proved very powerful and robust in various tasks involving language models, they have a handicap: because of the Markov assumption, the dependency they capture is limited to a very short local context. This article presents an improved cache-based approach to Chinese statistical language modeling. We extend the cache-based model by introducing a Chinese concept lexicon into it. The cache of the extended language model contains not only the words that occurred recently but also words semantically related to them. Experiments show that the performance of the adaptive model improves greatly.
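
The idea described in the abstract can be illustrated with a small sketch. This is not the authors' model: a static unigram table stands in for the n-gram model, the `lexicon` mapping is a hypothetical concept lexicon, the interpolation weights `lam` and `mu` are arbitrary, and the three cache levels named in the title are not specified in the abstract, so they are not reproduced here.

```python
from collections import Counter, deque


def relative_freqs(words):
    """Return word -> relative frequency, or an empty dict if there are no words."""
    counts = Counter(words)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


class CacheLM:
    """Toy cache-adapted unigram model (illustrative assumption, not the paper's model)."""

    def __init__(self, base_probs, lexicon, cache_size=200, lam=0.2, mu=0.1):
        self.base_probs = base_probs            # static model: word -> probability
        self.lexicon = lexicon                  # hypothetical lexicon: word -> related words
        self.cache = deque(maxlen=cache_size)   # most recently seen words
        self.lam = lam                          # weight of the recent-word cache
        self.mu = mu                            # weight of the related-word cache

    def observe(self, word):
        """Add a word from the running text to the cache."""
        self.cache.append(word)

    def prob(self, word):
        """Interpolate the static model with the recent and related-word caches."""
        recent = relative_freqs(self.cache)
        related = relative_freqs(
            r for w in self.cache for r in self.lexicon.get(w, ())
        )
        # An empty component gets zero weight so the result stays a distribution.
        lam = self.lam if recent else 0.0
        mu = self.mu if related else 0.0
        base = self.base_probs.get(word, 0.0)
        return ((1 - lam - mu) * base
                + lam * recent.get(word, 0.0)
                + mu * related.get(word, 0.0))


if __name__ == "__main__":
    # Romanized toy vocabulary: yinhang (bank), daikuan (loan), tianqi (weather).
    model = CacheLM(
        base_probs={"yinhang": 0.25, "daikuan": 0.25, "tianqi": 0.5},
        lexicon={"yinhang": {"daikuan"}},
    )
    before = model.prob("daikuan")
    model.observe("yinhang")
    after = model.prob("daikuan")
    print(before, after)
```

In this sketch, seeing "yinhang" raises the probability of the related word "daikuan" even though "daikuan" itself has not yet occurred, which is the effect the extended cache is meant to capture.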