Nonlocal language modeling based on context co-occurrence vectors

  • Authors: Sadao Kurohashi; Manabu Ori

  • Affiliations: Kyoto University, Yoshida-honmachi, Sakyo, Kyoto, Japan (both authors)

  • Venue: EMNLP '00 Proceedings of the 2000 Joint SIGDAT conference on Empirical methods in natural language processing and very large corpora: held in conjunction with the 38th Annual Meeting of the Association for Computational Linguistics - Volume 13
  • Year: 2000

Abstract

This paper presents a novel nonlocal language model that exploits contextual information. Word co-occurrence vectors are obtained from a reduced vector space model computed from the co-occurrences of word pairs. The sum of the word co-occurrence vectors represents the context of a document, and the cosine similarity between this context vector and each word co-occurrence vector captures long-distance lexical dependencies. Experiments on the Mainichi Newspaper corpus show a significant improvement in perplexity (5.0% overall and 27.2% on the target vocabulary).
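
As a rough illustration of the two operations named in the abstract (summing word co-occurrence vectors into a context vector, then scoring words by cosine similarity), the following is a minimal sketch. The vocabulary, the three-dimensional vectors, and the function names `context_vector` and `cosine` are all hypothetical and chosen only for the example. The sketch does not show how the reduced vector space is built, nor how the similarity is turned into language-model probabilities, since the abstract does not specify either.

```python
import math

# Hypothetical toy co-occurrence vectors, invented for illustration only.
# In the paper these would come from a reduced vector space model.
WORD_VECTORS = {
    "bank":  [0.9, 0.1, 0.0],
    "loan":  [0.8, 0.2, 0.1],
    "river": [0.1, 0.9, 0.2],
    "money": [0.7, 0.0, 0.3],
}


def context_vector(words, vectors):
    """Sum the co-occurrence vectors of the words seen so far."""
    dim = len(next(iter(vectors.values())))
    total = [0.0] * dim
    for word in words:
        vec = vectors.get(word)
        if vec is None:
            continue  # words without a vector contribute nothing
        for i, value in enumerate(vec):
            total[i] += value
    return total


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Context built from the words observed so far in a document.
history = ["bank", "loan"]
ctx = context_vector(history, WORD_VECTORS)

# Similarity of every vocabulary word to the current context.
scores = {word: cosine(ctx, vec) for word, vec in WORD_VECTORS.items()}
```

With these made-up vectors, a word such as "money" scores higher against the context than "river", which is the kind of long-distance lexical signal the abstract describes.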