Unsupervised learning of dependency structure for language modeling

  • Authors:
  • Jianfeng Gao; Hisami Suzuki

  • Affiliations:
  • Microsoft Research Asia, Beijing, China; Microsoft Research, Redmond, WA

  • Venue:
  • ACL '03: Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics - Volume 1
  • Year:
  • 2003

Abstract

This paper presents a dependency language model (DLM) that captures linguistic constraints via a dependency structure, i.e., a set of probabilistic dependencies that express the relations between headwords of each phrase in a sentence by an acyclic, planar, undirected graph. Our contributions are three-fold. First, we incorporate the dependency structure into an n-gram language model to capture long-distance word dependencies. Second, we present an unsupervised learning method that discovers the dependency structure of a sentence using a bootstrapping procedure. Finally, we evaluate the proposed models on a realistic application (Japanese Kana-Kanji conversion). Experiments show that the best DLM achieves an 11.3% error rate reduction over the word trigram model.
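The abstract describes combining a word n-gram model with headword dependency probabilities. As a rough illustration (not the authors' actual formulation), one simple way to combine the two sources is linear interpolation of a trigram estimate with a head-dependent attachment estimate; all counts, names, and the interpolation weight below are invented for illustration:

```python
# Hypothetical sketch: interpolating a word-trigram probability with a
# headword-dependency probability,
#   P(w | context) ~= lam * P_tri(w | w2, w1) + (1 - lam) * P_dep(w | head),
# where head is the (assumed known) headword that w attaches to.
# This is NOT the paper's model; it only illustrates the general idea.

def make_dlm(trigram_counts, bigram_counts, dep_counts, head_counts, lam=0.7):
    """Return a probability function built from toy count tables."""
    def prob(w, w1, w2, head):
        # Trigram maximum-likelihood estimate; max(..., 1) avoids division by zero.
        p_tri = trigram_counts.get((w2, w1, w), 0) / max(bigram_counts.get((w2, w1), 0), 1)
        # Dependency estimate: how often w attaches to this headword.
        p_dep = dep_counts.get((head, w), 0) / max(head_counts.get(head, 0), 1)
        return lam * p_tri + (1 - lam) * p_dep
    return prob

# Toy counts, purely illustrative.
prob = make_dlm(
    trigram_counts={("the", "dog", "barked"): 2},
    bigram_counts={("the", "dog"): 4},
    dep_counts={("dog", "barked"): 3},
    head_counts={"dog": 6},
)

p = prob("barked", w1="dog", w2="the", head="dog")
print(p)  # 0.7 * 0.5 + 0.3 * 0.5 = 0.5
```

In the paper itself, the dependency structure supplying the head of each word is not given but is learned without supervision via bootstrapping; the sketch assumes the head is already known.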