Using universal linguistic knowledge to guide grammar induction

  • Authors: Tahira Naseem; Harr Chen; Regina Barzilay; Mark Johnson

  • Affiliations: Massachusetts Institute of Technology; Massachusetts Institute of Technology; Massachusetts Institute of Technology; Macquarie University

  • Venue: EMNLP '10: Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
  • Year: 2010

Abstract

We present an approach to grammar induction that utilizes syntactic universals to improve dependency parsing across a range of languages. Our method uses a single set of manually specified, language-independent rules that identify syntactic dependencies between pairs of syntactic categories that commonly occur across languages. During inference of the probabilistic model, we use posterior expectation constraints to require that a minimum proportion of the dependencies we infer be instances of these rules. We also automatically refine the syntactic categories given in our coarsely tagged input. Across six languages, our approach outperforms state-of-the-art unsupervised methods by a significant margin.
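To make the posterior expectation constraint concrete, the following is a minimal sketch, not the authors' implementation. It assumes that edge posteriors for one sentence are already available from some dependency model, and it only checks whether the expected proportion of rule-matching dependencies reaches a threshold. The rule set `UNIVERSAL_RULES`, the tag names, the function names, and the threshold values are all hypothetical and chosen for illustration; the abstract does not list the actual rules or the proportion used.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical (head_tag, dependent_tag) pairs, for illustration only.
# The abstract does not enumerate the actual rule set.
UNIVERSAL_RULES: Set[Tuple[str, str]] = {
    ("ROOT", "VERB"),
    ("VERB", "NOUN"),
    ("NOUN", "ADJ"),
    ("NOUN", "DET"),
}


def expected_rule_proportion(
    tags: List[str],
    edge_posteriors: Dict[Tuple[int, int], float],
    rules: Set[Tuple[str, str]] = UNIVERSAL_RULES,
) -> float:
    """Expected fraction of dependencies that are instances of a rule.

    tags[i] is the coarse tag of token i, with tags[0] == "ROOT".
    edge_posteriors maps (head_index, dependent_index) to the posterior
    probability of that edge (assumed to be computed elsewhere).
    """
    total = sum(edge_posteriors.values())
    if total == 0:
        return 0.0
    matched = sum(
        p
        for (head, dep), p in edge_posteriors.items()
        if (tags[head], tags[dep]) in rules
    )
    return matched / total


def satisfies_constraint(
    tags: List[str],
    edge_posteriors: Dict[Tuple[int, int], float],
    min_proportion: float,
    rules: Set[Tuple[str, str]] = UNIVERSAL_RULES,
) -> bool:
    """True if the expected rule-matching proportion meets the minimum."""
    return expected_rule_proportion(tags, edge_posteriors, rules) >= min_proportion


# Toy sentence "the dog barks" with made-up posteriors; the posteriors
# for each dependent sum to 1.
tags = ["ROOT", "DET", "NOUN", "VERB"]
edge_posteriors = {
    (0, 3): 1.0,  # ROOT -> VERB (matches a rule)
    (3, 2): 0.7,  # VERB -> NOUN (matches a rule)
    (1, 2): 0.3,  # DET  -> NOUN (no rule)
    (2, 1): 0.9,  # NOUN -> DET  (matches a rule)
    (3, 1): 0.1,  # VERB -> DET  (no rule)
}
proportion = expected_rule_proportion(tags, edge_posteriors)
```

In the approach the abstract describes, a constraint of this kind is enforced during inference rather than checked afterwards; the sketch only shows the quantity being constrained.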