Minimized models and grammar-informed initialization for supertagging with highly ambiguous lexicons

  • Authors: Sujith Ravi; Jason Baldridge; Kevin Knight

  • Affiliations: University of Southern California, Marina del Rey, California; The University of Texas at Austin, Austin, Texas; University of Southern California, Marina del Rey, California

  • Venue: ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
  • Year: 2010

Abstract

We combine two complementary ideas for learning supertaggers from highly ambiguous lexicons: grammar-informed tag transitions and models minimized via integer programming. Each strategy on its own greatly improves performance over basic expectation-maximization training with a bitag Hidden Markov Model, which we show on the CCGbank and CCG-TUT corpora. The strategies provide further error reductions when combined. We describe a new two-stage integer programming strategy that efficiently deals with the high degree of ambiguity on these datasets while obtaining the full effect of model minimization.
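
To make the idea of model minimization concrete, the following is a minimal sketch, not the integer program described in the paper. It assumes that "minimizing" means finding the smallest set of tag bigrams that can still tag every sentence under the lexicon, and it finds that set by brute-force enumeration on a toy example. The function name `minimal_bigram_set`, the toy lexicon, and the sentences are all hypothetical and chosen only for illustration.

```python
from itertools import product


def minimal_bigram_set(sentences, lexicon):
    """Return a smallest set of tag bigrams that suffices to tag every sentence.

    Brute force: enumerate every tagging the lexicon allows for each sentence,
    then try every combination of one tagging per sentence and keep the
    combination whose union of tag bigrams is smallest. This is exponential
    and only feasible for toy inputs; it stands in for an integer program.
    """
    per_sentence = []
    for words in sentences:
        options = []
        # Every tag sequence licensed by the lexicon for this sentence.
        for tags in product(*(lexicon[w] for w in words)):
            # The set of adjacent tag pairs (bigrams) this tagging uses.
            options.append(frozenset(zip(tags, tags[1:])))
        per_sentence.append(options)

    best = None
    for combo in product(*per_sentence):
        used = frozenset().union(*combo)
        if best is None or len(used) < len(best):
            best = used
    return best


# Hypothetical ambiguous lexicon with CCG-style categories as tags.
lexicon = {
    "the": ["NP/N"],
    "dog": ["N", "NP"],
    "barks": ["S\\NP", "N"],
    "cat": ["N"],
    "sleeps": ["S\\NP"],
}
sentences = [["the", "dog", "barks"], ["the", "cat", "sleeps"]]

result = minimal_bigram_set(sentences, lexicon)
print(sorted(result))
```

In this toy case the second sentence is unambiguous, so its two bigrams are forced, and the smallest model reuses them to tag the first sentence rather than introducing new tag pairs. On lexicons as ambiguous as those the abstract describes, enumeration of this kind is infeasible, which is why the authors turn to integer programming.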