Smoothing fine-grained PCFG lexicons

  • Authors:
  • Tejaswini Deoskar; Mats Rooth; Khalil Sima'an

  • Affiliations:
  • University of Amsterdam; Cornell University; University of Amsterdam

  • Venue:
  • IWPT '09 Proceedings of the 11th International Conference on Parsing Technologies
  • Year:
  • 2009

Abstract

We present an approach for smoothing treebank-PCFG lexicons by interpolating treebank lexical parameter estimates with estimates obtained from unannotated data via the Inside-outside algorithm. The PCFG has complex lexical categories, making relative-frequency estimates from a treebank very sparse. This kind of smoothing for complex lexical categories results in improved parsing performance, with a particular advantage in identifying obligatory arguments subcategorized by verbs unseen in the treebank.
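
As a rough illustration of the interpolation idea in the abstract, the sketch below mixes a treebank relative-frequency estimate of P(word | category) with an estimate derived from expected counts, such as those an inside-outside pass over unannotated text might produce. The abstract does not state the exact interpolation scheme, so the fixed linear weight `lam`, the function names, the category label `VB.np`, and all counts here are assumptions made only for this example, not the authors' method or data.

```python
from collections import defaultdict


def relative_frequencies(counts):
    """Normalize {(category, word): count} into P(word | category)."""
    totals = defaultdict(float)
    for (category, _word), count in counts.items():
        totals[category] += count
    return {
        (category, word): count / totals[category]
        for (category, word), count in counts.items()
        if totals[category] > 0
    }


def interpolate(p_treebank, p_unannotated, lam=0.5):
    """Linearly interpolate two lexical distributions.

    lam weights the treebank estimate; (1 - lam) weights the estimate
    from unannotated data. A fixed lam is an assumption of this sketch.
    The result sums to 1 per category only when that category occurs in
    both input distributions.
    """
    keys = set(p_treebank) | set(p_unannotated)
    return {
        key: lam * p_treebank.get(key, 0.0)
        + (1.0 - lam) * p_unannotated.get(key, 0.0)
        for key in keys
    }


# Hypothetical treebank counts for a fine-grained verb category.
treebank_counts = {("VB.np", "eat"): 3, ("VB.np", "see"): 1}

# Hypothetical expected counts from unannotated data; "devour" is
# unseen in the treebank counts above.
expected_counts = {
    ("VB.np", "eat"): 2.0,
    ("VB.np", "see"): 1.0,
    ("VB.np", "devour"): 1.0,
}

p_tb = relative_frequencies(treebank_counts)
p_un = relative_frequencies(expected_counts)
p_smooth = interpolate(p_tb, p_un, lam=0.5)

for (category, word), prob in sorted(p_smooth.items()):
    print(category, word, round(prob, 3))
```

In this toy setting the verb absent from the treebank counts receives nonzero probability under the fine-grained category after interpolation, which is the kind of effect the abstract associates with verbs unseen in the treebank.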