Large-scale corpus-driven PCFG approximation of an HPSG

  • Authors:
  • Yi Zhang;Hans-Ulrich Krieger

  • Affiliations:
  • LT-Lab, DFKI GmbH, Saarbrücken, Germany;LT-Lab, DFKI GmbH, Saarbrücken, Germany

  • Venue:
  • IWPT '11 Proceedings of the 12th International Conference on Parsing Technologies
  • Year:
  • 2011

Abstract

We present a novel corpus-driven approach to grammar approximation for a linguistically deep Head-driven Phrase Structure Grammar (HPSG). With an unlexicalized probabilistic context-free grammar (PCFG) obtained by maximum likelihood estimation on a large-scale automatically annotated corpus, we achieve higher parsing accuracy than the original HPSG-based model. We propose and compare different ways of enriching the annotations carried by the approximating PCFG. A comparison with a state-of-the-art latent-variable PCFG shows that our approach is better suited to the grammar approximation task, where training data can be acquired automatically. The best approximating PCFG achieves a ParsEval F1 score of 84.13%. The high robustness of the PCFG suggests that it is a viable way of achieving full-coverage parsing with hand-written deep linguistic grammars.
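
As an illustration of the estimation step the abstract mentions, the following is a minimal sketch of maximum likelihood estimation for a PCFG from a set of parse trees: each rule A -> β receives the probability count(A -> β) / count(A). It is not the authors' implementation. The nested-tuple tree encoding, the function names `rules` and `estimate_pcfg`, and the two-tree toy treebank are assumptions made for this example, and the annotation enrichment described in the abstract is not modelled.

```python
from collections import Counter

# Assumed tree encoding (hypothetical, for illustration only):
# a tree is (label, children); a child that is a plain string is a word.


def rules(tree):
    """Yield one (lhs, rhs) pair for every internal node of the tree."""
    label, children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    yield label, rhs
    for child in children:
        if not isinstance(child, str):
            yield from rules(child)


def estimate_pcfg(treebank):
    """Relative-frequency (maximum likelihood) estimate of rule probabilities."""
    rule_counts = Counter(rule for tree in treebank for rule in rules(tree))
    lhs_counts = Counter()
    for (lhs, _), n in rule_counts.items():
        lhs_counts[lhs] += n
    # P(lhs -> rhs) = count(lhs -> rhs) / count(lhs)
    return {rule: n / lhs_counts[rule[0]] for rule, n in rule_counts.items()}


# Toy treebank standing in for the automatically annotated corpus.
toy_treebank = [
    ("S", [("NP", ["dogs"]), ("VP", [("V", ["bark"])])]),
    ("S", [("NP", ["cats"]), ("VP", [("V", ["see"]), ("NP", ["dogs"])])]),
]

probs = estimate_pcfg(toy_treebank)
for (lhs, rhs), p in sorted(probs.items()):
    print(f"{lhs} -> {' '.join(rhs)}  {p:.3f}")
```

Because the estimate is a ratio of counts, the probabilities of all rules sharing a left-hand side sum to one, and enlarging the training corpus only requires recounting.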