Linguistically-motivated grammar extraction, generalization and adaptation

  • Authors:
  • Yu-Ming Hsieh;Duen-Chi Yang;Keh-Jiann Chen

  • Affiliations:
  • Institute of Information Science, Academia Sinica, Taipei;Institute of Information Science, Academia Sinica, Taipei;Institute of Information Science, Academia Sinica, Taipei

  • Venue:
  • IJCNLP'05: Proceedings of the Second International Joint Conference on Natural Language Processing
  • Year:
  • 2005

Abstract

To obtain a grammar with both high precision and high coverage, we proposed a model to measure grammar coverage and designed a PCFG parser to measure the efficiency of the grammar. To generalize the grammar, we proposed a grammar binarization method that increases the coverage of a probabilistic context-free grammar. At the same time, linguistically-motivated feature constraints were added to the grammar rules to maintain the precision of the grammar. In parsing Chinese sentences, the generalized grammar increases grammar coverage from 93% to 99% and the bracketing F-score from 87% to 91%. To cope with errors propagated from word segmentation and part-of-speech tagging, we also proposed a grammar blending method that adapts to such errors. The blended grammar reduces by about 20-30% the parsing errors caused by erroneous part-of-speech assignments made by a word segmentation system.
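
The abstract does not spell out the binarization scheme, so the following is only a minimal sketch of one common way binarization can raise coverage, not the authors' method. It assumes a single intermediate symbol per parent category (written `VP'` here) that forgets which children came before it. The rule symbols and the helper names `binarize` and `derives` are hypothetical and chosen for illustration.

```python
def binarize(lhs, rhs):
    """Split an n-ary rule into binary rules sharing one intermediate symbol.

    Assumption: the intermediate symbol (lhs + "'") keeps no history of the
    children already consumed, which is what lets rules recombine.
    """
    rhs = tuple(rhs)
    if len(rhs) <= 2:
        return [(lhs, rhs)]
    mid = f"{lhs}'"
    rules = [(lhs, (rhs[0], mid))]
    for child in rhs[1:-2]:
        rules.append((mid, (child, mid)))
    rules.append((mid, rhs[-2:]))
    return rules


def derives(rules, lhs, rhs):
    """Check whether the binarized rule set can expand lhs into the flat rhs."""
    rhs = tuple(rhs)
    if len(rhs) <= 2:
        return (lhs, rhs) in rules
    mid = lhs if lhs.endswith("'") else f"{lhs}'"
    return (lhs, (rhs[0], mid)) in rules and derives(rules, mid, rhs[1:])


# Hypothetical flat rules, as if read off a small treebank.
flat = {
    ("VP", ("Adv", "V", "NP")),
    ("VP", ("Neg", "V", "NP", "PP")),
}

# Binarized grammar built from the same observations.
binary = {rule for lhs, rhs in flat for rule in binarize(lhs, rhs)}

# An expansion never observed as a flat rule.
unseen = ("VP", ("Adv", "V", "NP", "PP"))

print("covered by flat grammar:     ", unseen in flat)
print("covered by binarized grammar:", derives(binary, *unseen))
```

In this sketch the binarized grammar also accepts other unobserved combinations, such as `Neg V NP`. Such over-generation is consistent with the abstract's point that feature constraints are added to the rules to maintain precision.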