Semi-automatically developing Chinese HPSG grammar from the Penn Chinese Treebank for deep parsing

  • Authors:
  • Kun Yu; Yusuke Miyao; Xiangli Wang; Takuya Matsuzaki; Junichi Tsujii

  • Affiliations:
  • The University of Tokyo; National Institute of Informatics; The University of Tokyo; The University of Tokyo; The University of Tokyo and The University of Manchester

  • Venue:
  • COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics: Posters
  • Year:
  • 2010

Abstract

In this paper, we introduce our recent work on Chinese HPSG grammar development through treebank conversion. By manually defining grammatical constraints and annotation rules, we convert the bracketed trees in the Penn Chinese Treebank (CTB) into an HPSG treebank. A large-scale lexicon is then automatically extracted from the HPSG treebank. Experimental results on CTB 6.0 show that an HPSG lexicon was successfully extracted with 97.24% accuracy; furthermore, the obtained lexicon achieved 98.51% lexical coverage and 76.51% sentential coverage on unseen text, which is comparable to state-of-the-art results for English.
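
The two coverage figures can be illustrated with a minimal sketch. It assumes a common reading of the terms, which the abstract itself does not define: lexical coverage is the share of test tokens whose gold (word, lexical entry) pair is present in the extracted lexicon, and sentential coverage is the share of test sentences in which every token is covered. The function name `coverage`, the entry labels, and the toy data are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set, Tuple

# A test sentence is a list of (word, gold_lexical_entry) pairs.
# Lexical entries are plain strings here; this is an illustrative
# simplification, not the paper's representation of HPSG lexical entries.
Sentence = List[Tuple[str, str]]


def coverage(lexicon: Dict[str, Set[str]],
             sentences: List[Sentence]) -> Tuple[float, float]:
    """Return (lexical_coverage, sentential_coverage) under the assumed definitions."""
    total_tokens = 0
    covered_tokens = 0
    covered_sentences = 0
    for sentence in sentences:
        # A token counts as covered if the lexicon lists its gold entry for that word.
        hits = sum(1 for word, entry in sentence
                   if entry in lexicon.get(word, set()))
        total_tokens += len(sentence)
        covered_tokens += hits
        # A sentence counts as covered only if all of its tokens are covered.
        if hits == len(sentence):
            covered_sentences += 1
    lexical = covered_tokens / total_tokens if total_tokens else 0.0
    sentential = covered_sentences / len(sentences) if sentences else 0.0
    return lexical, sentential


# Hypothetical toy lexicon and test set.
toy_lexicon = {"我": {"NP"}, "吃": {"V_trans"}, "饭": {"NP"}}
toy_test = [
    [("我", "NP"), ("吃", "V_trans"), ("饭", "NP")],  # every token covered
    [("我", "NP"), ("吃", "V_intrans")],              # "吃" lacks this entry
]
lexical_cov, sentential_cov = coverage(toy_lexicon, toy_test)
```

Under these assumed definitions a single uncovered token is enough to make a whole sentence uncovered, which is one reason sentential coverage can sit well below lexical coverage.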