A deterministic method to predict phrase boundaries of a syntactic tree

  • Authors:
  • Zhaoxia Dong;Tiejun Zhao

  • Affiliations:
  • MOE-MS Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology, Harbin, China;MOE-MS Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology, Harbin, China

  • Venue:
  • ICIC'10 Proceedings of the Advanced intelligent computing theories and applications, and 6th international conference on Intelligent computing
  • Year:
  • 2010

Abstract

We present a deterministic model that predicts all the phrase boundaries of a syntactic tree, covering both base constituent boundaries and nested constituent boundaries. The model uses only word and part-of-speech (POS) information, whereas general parsers also use phrase type information. The model is divided into two stages, which are ultimately realized as four classification sub-models. Tested on Penn Treebank Section 23 with gold-standard POS tags, our model achieves an F-score comparable to that of the Stanford parser's PCFG model and factored model. This shows that phrase boundary identification can be done without phrase labels while achieving results comparable to the Stanford parser.
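
To make the notion of label-free phrase boundaries concrete, the following minimal sketch extracts unlabeled constituent spans from a Penn Treebank-style bracketing and scores a predicted tree against a gold tree by span overlap. It is an illustration only, not the authors' model or their evaluation protocol: the function names `parse_spans` and `f_score` and the example sentences are hypothetical, single-word spans are dropped by assumption, and standard evaluation tools apply further conventions that this sketch omits.

```python
def parse_spans(bracketed):
    """Return (words, spans) for a bracketed tree string.

    spans is a set of (start, end) word-index pairs, end exclusive.
    Phrase labels are read and discarded, so only boundaries remain.
    """
    tokens = bracketed.replace("(", " ( ").replace(")", " ) ").split()
    words = []
    spans = set()
    stack = []  # start index of every constituent that is still open
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "(":
            stack.append(len(words))
            i += 1
            # Skip the label if there is one (assumption: the label is
            # the token immediately following the opening bracket).
            if i < len(tokens) and tokens[i] not in ("(", ")"):
                i += 1
        elif tok == ")":
            start = stack.pop()
            end = len(words)
            # Assumption: single-word spans (POS-level nodes) are not
            # counted as phrase boundaries.
            if end - start > 1:
                spans.add((start, end))
            i += 1
        else:
            words.append(tok)
            i += 1
    return words, spans


def f_score(gold, pred):
    """Unlabeled F-score between two sets of (start, end) spans."""
    correct = len(gold & pred)
    if correct == 0:
        return 0.0
    precision = correct / len(pred)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)


gold_tree = "(S (NP (DT The) (NN cat)) (VP (VBD sat) (PP (IN on) (NP (DT the) (NN mat)))))"
pred_tree = "(S (NP (DT The) (NN cat)) (VP (VBD sat) (PP (IN on)) (NP (DT the) (NN mat))))"

words, gold_spans = parse_spans(gold_tree)
_, pred_spans = parse_spans(pred_tree)
print(words)
print(sorted(gold_spans))
print(round(f_score(gold_spans, pred_spans), 4))
```

In this example the predicted tree misses the prepositional phrase spanning "on the mat", so four of its four multi-word spans are correct but only four of the five gold spans are recovered. Because the comparison uses spans alone, a system that outputs boundaries without phrase types can still be scored against a labeled treebank.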