Adding semantic roles to the chinese treebank

  • Authors:
  • Nianwen Xue;Martha Palmer

  • Affiliations:
  • Department of linguistics and center for spoken language research, university of colorado at boulder, co, u.s.a. e-mail: nianwen.xue@colorado.edu;Department of linguistics and center for spoken language research, university of colorado at boulder, co, u.s.a. e-mail: nianwen.xue@colorado.edu

  • Venue:
  • Natural Language Engineering
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

We report work on adding semantic role labels to the Chinese Treebank, a corpus already annotated with phrase structures. The work involves locating all verbs and their nominalizations in the corpus, and semi-automatically adding semantic role labels to their arguments, which are constituents in a parse tree. Although the same procedure is followed, different issues arise in the annotation of verbs and nominalized predicates. For verbs, identifying their arguments is generally straightforward given their syntactic structure in the Chinese Treebank as they tend to occupy well-defined syntactic positions. Our discussion focuses on the syntactic variations in the realization of the arguments as well as our approach to annotating dislocated and discontinuous arguments. In comparison, identifying the arguments for nominalized predicates is more challenging and we discuss criteria and procedures for distinguishing arguments from non-arguments. In particular we focus on the role of support verbs as well as the relevance of event/result distinctions in the annotation of the predicate-argument structure of nominalized predicates. We also present our approach to taking advantage of the syntactic structure in the Chinese Treebank to bootstrap the predicate-argument structure annotation of verbs. Finally, we discuss the creation of a lexical database of frame files and its role in guiding predicate-argument annotation. Procedures for ensuring annotation consistency and inter-annotator agreement evaluation results are also presented.