Automatic semantic role labeling for Chinese verbs

  • Authors:
  • Nianwen Xue;Martha Palmer

  • Affiliations:
  • CIS Department, University of Pennsylvania, Philadelphia, PA;CIS Department, University of Pennsylvania, Philadelphia, PA

  • Venue:
  • IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence
  • Year:
  • 2005

Abstract

Recent years have seen a revived interest in semantic parsing by applying statistical and machine-learning methods to semantically annotated corpora such as FrameNet and the Proposition Bank. So far, much of the research has focused on English due to the lack of semantically annotated resources in other languages. In this paper, we report first results on semantic role labeling using a pre-release version of the Chinese Proposition Bank. Since the Chinese Proposition Bank is superimposed on top of the Chinese Treebank, i.e., the semantic role labels are assigned to constituents in a treebank parse tree, we start by reporting results on experiments using the hand-crafted parses in the treebank. This will give us a measure of the extent to which the semantic role labels can be bootstrapped from the syntactic annotation in the treebank. We will then report experiments using a fully automatic Chinese parser that integrates word segmentation, POS-tagging and parsing. This will gauge how successfully semantic role labeling can be done for Chinese in realistic situations. We show that our results using hand-crafted parses are slightly higher than the results reported for the state-of-the-art semantic role labeling systems for English using the Penn English Proposition Bank data, even though the Chinese Proposition Bank is smaller in size. When an automatic parser is used, however, the accuracy of our system is much lower than the English state-of-the-art. This reveals an interesting cross-linguistic difference between the two languages, which we attempt to explain. We also describe a method to induce verb classes from the Proposition Bank "frame files" that can be used to improve semantic role labeling.