Annotating a Japanese text corpus with predicate-argument and coreference relations

  • Authors:
  • Ryu Iida; Mamoru Komachi; Kentaro Inui; Yuji Matsumoto

  • Affiliations:
  • Nara Institute of Science and Technology, Ikoma, Nara, Japan (all authors)

  • Venue:
  • LAW '07 Proceedings of the Linguistic Annotation Workshop
  • Year:
  • 2007

Abstract

In this paper, we discuss how to annotate coreference and predicate-argument relations in Japanese written text. Previous efforts to build Japanese text corpora annotated with these relations include the Kyoto Text Corpus version 4.0 (Kawahara et al., 2002) and the GDA-Tagged Corpus (Hasida, 2005). However, their specifications still leave considerable room for refinement. We therefore discuss issues in annotating these two types of relations and propose a new specification for each. Following these specifications, we built a large-scale annotated corpus and examined its reliability. The resulting corpus, released as the NAIST Text Corpus, is used as the evaluation data set for the coreference and zero-anaphora resolution tasks in Iida et al. (2005) and Iida et al. (2006).