Semantic annotation of a Japanese speech corpus

  • Authors: John Fry; Francis Bond

  • Affiliations: Stanford University, Stanford, CA; NTT Communication Science Laboratories, Kyoto, Japan

  • Venue: Proceedings of the COLING-2000 Workshop on Semantic Annotation and Intelligent Content

  • Year: 2000

Abstract

This paper describes the semantic annotations we are performing on the CallHome Japanese corpus of spontaneous, unscripted telephone conversations (LDC, 1996). Our annotations include (i) semantic classes for all nouns and verbs; (ii) verb senses for all main verbs; and (iii) relations between main verbs and their complements in the same utterance. Our semantic tagset is taken from NTT's Goi-Taikei semantic lexicon and ontology (Ikehara et al., 1997). A pilot study demonstrates that the verb sense tagging can be performed efficiently by native Japanese speakers using computer-generated HTML forms, and that good inter-annotator reliability can be obtained under the right conditions.