A Chinese corpus with word sense annotation

  • Authors:
  • Yunfang Wu; Peng Jin; Yangsen Zhang; Shiwen Yu

  • Affiliations:
  • Institute of Computational Linguistics, Peking University, Beijing, China (all four authors)

  • Venue:
  • ICCPOL'06 Proceedings of the 21st international conference on Computer Processing of Oriental Languages: beyond the orient: the research challenges ahead
  • Year:
  • 2006

Abstract

This paper presents the construction of a Chinese word sense-tagged corpus. The resulting lexical resource comprises three main components: 1) a corpus annotated with word senses; 2) a lexicon containing sense distinctions and descriptions in a feature-based formalism; 3) links between the sense entries in the lexicon and CCD synsets. A dynamic model is proposed for building the three knowledge bases simultaneously and interactively. Because consistency is a persistent difficulty in constructing semantic resources, the paper also describes a strategy for improving it. The inter-annotator agreement on the sense-tagged corpus is satisfactory. The database is expected to grow into a powerful lexical resource both for linguistic research on Chinese lexical semantics and for word sense disambiguation.
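
The three components named in the abstract can be pictured as linked records. The Python sketch below is only an illustration under stated assumptions: the class names, field names, sense identifiers, synset identifier, and sample entries are all hypothetical and are not taken from the paper. It shows one simple consistency check that such a layout allows, namely flagging corpus tags that have no matching sense entry in the lexicon.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SenseEntry:
    """One sense of a word in the lexicon (hypothetical layout)."""
    word: str
    sense_id: str
    # Feature-based description of the sense; feature names are assumptions.
    features: Dict[str, str] = field(default_factory=dict)
    # Identifiers of linked CCD synsets; the ID format is an assumption.
    synset_ids: List[str] = field(default_factory=list)


@dataclass
class TaggedToken:
    """One corpus token; sense_id stays None for words that are not sense-tagged."""
    word: str
    sense_id: Optional[str] = None


def dangling_tags(corpus: List[TaggedToken],
                  lexicon: List[SenseEntry]) -> List[TaggedToken]:
    """Return tagged tokens whose (word, sense_id) pair is absent from the lexicon."""
    known = {(entry.word, entry.sense_id) for entry in lexicon}
    return [tok for tok in corpus
            if tok.sense_id is not None and (tok.word, tok.sense_id) not in known]


# Invented sample data, for illustration only.
lexicon = [
    SenseEntry("打", "s1", {"pos": "v"}, ["syn-001"]),
    SenseEntry("打", "s2"),
]
corpus = [
    TaggedToken("我"),          # untagged token
    TaggedToken("打", "s1"),    # tag present in the lexicon
    TaggedToken("打", "s3"),    # tag missing from the lexicon
]

problems = dangling_tags(corpus, lexicon)
print(len(problems))
```

A check of this kind would only catch tags that point at nothing; it says nothing about whether two annotators chose the same sense, which is what inter-annotator agreement measures.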