Scaling up word sense disambiguation via parallel texts

  • Authors:
  • Yee Seng Chan; Hwee Tou Ng

  • Affiliations:
  • Department of Computer Science, National University of Singapore, Singapore; Department of Computer Science, National University of Singapore, Singapore

  • Venue:
  • AAAI'05 Proceedings of the 20th national conference on Artificial intelligence - Volume 3
  • Year:
  • 2005

Abstract

A critical problem faced by current supervised word sense disambiguation (WSD) systems is the lack of manually annotated training data. Tackling this data acquisition bottleneck is crucial for building high-accuracy, wide-coverage WSD systems. In this paper, we show that the approach of automatically gathering training examples from parallel texts is scalable to a large set of nouns. We conducted our evaluation on the nouns of the SENSEVAL-2 English all-words task, using fine-grained sense scoring. Our evaluation shows that training on examples gathered from 680MB of parallel texts achieves accuracy comparable to the best system of the SENSEVAL-2 English all-words task, and significantly outperforms the baseline of always choosing sense 1 of WordNet.
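
To make the baseline mentioned in the abstract concrete, the following is a minimal Python sketch of a "always choose sense 1" classifier scored by exact sense match. It is not the authors' code. The sense inventory, the sense labels, and the test instances are all made-up placeholders, and the scoring function is a simplification that assumes one gold sense per instance.

```python
from typing import Dict, List, Tuple

# Hypothetical toy sense inventory (NOT real WordNet data): each noun maps
# to its senses in listed order, so index 0 plays the role of "sense 1".
SENSE_INVENTORY: Dict[str, List[str]] = {
    "bank": ["bank%financial", "bank%river"],
    "plant": ["plant%factory", "plant%flora"],
}


def first_sense_baseline(noun: str) -> str:
    """Return the first listed sense of the noun, ignoring all context."""
    return SENSE_INVENTORY[noun][0]


def accuracy(predictions: List[str], gold: List[str]) -> float:
    """Fraction of instances whose predicted sense exactly matches the gold sense.

    Simplifying assumption: exactly one gold sense per instance.
    """
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        return 0.0
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)


# Made-up test instances (noun, gold sense), for illustration only.
TEST_INSTANCES: List[Tuple[str, str]] = [
    ("bank", "bank%financial"),
    ("bank", "bank%river"),
    ("plant", "plant%flora"),
    ("plant", "plant%factory"),
]

baseline_predictions = [first_sense_baseline(noun) for noun, _ in TEST_INSTANCES]
gold_senses = [sense for _, sense in TEST_INSTANCES]
baseline_accuracy = accuracy(baseline_predictions, gold_senses)
print(f"first-sense baseline accuracy: {baseline_accuracy:.2f}")
```

Under the same simplification, a system trained on examples gathered from parallel texts would be scored by replacing `first_sense_baseline` with its own context-dependent predictor and calling the same `accuracy` function.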