An unsupervised method for multilingual word sense tagging using parallel corpora: a preliminary investigation

  • Authors:
  • Mona Diab

  • Affiliations:
  • University of Maryland, MD

  • Venue:
  • WWSM '00 Proceedings of the ACL-2000 workshop on Word senses and multi-linguality - Volume 8
  • Year:
  • 2000

Abstract

With a growing number of languages reaching our desktops every day via the Internet, researchers have come to realize that linguistic knowledge resources are lacking for scarcely represented or studied languages. In an attempt to bootstrap some of the required linguistic resources for some of those languages, this paper presents an unsupervised method for automatic multilingual word sense tagging using parallel corpora. The method is evaluated on the English Brown corpus and its translation into three different languages: French, German and Spanish. A preliminary evaluation of the proposed method yielded an accuracy of up to 79% for the English data, measured on 81.8% of the manually tagged SemCor data.