Scaling up WSD with automatically generated examples

  • Authors:
  • Weiwei Cheng;Judita Preiss;Mark Stevenson

  • Affiliations:
  • Sheffield University, Regent Court, Portobello, Sheffield, United Kingdom;Sheffield University, Regent Court, Portobello, Sheffield, United Kingdom;Sheffield University, Regent Court, Portobello, Sheffield, United Kingdom

  • Venue:
  • BioNLP '12 Proceedings of the 2012 Workshop on Biomedical Natural Language Processing
  • Year:
  • 2012

Abstract

The most accurate approaches to Word Sense Disambiguation (WSD) for biomedical documents are based on supervised learning. However, these require manually labeled training examples, which are expensive to create, and consequently supervised WSD systems are normally limited to disambiguating a small set of ambiguous terms. An alternative approach is to create labeled training examples automatically and use them as a substitute for manually labeled ones. This paper describes a large-scale WSD system based on automatically labeled examples generated using information from the UMLS Metathesaurus. The labeled examples are generated without any use of labeled training data whatsoever, and the system is therefore completely unsupervised (unlike some previous approaches). The system is evaluated on two widely used data sets and found to outperform a state-of-the-art unsupervised approach which also uses information from the UMLS Metathesaurus.