Word sense disambiguation with pattern learning and automatic feature selection

  • Authors: Rada F. Mihalcea
  • Affiliations: Department of Computer Science, University of North Texas, Denton, TX 76203-1366, USA (e-mail: rada@cs.unt.edu)
  • Venue: Natural Language Engineering
  • Year: 2002

Abstract

This paper presents a novel approach to word sense disambiguation. The underlying algorithm has two main components: (1) pattern learning from available sense-tagged corpora (SemCor), from dictionary definitions (WordNet), and from a generated corpus (GenCor); and (2) instance-based learning with automatic feature selection, applied when training data is available for a particular word. The ideas described in this paper were implemented in a system that achieves excellent performance on the data provided during the SENSEVAL-2 evaluation exercise, for both the English all-words and English lexical sample tasks.
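
To make the second component more concrete, the following is a minimal sketch of instance-based learning combined with automatic feature selection. The abstract does not specify the learner, the similarity measure, or the selection procedure, so everything here is an assumption chosen for illustration: a 1-nearest-neighbour classifier, a simple feature-overlap similarity, and greedy forward selection scored by leave-one-out accuracy. The feature names (`prev`, `next`, `noise`), the sense labels, and the toy instances for the word "bank" are hypothetical and are not taken from the paper or from SENSEVAL-2.

```python
from typing import Dict, List, Sequence, Tuple

# One training instance: (feature name -> value, sense label).
Instance = Tuple[Dict[str, str], str]


def overlap(a: Dict[str, str], b: Dict[str, str], features: Sequence[str]) -> int:
    """Count the selected features on which two instances agree."""
    return sum(1 for f in features if a.get(f) is not None and a.get(f) == b.get(f))


def classify(query: Dict[str, str], train: Sequence[Instance],
             features: Sequence[str]) -> str:
    """1-nearest-neighbour: return the sense of the most similar stored instance.

    Ties are broken in favour of the earliest instance in `train`.
    """
    best = max(train, key=lambda inst: overlap(query, inst[0], features))
    return best[1]


def loo_accuracy(data: List[Instance], features: Sequence[str]) -> float:
    """Leave-one-out accuracy when only `features` are used for matching."""
    hits = 0
    for i, (feats, sense) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        if classify(feats, rest, features) == sense:
            hits += 1
    return hits / len(data)


def forward_select(data: List[Instance],
                   candidates: Sequence[str]) -> Tuple[List[str], float]:
    """Greedy forward selection: add the feature that most improves
    leave-one-out accuracy, and stop when no feature improves it."""
    selected: List[str] = []
    best = 0.0
    improved = True
    while improved:
        improved = False
        pick = None
        for f in candidates:
            if f in selected:
                continue
            acc = loo_accuracy(data, selected + [f])
            if acc > best:
                best, pick, improved = acc, f, True
        if pick is not None:
            selected.append(pick)
    return selected, best


# Hypothetical sense-tagged instances for the ambiguous word "bank".
DATA: List[Instance] = [
    ({"prev": "river", "next": "erosion", "noise": "a"}, "shore"),
    ({"prev": "river", "next": "was", "noise": "b"}, "shore"),
    ({"prev": "muddy", "next": "erosion", "noise": "a"}, "shore"),
    ({"prev": "savings", "next": "account", "noise": "b"}, "finance"),
    ({"prev": "savings", "next": "was", "noise": "a"}, "finance"),
    ({"prev": "central", "next": "account", "noise": "b"}, "finance"),
]

if __name__ == "__main__":
    chosen, score = forward_select(DATA, ["prev", "next", "noise"])
    print("selected features:", chosen, "leave-one-out accuracy:", round(score, 3))
    print("predicted sense:", classify({"prev": "savings", "next": "loan"}, DATA, chosen))
```

The point of the sketch is that feature selection is performed per word from that word's own training instances, which is only possible "when training data is available for a particular word". A feature that does not improve held-out accuracy (here the deliberately uninformative `noise`) is simply never added.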