The Johns Hopkins SENSEVAL2 system descriptions

  • Authors:
  • David Yarowsky;Silviu Cucerzan;Radu Florian;Charles Schafer;Richard Wicentowski

  • Affiliations:
  • Johns Hopkins University, Baltimore, Maryland;Johns Hopkins University, Baltimore, Maryland;Johns Hopkins University, Baltimore, Maryland;Johns Hopkins University, Baltimore, Maryland;Johns Hopkins University, Baltimore, Maryland

  • Venue:
  • SENSEVAL '01 The Proceedings of the Second International Workshop on Evaluating Word Sense Disambiguation Systems
  • Year:
  • 2001

Abstract

This article describes the Johns Hopkins University (JHU) sense disambiguation systems that participated in seven SENSEVAL2 tasks: four supervised lexical choice systems (Basque, English, Spanish, Swedish), one unsupervised lexical choice system (Italian) and two supervised all-words systems (Czech, Estonian). The common core supervised system used voting-based classifier combination over several diverse systems, including decision lists (Yarowsky, 2000), a cosine-based vector model and two Bayesian classifiers. The classifiers employed a rich set of features, including words, lemmas and part-of-speech information modeled in several syntactic relationships (e.g. verb-object), bag-of-words context and local collocational n-grams. The all-words systems relied heavily on morphological analysis in the two highly inflected languages. The unsupervised Italian system was a hierarchical class model using the Italian WordNet.
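
To make the idea of voting-based classifier combination concrete, the following is a minimal Python sketch, not the JHU implementation. It assumes plurality voting with ties broken in favor of the earliest classifier, and a cosine model that compares a bag-of-words context against one count vector per sense. The abstract does not specify the voting scheme, weighting, or feature representation, so the function names (`cosine`, `cosine_classifier`, `vote`), the sense labels, and the toy data are all hypothetical.

```python
from collections import Counter
import math


def cosine(a, b):
    """Cosine similarity between two sparse bag-of-words count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cosine_classifier(centroids):
    """Build a classifier that returns the sense with the most similar centroid.

    `centroids` maps each (hypothetical) sense label to a word-count vector.
    """
    def classify(context_words):
        vec = Counter(context_words)
        return max(centroids, key=lambda sense: cosine(vec, centroids[sense]))
    return classify


def vote(classifiers, context_words):
    """Plurality vote over classifier outputs.

    Assumption: ties go to the sense proposed by the earliest classifier.
    """
    predictions = [clf(context_words) for clf in classifiers]
    counts = Counter(predictions)
    best = max(counts.values())
    return next(p for p in predictions if counts[p] == best)


if __name__ == "__main__":
    # Toy sense centroids, invented purely for illustration.
    centroids = {
        "bank/finance": Counter({"money": 3, "loan": 2}),
        "bank/river": Counter({"river": 3, "water": 2}),
    }
    # Two stand-in classifiers play the role of the other systems in the vote.
    classifiers = [
        cosine_classifier(centroids),
        lambda ctx: "bank/finance",
        lambda ctx: "bank/river",
    ]
    print(vote(classifiers, ["she", "took", "a", "loan", "of", "money"]))
```

In a fuller system each entry in `classifiers` would be a trained model (for example a decision list or a Bayesian classifier) over the richer feature set the abstract lists; the vote itself only needs each one to return a sense label.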