Graph connectivity measures for unsupervised parameter tuning of graph-based sense induction systems

  • Authors:
  • Ioannis Korkontzelos;Ioannis Klapaftis;Suresh Manandhar

  • Affiliations:
  • The University of York, York, UK;The University of York, York, UK;The University of York, York, UK

  • Venue:
  • UMSLLS '09 Proceedings of the Workshop on Unsupervised and Minimally Supervised Learning of Lexical Semantics
  • Year:
  • 2009

Abstract

Word Sense Induction (WSI) is the task of identifying the different senses (uses) of a target word in a given text. This paper focuses on the unsupervised estimation of the free parameters of a graph-based WSI method, and explores the use of eight Graph Connectivity Measures (GCM) that assess the degree of connectivity in a graph. Given a target word and a set of parameters, the GCM evaluate the connectivity of the produced clusters, which correspond to subgraphs of the initial (unclustered) graph. Each parameter setting is assigned a score according to one of the GCM, and the highest-scoring setting is then selected. Our evaluation on the nouns of the SemEval-2007 WSI task (SWSI) shows that: (1) all GCM estimate a set of parameters that significantly outperforms the worst-performing parameter setting in both SWSI evaluation schemes, (2) all GCM estimate a set of parameters that outperforms the Most Frequent Sense (MFS) baseline by a statistically significant amount in the supervised evaluation scheme, and (3) two of the measures estimate a set of parameters that performs close to a set of parameters estimated in a supervised manner.
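
The selection step described in the abstract (score each parameter setting by the connectivity of its clusters, then keep the highest-scoring setting) can be illustrated with a minimal sketch. The abstract does not name the eight measures, so average edge density of the cluster subgraphs is used here purely as an assumed stand-in; the function names, the toy graph, and the setting labels are all hypothetical and not taken from the paper.

```python
from itertools import combinations


def edge_density(nodes, edges):
    """Fraction of possible undirected edges that exist among `nodes`.

    Illustrative connectivity measure only; an assumption, not
    necessarily one of the paper's eight GCM.
    """
    nodes = set(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0
    present = sum(
        1
        for u, v in combinations(sorted(nodes), 2)
        if frozenset((u, v)) in edges
    )
    return present / (n * (n - 1) / 2)


def score_clustering(clusters, edges):
    """Average the connectivity measure over all cluster subgraphs."""
    if not clusters:
        return 0.0
    return sum(edge_density(c, edges) for c in clusters) / len(clusters)


def select_setting(clusterings, edges):
    """Return the parameter setting whose clustering scores highest."""
    return max(
        clusterings,
        key=lambda setting: score_clustering(clusterings[setting], edges),
    )


# Hypothetical toy graph: undirected edges stored as frozensets.
edges = {
    frozenset(e)
    for e in [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e")]
}

# Hypothetical clusterings produced under two parameter settings.
clusterings = {
    "setting_1": [{"a", "b", "c"}, {"d", "e"}],
    "setting_2": [{"a", "b", "d"}, {"c", "e"}],
}

best = select_setting(clusterings, edges)
print(best)
```

In this toy case the first setting groups vertices that are fully connected to one another, so it receives the higher score. No labelled data is consulted at any point, which is what makes the selection unsupervised.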