New experiments in distributional representations of synonymy

  • Authors: Dayne Freitag; Matthias Blume; John Byrnes; Edmond Chow; Sadik Kapadia; Richard Rohwer; Zhiqiang Wang

  • Affiliations: HNC Software, LLC, San Diego, CA (all authors)

  • Venue: CONLL '05 Proceedings of the Ninth Conference on Computational Natural Language Learning
  • Year: 2005

Abstract

Recent work on the problem of detecting synonymy through corpus analysis has used the Test of English as a Foreign Language (TOEFL) as a benchmark. However, this test comprises as few as 80 questions, which raises doubts about the statistical significance of reported results. We overcome this limitation by using WordNet to generate a TOEFL-like test that contains thousands of questions and is composed only of words occurring with sufficient corpus frequency to support sound distributional comparisons. Experiments with this test lead us to a similarity measure that significantly outperforms the best proposed to date. Analysis suggests that a strength of this measure is its relative robustness against polysemy.
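
To make the two ideas in the abstract concrete, here is a minimal Python sketch. It is not the authors' procedure, which the abstract does not spell out. It assumes a toy synonym dictionary standing in for WordNet, invented corpus counts, a hypothetical frequency cutoff `min_freq`, and four answer choices per question in the style of TOEFL items. The helper `margin_95` uses the standard normal approximation to a binomial proportion to show why a test of 80 questions gives a much less certain accuracy estimate than one with thousands.

```python
import math
import random


def margin_95(accuracy, n_questions):
    """Half-width of a 95% normal-approximation interval for an accuracy."""
    return 1.96 * math.sqrt(accuracy * (1.0 - accuracy) / n_questions)


def make_questions(synonyms, freq, min_freq, n_choices=4, seed=0):
    """Build multiple-choice synonym questions from sufficiently frequent words.

    synonyms: word -> set of synonyms (toy stand-in for WordNet, an assumption)
    freq:     word -> corpus count (invented numbers for illustration)
    Returns a list of (target, choices, correct_answer) tuples.
    """
    rng = random.Random(seed)
    # Keep only words frequent enough to support distributional comparison.
    vocab = sorted(w for w, count in freq.items() if count >= min_freq)
    vocab_set = set(vocab)
    questions = []
    for target in vocab:
        related = set(synonyms.get(target, ()))
        answers = sorted(w for w in related if w in vocab_set and w != target)
        if not answers:
            continue  # no frequent synonym, so no question for this word
        answer = rng.choice(answers)
        # Decoys are frequent words that are neither the target nor a synonym.
        decoys = [w for w in vocab if w != target and w not in related]
        if len(decoys) < n_choices - 1:
            continue
        choices = rng.sample(decoys, n_choices - 1) + [answer]
        rng.shuffle(choices)
        questions.append((target, choices, answer))
    return questions


# Hypothetical data: "scarce" is too rare to appear in any question.
synonyms = {
    "big": {"large"}, "large": {"big"},
    "quick": {"fast"}, "fast": {"quick"},
    "rare": {"scarce"},
}
freq = {
    "big": 900, "large": 850, "quick": 400, "fast": 700, "rare": 120,
    "scarce": 3, "green": 500, "river": 300, "table": 650,
}
questions = make_questions(synonyms, freq, min_freq=100)

for target, choices, answer in questions:
    print(target, choices, "answer:", answer)

# Uncertainty of a hypothetical 80% accuracy at two test sizes.
for n in (80, 8000):
    print(n, "questions: +/-", round(100 * margin_95(0.8, n), 1), "points")
```

Under these assumptions, a measured accuracy of 80% on 80 questions carries a margin of roughly 9 percentage points, so differences of a few points between methods cannot be distinguished reliably. With 8000 questions the margin shrinks by a factor of ten. The accuracy value and the test size of 8000 are illustrative only and are not results from the paper.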