Combining heterogeneous knowledge resources for improved distributional semantic models

  • Authors:
  • György Szarvas;Torsten Zesch;Iryna Gurevych

  • Affiliations:
  • Ubiquitous Knowledge Processing Lab, Computer Science Department, Technische Universität Darmstadt, Darmstadt, Germany;Ubiquitous Knowledge Processing Lab, Computer Science Department, Technische Universität Darmstadt, Darmstadt, Germany;Ubiquitous Knowledge Processing Lab, Computer Science Department, Technische Universität Darmstadt, Darmstadt, Germany

  • Venue:
  • CICLing'11 Proceedings of the 12th international conference on Computational linguistics and intelligent text processing - Volume Part I
  • Year:
  • 2011

Abstract

The Explicit Semantic Analysis (ESA) model, based on term co-occurrences in Wikipedia, has been regarded as a state-of-the-art semantic relatedness measure in recent years. We provide an analysis of the important parameters of ESA using datasets in five different languages. Additionally, we propose using ESA with multiple lexical semantic resources, thereby exploiting multiple sources of evidence of term co-occurrence to improve over the Wikipedia-based measure. Exploiting the improved robustness and coverage of the proposed combination, we report improved performance over single resources in four tasks: word semantic relatedness, solving word choice problems, classification of semantic relations between nominals, and text similarity.
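
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy "resources", the smoothed TF-IDF weighting, and the combination by concatenating the concept spaces of several resources are all assumptions made for illustration, and the function names (`build_index`, `combined_vector`, `relatedness`) are hypothetical. Each term is represented as a weighted vector over the concepts (e.g. articles or dictionary entries) whose text contains it, and relatedness is the cosine of two such vectors.

```python
import math
from collections import Counter, defaultdict


def build_index(concepts):
    """Map each term to a weighted vector over the concepts that mention it.

    `concepts` maps a concept id to its text. The weight is a smoothed
    TF-IDF score (an assumption; other weighting schemes are possible).
    """
    n = len(concepts)
    tf = {cid: Counter(text.lower().split()) for cid, text in concepts.items()}
    df = Counter(term for counts in tf.values() for term in counts)
    index = defaultdict(dict)
    for cid, counts in tf.items():
        for term, count in counts.items():
            index[term][cid] = count * math.log(1 + n / df[term])
    return index


def combined_vector(term, indexes):
    """Concatenate a term's concept vectors from several resources.

    Concept ids are prefixed with the resource name so that the
    concept spaces of different resources stay disjoint.
    """
    vec = {}
    for name, index in indexes.items():
        for cid, weight in index.get(term, {}).items():
            vec[(name, cid)] = weight
    return vec


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(c, 0.0) for c, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(
        sum(w * w for w in v.values())
    )
    return dot / norm if norm else 0.0


def relatedness(a, b, indexes):
    """Relatedness of two terms over one or more resources."""
    return cosine(combined_vector(a, indexes), combined_vector(b, indexes))


# Toy stand-ins for an encyclopedia and a dictionary (invented data).
encyclopedia = build_index({
    "Pet": "cat dog pet animal",
    "Feline": "cat lion tiger",
    "Vehicle": "car truck engine road",
})
dictionary = build_index({
    "gloss_cat": "cat small domesticated feline",
    "gloss_dog": "dog domesticated canine",
    "gloss_kitten": "kitten young cat",
})

single = {"encyclopedia": encyclopedia}
both = {"encyclopedia": encyclopedia, "dictionary": dictionary}

print(relatedness("cat", "dog", single))
print(relatedness("kitten", "cat", single))  # "kitten" is not covered here
print(relatedness("kitten", "cat", both))
```

In this toy setup the term "kitten" occurs only in the second resource, so the single-resource measure cannot relate it to "cat", while the combined measure can. This mirrors the coverage argument of the abstract, under the assumptions stated above.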