Semi-automatic enrichment of crowdsourced synonymy networks: the WISIGOTH system applied to Wiktionary

  • Authors:
  • Franck Sajous;Emmanuel Navarro;Bruno Gaume;Laurent Prévot;Yannick Chudy

  • Affiliations:
  • CLLE-ERSS, CNRS & Université de Toulouse, Toulouse, France;IRIT, CNRS & Université de Toulouse, Toulouse, France;CLLE-ERSS, CNRS & Université de Toulouse, Toulouse, France;LPL, CNRS & Université de Provence, Provence, France;CLLE-ERSS, CNRS & Université de Toulouse, Toulouse, France

  • Venue:
  • Language Resources and Evaluation
  • Year:
  • 2013

Abstract

Semantic lexical resources are a mainstay of various Natural Language Processing applications. However, comprehensive and reliable resources are rare and seldom freely available. Handcrafted resources are too costly to be a general solution, while automatically built resources need to be validated by experts or at least thoroughly evaluated. In this paper, we give an overview of the current situation with regard to lexical resources, their construction and their evaluation. We give an in-depth description of Wiktionary, a freely available and collaboratively built multilingual dictionary, which we present as a promising raw resource for NLP. We propose a semi-automatic approach based on random walks for enriching Wiktionary's synonymy network, using both endogenous and exogenous data. We take advantage of the wiki infrastructure to propose a validation "by crowds". Finally, we present an implementation called WISIGOTH, which supports our approach.
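
To make the random-walk idea concrete, here is a minimal sketch of how candidate synonyms might be ranked using only the synonymy graph itself (the endogenous case). It is not the authors' implementation: the function names, the toy word list, the added self-loops, and the walk length are all illustrative assumptions. The ranked candidates would then be shown to contributors for validation rather than inserted automatically.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency dict from (word, word) synonymy pairs.

    Assumption: every node also gets a self-loop, so a walker may stay put.
    """
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    for node in list(adj):
        adj[node].add(node)
    return adj


def walk_distribution(adj, start, steps):
    """Return P(walker is at node) after `steps` uniform random-walk steps."""
    dist = {start: 1.0}
    for _ in range(steps):
        nxt = defaultdict(float)
        for node, p in dist.items():
            share = p / len(adj[node])  # move uniformly to any neighbour
            for nb in adj[node]:
                nxt[nb] += share
        dist = dict(nxt)
    return dist


def suggest_synonyms(adj, word, steps=3, top=3):
    """Rank words that are not yet linked to `word` by walk probability."""
    dist = walk_distribution(adj, word, steps)
    candidates = [(w, p) for w, p in dist.items() if w not in adj[word]]
    # Highest probability first; ties broken alphabetically for stability.
    candidates.sort(key=lambda wp: (-wp[1], wp[0]))
    return candidates[:top]


# Toy synonymy network (hypothetical data, for illustration only).
toy_edges = [
    ("big", "large"),
    ("large", "huge"),
    ("big", "great"),
    ("huge", "enormous"),
    ("great", "grand"),
]
toy_graph = build_graph(toy_edges)
for candidate, score in suggest_synonyms(toy_graph, "big", steps=2):
    print(candidate, round(score, 3))
```

Short walks keep the suggestions local to the neighbourhood of the target word. An exogenous variant could run the same walk on a graph that also includes edges derived from other data, such as translation links or shared glosses.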