Semi-automatic endogenous enrichment of collaboratively constructed lexical resources: piggybacking onto Wiktionary

Authors:
Franck Sajous;Emmanuel Navarro;Bruno Gaume;Laurent Prévot;Yannick Chudy
Affiliations:
CLLE-ERSS, CNRS & Université de Toulouse;IRIT, CNRS & Université de Toulouse;CLLE-ERSS, CNRS & Université de Toulouse;LPL, CNRS & Université de Toulouse;CLLE-ERSS, CNRS & Université de Toulouse
Venue:
IceTAL'10 Proceedings of the 7th international conference on Advances in natural language processing
Year:
2010

Abstract

The lack of large-scale, freely available and durable lexical resources, and its consequences for NLP, are widely acknowledged, but attempts to cope with the usual bottlenecks preventing their development often result in dead ends. This article introduces a language-independent, semi-automatic and endogenous method for enriching lexical resources, based on collaborative editing and random walks through existing lexical relationships, and shows how this approach enables us to overcome recurrent impediments. It compares the impact of using different data sources and similarity measures on the task of improving synonymy networks. Finally, it defines an architecture for applying the presented method to Wiktionary and explains how it has been implemented.
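
To make the idea of random walks through existing lexical relationships more concrete, here is a minimal sketch in Python. It is not the authors' implementation: the toy edge list, the function names, the uniform transition probabilities and the number of steps are all assumptions chosen for illustration. It ranks words that are not yet linked to a target word by the probability that a short random walk starting from the target reaches them, which is one plausible way to propose candidate synonyms for a contributor to validate.

```python
from collections import defaultdict

# Hypothetical toy synonymy edges; a real resource would supply these.
TOY_EDGES = [
    ("big", "large"), ("large", "huge"), ("huge", "enormous"),
    ("big", "great"), ("great", "huge"),
    ("tiny", "small"), ("small", "little"),
]

def build_graph(edges):
    """Build an undirected adjacency map from (word, word) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)

def random_walk_distribution(graph, start, steps):
    """Probability of being at each node after `steps` uniform random-walk steps."""
    dist = {start: 1.0}
    for _ in range(steps):
        nxt = defaultdict(float)
        for node, prob in dist.items():
            neighbours = graph[node]
            share = prob / len(neighbours)  # uniform choice among neighbours
            for neighbour in neighbours:
                nxt[neighbour] += share
        dist = dict(nxt)
    return dist

def suggest_candidates(graph, word, steps=2, top_k=3):
    """Rank words not already linked to `word` by random-walk probability."""
    dist = random_walk_distribution(graph, word, steps)
    known = graph[word] | {word}
    ranked = sorted(
        ((node, prob) for node, prob in dist.items() if node not in known),
        key=lambda item: (-item[1], item[0]),
    )
    return ranked[:top_k]
```

Calling `suggest_candidates(build_graph(TOY_EDGES), "big")` on this toy graph proposes "huge", which is reachable through both "large" and "great" but not yet linked to "big". In a semi-automatic setting, such suggestions would be shown to a human editor rather than added automatically.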