A linguistically grounded graph model for bilingual lexicon extraction

Authors:
Florian Laws;Lukas Michelbacher;Beate Dorow;Christian Scheible;Ulrich Heid;Hinrich Schütze
Affiliations:
Universität Stuttgart;Universität Stuttgart;Universität Stuttgart;Universität Stuttgart;Universität Stuttgart;Universität Stuttgart
Venue:
COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics: Posters
Year:
2010

Abstract

We present a new method, based on graph theory, for bilingual lexicon extraction without relying on resources with limited availability like parallel corpora. The graphs we use represent linguistic relations between words such as adjectival modification. We experiment with a number of ways of combining different linguistic relations and present a novel method, multi-edge extraction (MEE), that is both modular and scalable. We evaluate MEE on adjectives, verbs and nouns and show that it is superior to cooccurrence-based extraction (which does not use linguistic analysis). Finally, we publish a reproducible baseline to establish an evaluation benchmark for bilingual lexicon extraction.