Analyzing methods for improving precision of pivot based bilingual dictionaries

Authors:
Xabier Saralegi;Iker Manterola;Iñaki San Vicente
Affiliations:
R&D Elhuyar Foundation, Usurbil, Basque Country;R&D Elhuyar Foundation, Usurbil, Basque Country;R&D Elhuyar Foundation, Usurbil, Basque Country
Venue:
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Year:
2011

Abstract

An A-C bilingual dictionary can be inferred by merging A-B and B-C dictionaries, using B as the pivot language. However, polysemous pivot words often produce wrong translation candidates. This paper analyzes two methods for pruning wrong candidates: one exploits the structure of the source dictionaries, and the other uses distributional similarity computed from comparable corpora. As both methods depend exclusively on easily available resources, they are well suited to less-resourced languages. We studied whether the two techniques complement each other, given that they are based on different paradigms. We also investigated how to combine them, looking for the combination best suited to each of several application scenarios.
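
The pivot step described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy Spanish-English-French entries, the function names pivot_merge and shared_pivot_count, and the pivot-counting score, which is only a simplified stand-in for a structure-based method and is not taken from the paper.

```python
from collections import defaultdict

# Toy dictionaries (hypothetical data): A = Spanish, B = English (pivot), C = French.
ab = {
    "banco": {"bank", "bench"},
    "orilla": {"bank", "shore"},
}
bc = {
    "bank": {"banque", "rive"},   # polysemous pivot: financial bank / river bank
    "bench": {"banc"},
    "shore": {"rive", "rivage"},
}

def pivot_merge(ab, bc):
    """Infer A-C candidates: pair a with c whenever some pivot word b links them."""
    ac = defaultdict(set)
    for a, pivots in ab.items():
        for b in pivots:
            for c in bc.get(b, ()):
                ac[a].add(c)
    return dict(ac)

def shared_pivot_count(ab, bc, a, c):
    """Count the distinct pivot words linking a to c.

    Hypothetical structure-based score: a candidate reached through
    several pivots is less likely to be an artifact of one polysemous pivot.
    """
    return sum(1 for b in ab.get(a, ()) if c in bc.get(b, ()))

ac = pivot_merge(ab, bc)

# "banco" receives the wrong candidate "rive" through the polysemous pivot "bank".
print(sorted(ac["banco"]))

# Rank the candidates for "orilla" by the number of pivots that support them.
ranked = sorted(ac["orilla"], key=lambda c: -shared_pivot_count(ab, bc, "orilla", c))
print(ranked[0])
```

In this toy data, "rive" is reached from "orilla" through two pivots ("bank" and "shore"), while the wrong candidate "banque" is reached through one. A score based on distributional similarity would need context vectors built from comparable corpora, which this sketch does not attempt.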