An approach to acquire word translations from non-parallel texts

Authors:
Pablo Gamallo Otero;José Ramom Pichel Campos
Affiliations:
Department de Língua Espanhola, Faculdade de Filologia, Universidade de Santiago de Compostela, Galiza, Spain;Department de Tecnologia Linguística da Imaxin, Software, Santiago de Compostela, Galiza
Venue:
EPIA'05 Proceedings of the 12th Portuguese conference on Progress in Artificial Intelligence
Year:
2005

Abstract

Few approaches to extracting word translations from non-parallel texts have been proposed so far. Researchers have had little incentive to work on this topic, because extracting information from non-parallel corpora is a difficult task that yields poor results. Whereas word translation extraction from parallel texts can reach an accuracy of about 99%, the accuracy for non-parallel texts has been around 72% up to now. The approach presented here, which relies on the prior extraction of bilingual pairs of lexico-syntactic templates from parallel corpora, improves significantly on this figure: about 89% of word translations are identified correctly.
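
The abstract does not spell out how the template pairs are used, so the following is only a minimal sketch of one plausible reading: each bilingual template pair acts as a shared context dimension, words in each language are described by how often they fill those templates, and candidate translations are ranked by the overlap of their profiles. The seed templates, the toy English/Spanish occurrences, the names `SEED_TEMPLATES`, `context_vector`, `dice` and `rank_translations`, and the choice of the Dice coefficient are all assumptions made for illustration, not the authors' algorithm or data.

```python
from collections import Counter

# Hypothetical seed lexicon of bilingual template pairs (source, target).
# In the paper such pairs are extracted from parallel corpora; these are invented.
SEED_TEMPLATES = [
    ("drink <X>", "beber <X>"),
    ("glass of <X>", "vaso de <X>"),
    ("<X> boil", "<X> hervir"),
    ("read <X>", "leer <X>"),
]


def context_vector(occurrences, side):
    """Count how often a word fills each seed template.

    side is 0 for the source language and 1 for the target language.
    Templates that are not in the seed lexicon are ignored.
    """
    index = {pair[side]: i for i, pair in enumerate(SEED_TEMPLATES)}
    return Counter(index[t] for t in occurrences if t in index)


def dice(u, v):
    """Dice coefficient over shared template-pair dimensions (assumed measure)."""
    shared = sum(min(u[k], v[k]) for k in u.keys() & v.keys())
    total = sum(u.values()) + sum(v.values())
    return 2 * shared / total if total else 0.0


def rank_translations(source_vec, target_vecs):
    """Return target words sorted from most to least similar to the source word."""
    return sorted(
        target_vecs,
        key=lambda word: dice(source_vec, target_vecs[word]),
        reverse=True,
    )


# Toy observations from two non-parallel corpora (invented for illustration).
water_en = context_vector(
    ["drink <X>", "glass of <X>", "<X> boil", "drink <X>"], side=0
)
targets = {
    "agua": context_vector(["beber <X>", "vaso de <X>", "<X> hervir"], side=1),
    "libro": context_vector(["leer <X>", "leer <X>"], side=1),
}

ranking = rank_translations(water_en, targets)
print(ranking)
```

In this toy setting the source word shares three template dimensions with "agua" and none with "libro", so "agua" ranks first. A real system would need far more template pairs and some weighting of the raw counts.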