This paper studies how to enhance the comparability of bilingual corpora in order to improve the quality of the bilingual lexicons extracted from them. We introduce a clustering-based approach to enhancing corpus comparability that exploits the homogeneity of the corpus and preserves most of the vocabulary of the original corpus. Our experiments support the soundness of this method and show that the bilingual lexicons obtained from the homogeneous corpus are of better quality than the lexicons obtained with previous approaches.