Translating collocations for bilingual lexicons: a statistical approach
Computational Linguistics
SIGIR '02 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
The mathematics of statistical machine translation: parameter estimation
Computational Linguistics - Special issue on using large corpora: II
Stochastic inversion transduction grammars and bilingual parsing of parallel corpora
Computational Linguistics
An algorithm for finding noun phrase correspondences in bilingual corpora
ACL '93 Proceedings of the 31st annual meeting on Association for Computational Linguistics
Acquisition of phrase-level bilingual correspondence using dependency structure
COLING '00 Proceedings of the 18th conference on Computational linguistics - Volume 2
BLEU: a method for automatic evaluation of machine translation
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Statistical phrase-based translation
NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Synonymous collocation extraction using translation information
ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
Improved statistical alignment models
ACL '00 Proceedings of the 38th Annual Meeting on Association for Computational Linguistics
HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
Moses: open source toolkit for statistical machine translation
ACL '07 Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions
Improving word alignment using syntactic dependencies
SSST '08 Proceedings of the Second Workshop on Syntax and Structure in Statistical Translation
Task-based evaluation of multiword expressions: a pilot study in statistical machine translation
HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
DOMCAT: a bilingual concordancer for domain-specific computer assisted translation
ACL '12 Proceedings of the ACL 2012 System Demonstrations
Hi-index | 0.00 |
In this paper, we present an algorithm for extracting translations of any given multiword expression from parallel corpora. Given a multiword expression to be translated, the method involves extracting a short list of target candidate words from parallel corpora based on scores of normalized frequency, generating possible translations and filtering out common subsequences, and selecting the top-n possible translations using the Dice coefficient. Experiments show that our approach outperforms the word alignment-based and other naive association-based methods. We also demonstrate that adopting the extracted translations can significantly improve the performance of the Moses machine translation system.