Dynamic translation memory: using statistical machine translation to improve translation memory fuzzy matches

Authors:
Ergun Biçici;Marc Dymetman
Affiliations:
Koç University, Istanbul, Turkey;Xerox Research Centre Europe, Grenoble, France
Venue:
CICLing'08 Proceedings of the 9th international conference on Computational linguistics and intelligent text processing
Year:
2008

Abstract

Professional translators of technical documents often use Translation Memory (TM) systems to capitalize on the repetitions frequently observed in these documents. TM systems typically exploit not only complete matches between the source sentence to be translated and some previously translated sentence, but also so-called fuzzy matches, where the source sentence has substantial commonality with a previously translated sentence. These fuzzy matches can be a valuable starting point for the human translator, but the translator then needs to manually edit the associated TM-based translation to accommodate the differences with the source sentence to be translated. If part of this process could be automated, the cost of human translation could be significantly reduced. The paper proposes to perform this automation in the following way: a phrase-based Statistical Machine Translation (SMT) system (trained on a bilingual corpus in the same domain as the TM) is combined with the TM fuzzy match, by extracting from the fuzzy match a large (possibly gapped) bi-phrase that is dynamically added to the usual set of "static" bi-phrases used for decoding the source. We report experiments that show significant improvements in terms of BLEU and NIST scores over both the translations produced by the stand-alone SMT system and the fuzzy-match translations proposed by the stand-alone TM system.
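
The idea of turning a fuzzy match into a gapped bi-phrase can be illustrated with a small sketch. The abstract does not give the extraction procedure, so everything below is an assumption made for illustration: the function names, the `<gap>` marker, the toy memory entries, the hand-written word alignment, and the use of Python's `difflib.SequenceMatcher` as a stand-in for a TM system's fuzzy-match score. The sketch keeps the source words shared with the fuzzy match, projects them through the alignment, and replaces everything else with a gap that a decoder would fill using the static bi-phrases.

```python
from difflib import SequenceMatcher

GAP = "<gap>"  # hypothetical marker for a gap in a bi-phrase


def fuzzy_score(a, b):
    """Token-level similarity in [0, 1]; a stand-in for a TM fuzzy-match score."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def best_fuzzy_match(source, memory):
    """Return the TM entry whose source side is most similar to `source`."""
    return max(memory, key=lambda entry: fuzzy_score(source, entry["src"]))


def gapped(tokens, keep):
    """Keep tokens whose index is in `keep`; collapse each other run into one GAP."""
    out = []
    for i, tok in enumerate(tokens):
        if i in keep:
            out.append(tok)
        elif not out or out[-1] != GAP:
            out.append(GAP)
    return out


def dynamic_biphrase(source, entry):
    """Build a (possibly gapped) bi-phrase from the part of the TM entry
    that matches `source`. Assumes `entry["align"]` holds (src_idx, tgt_idx)
    word-alignment pairs, which a real system would have to compute."""
    matcher = SequenceMatcher(None, source, entry["src"], autojunk=False)
    kept_src = set()
    for block in matcher.get_matching_blocks():
        kept_src.update(range(block.b, block.b + block.size))
    kept_tgt = {t for s, t in entry["align"] if s in kept_src}
    return gapped(entry["src"], kept_src), gapped(entry["tgt"], kept_tgt)


# Toy translation memory (invented data, English to French).
memory = [
    {
        "src": "press the red button to stop the engine".split(),
        "tgt": "appuyez sur le bouton rouge pour arrêter le moteur".split(),
        "align": [(0, 0), (0, 1), (1, 2), (2, 4), (3, 3),
                  (4, 5), (5, 6), (6, 7), (7, 8)],
    },
    {
        "src": "open the cover".split(),
        "tgt": "ouvrez le capot".split(),
        "align": [(0, 0), (1, 1), (2, 2)],
    },
]

source = "press the green button to stop the engine".split()
match = best_fuzzy_match(source, memory)
src_side, tgt_side = dynamic_biphrase(source, match)
print(" ".join(src_side))
print(" ".join(tgt_side))
```

In this toy case the mismatched word ("red" versus "green") becomes the gap on both sides, so the dynamic bi-phrase covers the rest of the sentence and only the gap has to be translated from the static bi-phrase table. How the actual system weights the dynamic bi-phrase against static ones during decoding is not stated in the abstract and is not modeled here.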