Translationese and its dialects

Authors:
Moshe Koppel; Noam Ordan
Affiliations:
Bar Ilan University, Ramat-Gan, Israel; University of Haifa, Haifa, Israel
Venue:
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Year:
2011


Abstract

While it has often been observed that the product of translation is somehow different from non-translated text, scholars have emphasized two distinct bases for such differences. Some have noted interference from the source language spilling over into translation in a source-language-specific way, while others have noted general effects of the process of translation that are independent of source language. Using a series of text categorization experiments, we show that both these effects exist and that, moreover, there is a continuum between them. There are many effects of translation that are consistent among texts translated from a given source language, some of which are consistent even among texts translated from families of source languages. Significantly, we find that even for widely unrelated source languages and multiple genres, differences between translated texts and non-translated texts are sufficient for a learned classifier to accurately determine if a given text is translated or original.
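To make the idea of a "text categorization experiment" concrete, the following is a minimal sketch of a translated-versus-original classifier. It is not the authors' setup: the feature set (relative frequencies of a short, hypothetical function-word list), the nearest-centroid classifier, and the toy training sentences are all assumptions chosen for illustration, and the sentences are invented rather than drawn from any corpus.

```python
import math
from collections import Counter

# Hypothetical, deliberately tiny function-word list (an assumption for this sketch).
FUNCTION_WORDS = ["the", "of", "and", "to", "in", "that",
                  "which", "however", "moreover", "thus"]


def features(text):
    """Relative frequency of each function word in a whitespace-tokenized text."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = max(len(tokens), 1)  # guard against empty input
    return [counts[word] / total for word in FUNCTION_WORDS]


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def train(labelled_texts):
    """Return one centroid per label from (text, label) pairs."""
    by_label = {}
    for text, label in labelled_texts:
        by_label.setdefault(label, []).append(features(text))
    return {label: centroid(vecs) for label, vecs in by_label.items()}


def classify(model, text):
    """Assign the label whose centroid is nearest in Euclidean distance."""
    vector = features(text)
    return min(model, key=lambda label: math.dist(vector, model[label]))


# Invented toy examples, for illustration only (not real corpus data).
TOY_TRAINING = [
    ("moreover the committee noted that the report of the council was adopted",
     "translated"),
    ("thus the proposal of the commission is however of the greatest importance",
     "translated"),
    ("we walked down to the river and sat in the sun", "original"),
    ("she said to me that dinner was ready and waiting", "original"),
]

model = train(TOY_TRAINING)
print(classify(model, "moreover the decision of the council is thus of the utmost importance"))
print(classify(model, "we went to the shop and waited in the rain"))
```

Under these assumptions, the same pipeline could be retrained with source-language labels instead of a binary label to probe source-language-specific effects. A realistic experiment would need a large corpus, a much richer feature set, and held-out evaluation.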