Automatic translation error analysis

Authors:
Mark Fishel;Ondřej Bojar;Daniel Zeman;Jan Berka
Affiliations:
Department of Computer Science, University of Tartu, Estonia;Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics, Charles University in Prague, Czechia;Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics, Charles University in Prague, Czechia;Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics, Charles University in Prague, Czechia
Venue:
TSD'11 Proceedings of the 14th international conference on Text, speech and dialogue
Year:
2011

Citing 12

HMM-based word alignment in statistical translation
COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 2

Word to word alignment strategies
COLING '04 Proceedings of the 20th international conference on Computational Linguistics

Indirect-HMM-based hypothesis alignment for combining outputs from machine translation systems
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing

Word error rates: decomposition over POS classes and applications for error analysis
StatMT '07 Proceedings of the Second Workshop on Statistical Machine Translation

Findings of the 2009 workshop on statistical machine translation
StatMT '09 Proceedings of the Fourth Workshop on Statistical Machine Translation

Morpho-syntactic information for automatic error analysis of statistical machine translation output
StatMT '06 Proceedings of the Workshop on Statistical Machine Translation

Metrics for MT evaluation: evaluating reordering
Machine Translation

Extending the METEOR machine translation evaluation metric to the phrase level
HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

Discriminative modeling of extraction sets for machine translation
ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

Findings of the 2010 Joint Workshop on Statistical Machine Translation and Metrics for Machine Translation
WMT '10 Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR

Overcoming statistical machine translation limitations: error analysis and proposed solutions for the Catalan-Spanish language pair
Language Resources and Evaluation

System Combination for Machine Translation of Spoken and Written Language
IEEE Transactions on Audio, Speech, and Language Processing

Cited 1

A graphical interface for MT evaluation and error analysis
ACL '12 Proceedings of the ACL 2012 System Demonstrations

Abstract

We propose a method for automatically identifying various error types in machine translation output. The approach is based mostly on monolingual word alignment between the hypothesis and the reference translation. In addition to common lexical errors, misplaced words are also detected. We present a comparison with manually classified MT errors. Our error classification is inspired by that of Vilar (2006; [17]), although distinguishing some of their categories is beyond the reach of the current version of our system.
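
To make the idea of alignment-based error flagging concrete, here is a minimal Python sketch. It is not the authors' system: the abstract does not describe the aligner, so this sketch swaps in exact-match alignment via the standard library's `difflib.SequenceMatcher`, and the function name `classify_errors` and the labels `misplaced`, `extra`, and `missing` are hypothetical names chosen for illustration. Hypothesis words left unaligned that still occur among the unaligned reference words are treated as misplaced; the remaining unaligned words on either side are treated as lexical errors.

```python
from difflib import SequenceMatcher


def classify_errors(hypothesis, reference):
    """Toy error flags from an exact-match monolingual alignment.

    Illustrative assumption only: a real system would use a proper
    word aligner rather than difflib's matching blocks.
    """
    hyp = hypothesis.split()
    ref = reference.split()

    # Step 1: align identical words that appear in the same relative order.
    matcher = SequenceMatcher(a=hyp, b=ref, autojunk=False)
    hyp_aligned, ref_aligned = set(), set()
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            hyp_aligned.add(block.a + k)
            ref_aligned.add(block.b + k)

    # Reference words that the in-order alignment left uncovered.
    ref_left = [ref[j] for j in range(len(ref)) if j not in ref_aligned]

    # Step 2: an unaligned hypothesis word that still occurs in the
    # uncovered reference words is in the wrong position (misplaced);
    # otherwise it has no counterpart in the reference (extra).
    misplaced, extra = [], []
    for i, word in enumerate(hyp):
        if i in hyp_aligned:
            continue
        if word in ref_left:
            ref_left.remove(word)
            misplaced.append(word)
        else:
            extra.append(word)

    # Whatever remains in the reference was never produced (missing).
    return {"misplaced": misplaced, "extra": extra, "missing": ref_left}


if __name__ == "__main__":
    print(classify_errors("the cat black sat on mat",
                          "the black cat sat on the mat"))
```

For the example pair in the `__main__` block, the sketch reports `black` as misplaced and `the` as missing. Exact string matching cannot relate inflected forms or synonyms, which is one reason a real aligner would be needed in practice.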