Machine translation systems are not reliable enough to be used "as is": except for the simplest tasks, their output can only serve to grasp the general meaning of a text or to assist human translators. The purpose of confidence measures is to detect erroneous words or sentences produced by a machine translation system. In this article, after reviewing the mathematical foundations of confidence estimation, we compare several state-of-the-art confidence measures, predictive parameters, and classifiers. We also propose two original confidence measures based on mutual information, as well as a method for automatically generating data to train and test classifiers. Applying these techniques to data from the WMT 2008 campaign, we found that the best individual confidence measures yielded an Equal Error Rate (EER) of 36.3% at the word level and 34.2% at the sentence level; combining different measures reduced these rates to 35.0% and 29.0%, respectively. We also present the results of an experiment aimed at determining how helpful confidence measures are in a post-editing task. Preliminary results suggest that our system is not yet ready to help post-editors efficiently, but we now have both the software and a protocol to apply in further experiments, and user feedback has highlighted aspects that must be improved to make confidence measures more helpful.
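To make the reported metric concrete, the sketch below shows one common way to compute the Equal Error Rate for a binary word-confidence classifier: sweep a decision threshold over the confidence scores and locate the point where the false-acceptance rate (an erroneous word accepted as correct) equals the false-rejection rate (a correct word flagged as erroneous). The scores, labels, and threshold sweep here are illustrative assumptions, not the paper's actual data or implementation.

```python
def equal_error_rate(scores, labels):
    """Return the EER of a confidence classifier.

    scores: confidence scores (higher = more likely correct)
    labels: 1 if the word is actually correct, 0 if erroneous
    """
    neg = labels.count(0)  # number of erroneous words
    pos = labels.count(1)  # number of correct words
    best_gap, eer = float("inf"), 1.0
    # Sweep candidate thresholds over the observed score values.
    for t in sorted(set(scores)):
        # False acceptance: erroneous word accepted as correct (score >= t).
        fa = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= t)
        # False rejection: correct word rejected as erroneous (score < t).
        fr = sum(1 for s, y in zip(scores, labels) if y == 1 and s < t)
        far, frr = fa / neg, fr / pos
        # The EER is where the two rates cross; keep the closest point.
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer

# A perfectly separable toy example: correct words score high,
# erroneous words score low, so the EER is 0.
print(equal_error_rate([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0]))  # → 0.0
```

With real, overlapping score distributions the two rates never cross exactly, which is why the sketch returns the average of the two rates at the closest crossing point.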