Word-Level Confidence Estimation for Machine Translation

Authors:
Nicola Ueffing;Hermann Ney
Venue:
Computational Linguistics
Year:
2007

Abstract

This article introduces and evaluates several different word-level confidence measures for machine translation. These measures provide a method for labeling each word in an automatically generated translation as correct or incorrect. All approaches to confidence estimation presented here are based on word posterior probabilities. Different concepts of word posterior probabilities as well as different ways of calculating them will be introduced and compared. They can be divided into two categories: system-based methods that explore knowledge provided by the translation system that generated the translations, and direct methods that are independent of the translation system. The system-based techniques make use of system output, such as word graphs or N-best lists. The word posterior probability is determined by summing the probabilities of the sentences in the translation hypothesis space that contain the target word. The direct confidence measures take other knowledge sources, such as word or phrase lexica, into account. They can be applied to output from nonstatistical machine translation systems as well. Experimental assessment of the different confidence measures on various translation tasks and in several language pairs will be presented. Moreover, the application of confidence measures for rescoring of translation hypotheses will be investigated.
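
As a rough illustration of the system-based idea in the abstract, the sketch below computes a word posterior probability from an N-best list by summing the normalized probabilities of the hypotheses that contain the word, then labels each word of the top hypothesis by thresholding. The function names, the toy N-best list with its scores, the threshold value, and the position-independent notion of "contains" are all assumptions made for illustration. The article compares several concepts of word posterior probability, and this sketch is not claimed to reproduce any one of them exactly.

```python
import math
from collections import defaultdict


def word_posteriors(nbest):
    """Estimate word posteriors from an N-best list.

    nbest: list of (tokens, log_prob) pairs (hypothetical input format).
    Returns a dict mapping each word to the summed, normalized probability
    of all hypotheses that contain it (regardless of position).
    """
    if not nbest:
        return {}
    # Normalize hypothesis scores over the N-best list; subtracting the
    # maximum log-probability keeps the exponentials numerically stable.
    max_lp = max(lp for _, lp in nbest)
    weights = [math.exp(lp - max_lp) for _, lp in nbest]
    total = sum(weights)

    posteriors = defaultdict(float)
    for (tokens, _), weight in zip(nbest, weights):
        # Count each word once per hypothesis, even if it occurs repeatedly.
        for word in set(tokens):
            posteriors[word] += weight / total
    return dict(posteriors)


def label_words(tokens, posteriors, threshold=0.5):
    """Tag each word as correct or incorrect by thresholding its posterior."""
    return [
        (w, "correct" if posteriors.get(w, 0.0) >= threshold else "incorrect")
        for w in tokens
    ]


# Toy N-best list with made-up probabilities (illustrative only).
nbest = [
    ("the house is small".split(), math.log(0.5)),
    ("the house is little".split(), math.log(0.3)),
    ("a house is small".split(), math.log(0.2)),
]
posteriors = word_posteriors(nbest)
labels = label_words(nbest[0][0], posteriors, threshold=0.75)
```

In this toy list, "house" appears in every hypothesis and so receives a posterior of 1.0, while "small" appears only in hypotheses whose normalized probabilities sum to about 0.7 and therefore falls below the assumed threshold of 0.75. In practice the threshold would be tuned on held-out data rather than fixed by hand.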