The adoption of statistical machine translation (SMT) systems in the professional translation industry is still limited by the unreliability of SMT output, whose quality varies widely. It would therefore be valuable for MT systems to assess their own translations with automatically derived quality measures. Predicting such quality measures was the goal of a shared task at the 2012 Workshop on Statistical Machine Translation. In this contribution, we first report our results for this shared task, detailing the features that we found most predictive of quality. In the second part, we reexamine the shared task data and protocol, show that several factors contributed to the difficulty of the task, and discuss alternative evaluation designs.