Dimensionality reduction methods for machine translation quality estimation

Authors:
Jesús González-Rubio;J. Ramón Navarro-Cerdán;Francisco Casacuberta
Affiliations:
Universitat Politècnica de València, Valencia, Spain;Instituto Tecnológico de Informática, Valencia, Spain;Universitat Politècnica de València, Valencia, Spain
Venue:
Machine Translation
Year:
2013

Abstract

Quality estimation (QE) for machine translation is usually addressed as a regression problem in which a learning model predicts a quality score from a (usually highly redundant) set of features that represent the translation. This redundancy hinders model learning and thus penalizes the performance of quality estimation systems. We propose different dimensionality reduction methods based on partial least squares regression to overcome this problem, and compare them against several reduction methods previously used in the QE literature. Moreover, we study how the use of such methods influences the performance of different learning models. Experiments carried out on the English-Spanish WMT12 QE task showed that it is possible to improve prediction accuracy while significantly reducing the size of the feature sets.
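To make the idea of reducing a redundant feature set with partial least squares more concrete, the following is a minimal sketch of textbook single-response PLS (PLS1, NIPALS-style deflation) in plain Python. It is not the authors' method or code: the abstract does not specify the reduction procedures, and the toy data, the function names (`pls1_fit`, `pls1_transform`), and the choice of two components are all assumptions made for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def center_columns(rows):
    """Subtract the column means; return the centered rows and the means."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows], means

def pls1_fit(X, y, n_components):
    """Textbook PLS1: extract latent components that covary with y."""
    Xc, x_means = center_columns(X)
    y_mean = sum(y) / len(y)
    yc = [v - y_mean for v in y]
    W, P, T = [], [], []
    for _ in range(n_components):
        # Weight vector: direction in feature space of maximal covariance with y.
        w = [dot(col, yc) for col in zip(*Xc)]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:
            break  # nothing left in X that covaries with y
        w = [v / norm for v in w]
        t = [dot(row, w) for row in Xc]             # scores: the reduced feature
        tt = dot(t, t)
        if tt < 1e-12:
            break
        p = [dot(col, t) / tt for col in zip(*Xc)]  # X loadings
        q = dot(yc, t) / tt                         # y loading
        # Deflate: remove what this component already explains.
        Xc = [[x - ti * pj for x, pj in zip(row, p)] for row, ti in zip(Xc, t)]
        yc = [v - q * ti for v, ti in zip(yc, t)]
        W.append(w)
        P.append(p)
        T.append(t)
    return {"x_means": x_means, "W": W, "P": P, "T": T}

def pls1_transform(model, row):
    """Project one feature vector onto the learned components."""
    x = [v - m for v, m in zip(row, model["x_means"])]
    scores = []
    for w, p in zip(model["W"], model["P"]):
        t = dot(x, w)
        scores.append(t)
        x = [v - t * pj for v, pj in zip(x, p)]
    return scores

# Hypothetical toy data: six redundant features built from two latent factors.
a = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
b = [1.0, -1.0, 2.0, -2.0, 3.0, -3.0, 4.0, -4.0]
X = [[ai, 2 * ai, ai + bi, bi, ai - bi, 3 * bi] for ai, bi in zip(a, b)]
y = [2 * ai + bi for ai, bi in zip(a, b)]  # stand-in for a quality score

model = pls1_fit(X, y, n_components=2)
reduced = [pls1_transform(model, row) for row in X]  # 8 samples x 2 features
```

In this sketch the six original features collapse to two scores per sample, and those scores could then be passed to any regression model in place of the full feature vector. Choosing the number of components would normally rely on held-out data, which is omitted here.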