Quality estimation for machine translation output using linguistic analysis and decoding features

  • Authors: Eleftherios Avramidis

  • Affiliations: German Research Center for Artificial Intelligence (DFKI), Berlin, Germany

  • Venue: WMT '12 Proceedings of the Seventh Workshop on Statistical Machine Translation
  • Year: 2012

Abstract

We describe a submission to the WMT12 Quality Estimation task, including extensive Machine Learning experiments. The data were augmented with features from linguistic analysis and with statistical features from the SMT search graph. Several Feature Selection algorithms were employed. The Quality Estimation problem was addressed both as a regression task and as a discretised classification task, but the latter did not generalise well to the unseen test set. The most successful regression methods achieved an RMSE of 0.86 and were trained on a feature set selected by Correlation-based Feature Selection. We also observed indications that RMSE is not always sufficient for measuring performance.
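
For readers unfamiliar with the metric, the RMSE (root mean squared error) reported above can be computed as in the minimal sketch below. The function name rmse and the score values are hypothetical and chosen only for illustration; they are not data or code from the submission.

```python
import math


def rmse(predicted, reference):
    """Root mean squared error between two equal-length score lists."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    # Square each per-sentence error, average, then take the square root.
    squared_errors = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical quality scores (illustrative values, not from the paper).
reference_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted_scores = [1.5, 2.5, 2.5, 3.0, 4.5]

print(round(rmse(predicted_scores, reference_scores), 3))
```

Because the errors are squared before averaging, a few large deviations weigh more heavily on the result than many small ones.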