Investigating the contribution of linguistic information to quality estimation

  • Authors:
  • Mariano Felice; Lucia Specia

  • Affiliations:
  • Computer Laboratory, University of Cambridge, Cambridge, UK CB3 0FD; Department of Computer Science, University of Sheffield, Sheffield, UK S1 4DP

  • Venue:
  • Machine Translation
  • Year:
  • 2013

Abstract

This paper describes a study on the contribution of linguistically-informed features to the task of quality estimation for machine translation at sentence level. A standard regression algorithm is used to build models using a combination of linguistic and non-linguistic features extracted from the input text and its machine translation. Experiments with three English-Spanish translation datasets show that linguistic features on their own are not able to outperform shallower features based on statistics from the input text, its translation and additional corpora. However, further analysis suggests that linguistic information can be useful to produce better results if carefully combined with other features. An in-depth analysis of the results highlights a number of issues related to the use of linguistic features.