Measuring user acceptability of machine translations to diagnose system errors: an experience report

  • Authors:
  • Bowen Hui

  • Affiliations:
  • University of Toronto, Canada

  • Venue:
  • COLING-MTIA '02 Proceedings of the 2002 COLING workshop on Machine translation in Asia - Volume 16
  • Year:
  • 2002


Abstract

Conventional ways of measuring machine translation quality compare the accuracy of system output without clearly specifying what "accuracy" entails. Many current evaluation methods also demand too much time from expert human evaluators. Moreover, these methods give no direct feedback on user acceptability of the system and do not hint at areas of focus for researchers or developers. In this work, we explore an output inspection method that measures user acceptance and probes system errors, so that developers and researchers can walk away knowing what was acceptable and what to improve on. We describe the evaluation framework for machine translation and present experimental results for two systems. The results of the experiments are very encouraging. We conclude with a discussion of the translation quality factors that matter most to users, a pilot study applying this evaluation in the text summarization domain, and ideas on how to use the gathered data to create user profiles.