Evaluation of 2-way Iraqi Arabic–English speech translation systems using automated metrics

  • Authors:
  • Sherri Condon; Mark Arehart; Dan Parvaz; Gregory Sanders; Christy Doran; John Aberdeen

  • Affiliations:
  • The MITRE Corporation, McLean, USA; The MITRE Corporation, McLean, USA; The MITRE Corporation, Orlando, USA; National Institute of Standards and Technology, Gaithersburg, USA; The MITRE Corporation, Bedford, USA; The MITRE Corporation, Bedford, USA

  • Venue:
  • Machine Translation
  • Year:
  • 2012

Abstract

The Defense Advanced Research Projects Agency (DARPA) Spoken Language Communication and Translation System for Tactical Use (TRANSTAC) program (http://1.usa.gov/transtac) faced many challenges in applying automated measures of translation quality to Iraqi Arabic–English speech translation dialogues. Features of speech data in general, and of Iraqi Arabic data in particular, undermine basic assumptions of automated measures that depend on matching system outputs to reference translations. These features are described along with the challenges they present for evaluating machine translation quality using automated metrics. We show that scores for translation into Iraqi Arabic exhibit higher correlations with human judgments when they are computed from normalized system outputs and reference translations. Orthographic normalization, lexical normalization, and operations involving light stemming all contributed to these improved correlations.
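The abstract names orthographic normalization and light stemming as operations applied to Arabic text before metric scoring. A minimal Python sketch of what such operations typically look like is given below; the specific character mappings and affix lists are illustrative assumptions drawn from common Arabic NLP practice, not the exact rules used in the TRANSTAC evaluation.

```python
import re

# Tashkeel (short-vowel diacritics), superscript alef, and tatweel.
# These are standard Unicode ranges; treating them all as removable is
# an assumption, not the paper's exact normalization set.
DIACRITICS = re.compile('[\u064B-\u0652\u0670\u0640]')

def normalize(text):
    """Orthographic normalization: strip diacritics/tatweel and collapse
    commonly conflated letter variants to a single form."""
    text = DIACRITICS.sub('', text)
    # alef with hamza/madda/wasla -> bare alef
    text = re.sub('[\u0622\u0623\u0625\u0671]', '\u0627', text)
    text = text.replace('\u0649', '\u064A')  # alef maqsura -> ya
    text = text.replace('\u0629', '\u0647')  # ta marbuta -> ha
    return text

# Illustrative affix lists: al-, wa-, fa-, bi-, li- prefixes;
# -ha, -at, -un, -in, -hu suffixes.
PREFIXES = ['\u0627\u0644', '\u0648', '\u0641', '\u0628', '\u0644']
SUFFIXES = ['\u0647\u0627', '\u0627\u062A', '\u0648\u0646',
            '\u064A\u0646', '\u0647']

def light_stem(word):
    """Light stemming: peel at most one prefix and one suffix,
    keeping a minimum stem length of three letters."""
    for p in PREFIXES:
        if word.startswith(p) and len(word) - len(p) >= 3:
            word = word[len(p):]
            break
    for s in SUFFIXES:
        if word.endswith(s) and len(word) - len(s) >= 3:
            word = word[:-len(s)]
            break
    return word
```

In an evaluation pipeline of this kind, both the system output and the reference translations would be passed through the same normalization before computing a matching-based metric, so that purely orthographic variation does not count as a translation error.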