The Spoken Language Communication and Translation System for Tactical Use (TRANSTAC) program is a Defense Advanced Research Projects Agency (DARPA) advanced technology research and development program. Its goal is to demonstrate capabilities to rapidly develop and field free-form, two-way translation systems that enable speakers of different languages to communicate with one another in real-world tactical situations without an interpreter. The National Institute of Standards and Technology (NIST), with support from MITRE and Appen Pty Ltd., has been funded to serve as the Independent Evaluation Team (IET) for the TRANSTAC program. The IET is responsible for assessing the performance of the TRANSTAC systems by designing and executing multiple evaluations and analyzing their results. To accomplish this, NIST has applied the SCORE (System, Component, and Operationally Relevant Evaluations) framework. SCORE is a unified set of criteria and software tools for defining a performance evaluation approach for complex intelligent systems. It provides a comprehensive evaluation blueprint that assesses the technical performance of a system and its components by isolating variables, and that captures the end-user utility of the system in realistic use-case environments. This document describes the TRANSTAC program and explains how the SCORE framework was applied to assess the technical and utility performance of the TRANSTAC systems.