Subject-based evaluation measures for interactive spoken language systems

  • Authors:
  • Patti Price (SRI International, Menlo Park, CA); Lynette Hirschman (MIT Laboratory for Computer Science, Cambridge, MA); Elizabeth Shriberg (University of California at Berkeley, Berkeley, CA); Elizabeth Wade (Stanford University, Stanford, CA)

  • Venue:
  • HLT '91: Proceedings of the Workshop on Speech and Natural Language
  • Year:
  • 1992

Abstract

The DARPA Spoken Language effort has profited greatly from its emphasis on tasks and common evaluation metrics. Common, standardized evaluation procedures have helped the community to focus research effort, to measure progress, and to encourage communication among participating sites. The task and the evaluation metrics, however, must be consistent with the goals of the Spoken Language program, namely interactive problem solving. Our evaluation methods have evolved with the technology, moving from evaluation of read speech from a fixed corpus, through evaluation of isolated canned sentences, to evaluation of spontaneous speech in context within a canned corpus. A key component missing from current evaluations is the role of the subject's interaction with the system. Because of the great variability across subjects, however, it is necessary to use either a large number of subjects or a within-subject design. This paper proposes a within-subject design and compares results from a software-sharing exercise carried out jointly by MIT and SRI.
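The statistical point behind the abstract's last claim can be illustrated numerically. The sketch below is not from the paper: all numbers are synthetic assumptions chosen to mimic large subject-to-subject variability relative to a modest difference between two hypothetical systems, A and B. It contrasts an unpaired (between-subject) comparison with a paired (within-subject) one at the same sample size.

    # Illustrative sketch only -- not from the paper. All numbers are
    # synthetic assumptions: subject baselines vary widely (sd = 60 s)
    # relative to an assumed 15 s advantage for system B.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n = 12          # subjects per condition
    effect = 15.0   # assumed true advantage of system B (seconds)

    # Between-subject design: two independent subject pools, one per system.
    # The baseline variance swamps the 15 s effect.
    group_a = rng.normal(300.0, 60.0, n) + rng.normal(0.0, 10.0, n)
    group_b = rng.normal(300.0, 60.0, n) - effect + rng.normal(0.0, 10.0, n)
    _, p_between = stats.ttest_ind(group_a, group_b)

    # Within-subject design: one pool of subjects uses both systems,
    # so each subject's idiosyncratic baseline appears in both scores.
    shared = rng.normal(300.0, 60.0, n)
    time_a = shared + rng.normal(0.0, 10.0, n)
    time_b = shared - effect + rng.normal(0.0, 10.0, n)
    _, p_within = stats.ttest_rel(time_a, time_b)

    print(f"between-subject p = {p_between:.3f}")
    print(f"within-subject  p = {p_within:.4f}")

Because each subject serves as their own control in the paired design, the idiosyncratic baseline cancels out of the paired differences, so the within-subject comparison can detect an effect at a sample size where the between-subject comparison cannot. This is why the paper argues that evaluating subject interaction requires either many subjects or a within-subject design.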