The DARPA Spoken Language effort has profited greatly from its emphasis on tasks and common evaluation metrics. Common, standardized evaluation procedures have helped the community focus its research effort, measure progress, and encourage communication among participating sites. The task and the evaluation metrics, however, must be consistent with the goals of the Spoken Language program, namely interactive problem solving. Our evaluation methods have evolved with the technology, moving from evaluation of read speech from a fixed corpus, through evaluation of isolated canned sentences, to evaluation of spontaneous speech in context within a canned corpus. A key component missing from current evaluations is the role of the subject's interaction with the system. Because of the great variability across subjects, however, such evaluation requires either a large number of subjects or a within-subject design. This paper proposes a within-subject design that compares the results of a software-sharing exercise carried out jointly by MIT and SRI.
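To make the statistical point concrete, the following is a minimal sketch of a within-subject (paired) comparison of two systems. Because each subject serves as his or her own control, stable subject-level differences cancel out of the paired comparison, which is why the design needs far fewer subjects than a between-subject study. The metric (task-completion time in seconds), the subject identifiers, and all numbers here are hypothetical illustrations, not data from the MIT/SRI exercise.

import math
from statistics import mean, stdev

# Hypothetical per-subject results: each subject solves comparable
# travel-planning scenarios on both systems (system_a, system_b).
completion_time = {
    "s01": (312.0, 280.0),
    "s02": (455.0, 410.0),
    "s03": (298.0, 301.0),
    "s04": (390.0, 352.0),
}

# Per-subject paired differences: between-subject variability drops out
# because both measurements come from the same subject.
diffs = [a - b for a, b in completion_time.values()]
n = len(diffs)

# Paired t statistic: mean difference divided by its standard error.
t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
print(f"mean paired difference: {mean(diffs):.1f} s, t({n - 1}) = {t:.2f}")

A between-subject design would instead compare group means against the full cross-subject spread, so detecting the same effect would require many more subjects.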