The design of voice-driven interfaces
HLT '89 Proceedings of the workshop on Speech and Natural Language
To study the spoken language interface in the context of a complex problem-solving task, a group of users was asked to perform a spreadsheet task, alternating between voice and keyboard input. Each participant performed a total of 40 tasks: the first thirty in one block (spread over several days) and the remaining ten a month later. The voice spreadsheet program used in this study was extensively instrumented to provide detailed information about the components of the interaction. These data, together with an analysis of the participants' utterances and the recognizer output, provide a fairly detailed picture of spoken language interaction. Although task completion by voice took longer than by keyboard, analysis shows that users would be able to perform the spreadsheet task faster by voice if two key criteria were met: recognition occurs in real time, and the error rate is sufficiently low. This initial experience with a spoken language system also allows us to identify several metrics, beyond those traditionally associated with speech recognition, that can be used to characterize system performance.
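The trade-off in the conclusion above can be illustrated with a toy cost model (a sketch only; all function names, timings, and rates here are hypothetical illustrations, not measurements from the study): the expected time per voice command grows with recognition latency and with the retry cost incurred on recognition errors, so voice input only wins over the keyboard once both are small enough.

```python
def expected_voice_time(utterance_s, latency_s, error_rate, correction_s):
    """Expected wall-clock seconds per voice command (toy model).

    Each attempt costs speaking time plus recognition latency; with
    probability `error_rate` the attempt fails and the user must issue a
    correction and retry. Independent geometric retries give an expected
    attempt count of 1 / (1 - error_rate).
    """
    attempts = 1.0 / (1.0 - error_rate)
    # Every attempt pays (speak + recognize); every failed attempt also
    # pays the correction overhead.
    return attempts * (utterance_s + latency_s) + (attempts - 1.0) * correction_s

# Hypothetical comparison: 2 s to speak a command vs 4 s to type it.
keyboard_s = 4.0
slow = expected_voice_time(utterance_s=2.0, latency_s=3.0,
                           error_rate=0.25, correction_s=2.0)
fast = expected_voice_time(utterance_s=2.0, latency_s=0.0,
                           error_rate=0.05, correction_s=2.0)
print(slow > keyboard_s)  # slow, error-prone recognition loses to the keyboard
print(fast < keyboard_s)  # real-time, low-error recognition wins
```

Under these made-up numbers the model reproduces the abstract's qualitative finding: high latency and a high error rate make voice slower overall, while near-zero latency and a low error rate make it faster than typing.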