Relations between de-facto criteria in the evaluation of a spoken dialogue system

Authors:
Zoraida Callejas;Ramón López-Cózar
Affiliations:
Department of Languages and Computer Systems, Faculty of Computer Science and Telecommunications, University of Granada, C/ Pdta. Daniel Saucedo Aranda s/n, 18071 Granada, Spain;Department of Languages and Computer Systems, Faculty of Computer Science and Telecommunications, University of Granada, C/ Pdta. Daniel Saucedo Aranda s/n, 18071 Granada, Spain
Venue:
Speech Communication
Year:
2008

Abstract

Spoken dialogue systems have traditionally been evaluated in terms of instrumentally or expert-derived measures (usually called "objective" evaluation) and the quality judgments of users who have previously interacted with the system (also called "subjective" evaluation). Several research efforts have sought to extract relationships between these evaluation criteria. In this paper we report empirical results obtained from statistical studies carried out on interactions of real users with our spoken dialogue system. Studies of this kind have rarely been exploited in the literature. Our results show that they can indicate important relationships between criteria, which can serve as guidelines for refining the systems under evaluation and can contribute to state-of-the-art knowledge about how quantitative aspects of the systems affect users' perceptions of them.