We examine what purpose a dialog metric serves and then propose empirical methods for evaluating dialog systems that meet that purpose. The methods include a protocol for conducting a wizard-of-oz experiment, together with a basic set of descriptive statistics for substantiating performance claims, treating the data collected from the experiment as an ideal benchmark, or "gold standard," for comparative judgments. The methods also provide a practical means of optimizing a system through component analysis and cost valuation.
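To make the comparative-judgment step concrete, here is a minimal sketch in Python of how descriptive statistics might be computed for a candidate system against wizard-of-oz sessions treated as the gold standard. The session records, the `(success, turn_count)` layout, and the `describe` helper are hypothetical illustrations, not artifacts of the paper's protocol.

```python
from statistics import mean, stdev

# Hypothetical session records: each dialog yields a task-success flag
# and a turn count. Wizard-of-oz sessions serve as the gold standard.
wizard_sessions = [(True, 8), (True, 10), (True, 7), (False, 12), (True, 9)]
system_sessions = [(True, 11), (False, 15), (True, 12), (False, 14), (True, 10)]

def describe(sessions, label):
    """Print success rate and turn-count statistics for a set of dialogs."""
    successes = [s for s, _ in sessions]
    turns = [t for _, t in sessions]
    print(f"{label}: success rate = {mean(successes):.2f}, "
          f"turns = {mean(turns):.1f} +/- {stdev(turns):.1f}")

describe(wizard_sessions, "wizard (gold standard)")
describe(system_sessions, "system under test")

# The gap between the two summaries indicates how far the system falls
# short of ideal (human-wizard) performance on the same task.
```

The same comparison extends naturally to per-component error counts and their associated costs, in the spirit of the component analysis and cost valuation described above.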