In this paper, we compare different approaches for predicting the quality and usability of spoken dialogue systems. The respective models estimate user judgments of perceived quality from parameters that can be extracted from interaction logs. Different types of input parameters and different modeling algorithms have been compared using three spoken dialogue databases obtained with two different systems. The results show that both linear regression models and classification trees are able to cover around 50% of the variance in the training data, and neural networks even more. When applied to independent test data, in particular to data obtained with different systems and/or user groups, the prediction accuracy decreases significantly. The underlying reasons for this limited predictive power are discussed. It is shown that, although an accurate prediction of individual ratings is not yet possible with such models, they may still be used for making decisions on component optimization, and are thus helpful tools for the system developer.
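The evaluation setup described above can be sketched in miniature: fit a model on logged interaction parameters paired with user ratings, then measure how much of the rating variance it covers on the training data versus on independent data. The following is a minimal illustration with a single invented parameter (dialogue length in turns) and invented ratings; the actual models in the paper use many interaction parameters and several learning algorithms.

```python
# Minimal sketch with invented numbers: predict a user quality rating
# (1-5 scale) from one interaction parameter via ordinary least squares,
# then compare variance coverage (R^2) on training vs. independent data.

def fit_linear(xs, ys):
    """Ordinary least-squares fit for a single predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

def r_squared(xs, ys, slope, intercept):
    """Fraction of the variance in ys covered by the model."""
    my = sum(ys) / len(ys)
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - my) ** 2 for y in ys)
    return 1.0 - sse / sst

# Hypothetical training data: number of dialogue turns vs. user rating.
train_x = [5, 10, 15, 20, 25, 30]
train_y = [4.6, 4.1, 3.2, 3.1, 2.4, 2.1]

# Hypothetical held-out data from a different system / user group.
test_x = [8, 16, 24]
test_y = [3.5, 3.4, 2.0]

slope, intercept = fit_linear(train_x, train_y)
r2_train = r_squared(train_x, train_y, slope, intercept)
r2_test = r_squared(test_x, test_y, slope, intercept)

print(f"R^2 on training data:    {r2_train:.2f}")
print(f"R^2 on independent data: {r2_test:.2f}")
# Mirrors the paper's finding: variance coverage drops markedly when the
# model is applied to data from a different system or user group.
```

The drop from `r2_train` to `r2_test` in this toy setup illustrates the generalization problem the abstract points to: a model tuned to one system's interaction logs covers far less variance on judgments collected elsewhere.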