An evaluation understudy for dialogue coherence models

Authors:
Sudeep Gandhe;David Traum
Affiliations:
University of Southern California, Marina del Rey, CA;University of Southern California, Marina del Rey, CA
Venue:
SIGdial '08 Proceedings of the 9th SIGdial Workshop on Discourse and Dialogue
Year:
2008

Citing 14

ELIZA—a computer program for the study of natural language communication between man and machine. Communications of the ACM
Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition
Towards a tool for the Subjective Assessment of Speech System Interfaces (SASSI). Natural Language Engineering
Towards developing general models of usability with PARADISE. Natural Language Engineering
Discourse obligations in dialogue processing. ACL '94 Proceedings of the 32nd annual meeting on Association for Computational Linguistics
Probabilistic text structuring: experiments with sentence ordering. ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
Improving question-answering with linking dialogues. Proceedings of the 11th international conference on Intelligent user interfaces
Modeling local coherence: an entity-based approach. ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Partially observable Markov decision processes for spoken dialog systems. Computer Speech and Language
Automatic Evaluation of Information Ordering: Kendall's Tau. Computational Linguistics
Discourse generation using utility-trained coherence models. COLING-ACL '06 Proceedings of the COLING/ACL on Main conference poster sessions
Inter-coder agreement for computational linguistics. Computational Linguistics
Agenda-based user simulation for bootstrapping a POMDP dialogue system. NAACL-Short '07 Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers
Inferring strategies for sentence ordering in multidocument news summarization. Journal of Artificial Intelligence Research

Cited 3

ADiEU: Toward Domain-Based Evaluation of Spoken Dialog Systems. Proceedings of the 13th International Conference on Human-Computer Interaction. Part I: New Trends
I've said it before, and I'll say it again: an empirical investigation of the upper bound of the selection approach to dialogue. SIGDIAL '10 Proceedings of the 11th Annual Meeting of the Special Interest Group on Discourse and Dialogue
Assessing user simulation for dialog systems using human judges and automatic evaluation measures. Natural Language Engineering

Abstract

Evaluating a dialogue system is seen as a major challenge within the dialogue research community. Due to the very nature of the task, most evaluation methods need a substantial amount of human involvement. Following the tradition in machine translation, summarization and discourse coherence modeling, we introduce the idea of an evaluation understudy for dialogue coherence models. Following Lapata (2006), we use the information ordering task as a testbed for evaluating dialogue coherence models. This paper reports findings about the reliability of the information ordering task as applied to dialogues. We find that simple n-gram co-occurrence statistics similar in spirit to BLEU (Papineni et al., 2001) correlate very well with human judgments for dialogue coherence.
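
As a rough illustration of the kind of automatic measure the abstract refers to, the sketch below scores a reordered dialogue against its original turn order in two ways: the fraction of shared n-grams of turn indices, and Kendall's tau. The function names and the toy ordering are assumptions made for this example; the abstract does not specify the exact statistic, so this should not be read as the authors' measure.

```python
from itertools import combinations


def ngrams(seq, n):
    """Return the contiguous n-grams (as tuples) found in seq."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]


def ngram_overlap(candidate, reference, n=2):
    """Fraction of candidate n-grams that also occur in the reference order.

    Hypothetical helper: a simple co-occurrence precision over turn indices.
    """
    cand = ngrams(candidate, n)
    if not cand:
        return 0.0
    ref = set(ngrams(reference, n))
    return sum(1 for g in cand if g in ref) / len(cand)


def kendall_tau(candidate, reference):
    """Kendall's tau between two orderings of the same distinct items."""
    rank = {item: i for i, item in enumerate(reference)}
    ranks = [rank[item] for item in candidate]
    pairs = list(combinations(range(len(ranks)), 2))
    if not pairs:
        return 1.0
    # A pair is discordant when the candidate reverses the reference order.
    discordant = sum(1 for i, j in pairs if ranks[i] > ranks[j])
    return 1.0 - 2.0 * discordant / len(pairs)


# Toy example: turns are identified by their position in the original dialogue.
reference = [0, 1, 2, 3, 4]
candidate = [0, 1, 3, 2, 4]  # turns 2 and 3 swapped

print(ngram_overlap(candidate, reference, n=2))
print(kendall_tau(candidate, reference))
```

Scores of this kind can then be compared with human coherence ratings of the same reorderings, for example by computing a correlation coefficient over many permutations.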