Different measurement metrics to evaluate a chatbot system

  • Authors: Bayan Abu Shawar; Eric Atwell
  • Affiliations: Arab Open University; University of Leeds, Leeds, UK
  • Venue: NAACL-HLT-Dialog '07, Proceedings of the Workshop on Bridging the Gap: Academic and Industrial Research in Dialog Technologies
  • Year: 2007


Abstract

A chatbot is a software system that can interact or "chat" with a human user in a natural language such as English. For the annual Loebner Prize contest, rival chatbots have been assessed in terms of their ability to fool a judge in a restricted chat session. We are investigating methods to train and adapt a chatbot to a specific user's language use or application, via a user-supplied training corpus. We advocate open-ended trials by real users, illustrated by an Afrikaans chatbot for Afrikaans-speaking researchers and students in South Africa. This chatbot is evaluated in terms of "glass box" dialogue efficiency metrics, "black box" dialogue quality metrics, and user satisfaction feedback. The other examples presented in this paper are the Qur'an and FAQchat prototypes. Our general conclusion is that evaluation should be adapted to the application and to user needs.
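To make the "glass box" efficiency metrics concrete, below is a minimal Python sketch in the spirit of the abstract's description: it tallies how often each AIML-style matching type produced a bot response, and the average number of turns per dialogue. The log format, the match-type labels (atomic, first_word, most_significant, no_match), and the efficiency_metrics function are hypothetical illustrations, not the authors' actual tooling.

```python
# Hypothetical sketch of "glass box" dialogue efficiency metrics:
# given chat logs where each bot response is tagged with the matching
# type that produced it, report the share of each matching type and
# the average dialogue length in turns.
from collections import Counter

# Each dialogue is a list of (user_turn, bot_turn, match_type) triples;
# match_type is one of: "atomic", "first_word", "most_significant", "no_match".
dialogues = [
    [("hello", "Hi there!", "atomic"),
     ("what is AIML", "AIML is a markup language.", "first_word")],
    [("tell me about Leeds", "Leeds is a city in the UK.", "most_significant"),
     ("asdfgh", "I do not understand.", "no_match")],
]

def efficiency_metrics(dialogues):
    counts = Counter()
    turns = 0
    for dialogue in dialogues:
        turns += len(dialogue)
        for _, _, match_type in dialogue:
            counts[match_type] += 1
    total = sum(counts.values())
    shares = {m: counts[m] / total for m in counts}
    avg_turns = turns / len(dialogues)
    return shares, avg_turns

shares, avg_turns = efficiency_metrics(dialogues)
print(f"average turns per dialogue: {avg_turns:.1f}")
for match_type, share in sorted(shares.items()):
    print(f"{match_type}: {share:.0%}")
```

Such counts are "glass box" metrics because they inspect the system's internal matching behaviour; "black box" quality metrics and user satisfaction feedback, by contrast, judge only the visible responses.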