On the Use of Kappa Coefficients to Measure the Reliability of the Annotation of Non-acted Emotions

Authors:
Zoraida Callejas;Ramón López-Cózar
Affiliations:
Dept. of Languages and Computer Systems, Granada, Spain 18071; Dept. of Languages and Computer Systems, Granada, Spain 18071
Venue:
PIT '08 Proceedings of the 4th IEEE tutorial and research workshop on Perception and Interactive Technologies for Speech-Based Systems: Perception in Multimodal Dialogue Systems
Year:
2008

Abstract

In this paper we study the impact of three main factors on measuring the reliability of the annotation of non-acted emotions: annotator biases, the similarity between the classified emotions, and the use of contextual information during annotation. We employed a corpus collected from real interactions between users and a spoken dialogue system. The user utterances were classified by nine non-expert annotators into four categories. We discuss the problems that the nature of non-acted emotional corpora imposes on evaluating the reliability of the annotations using Kappa coefficients. Although the coefficients are deeply affected by the so-called paradoxes of Kappa, our study shows that taking into account context information and the similarity between emotions helps to obtain values closer to the maximum attainable agreement rates, and allows the detection of emotions that users express more subtly.
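
To make the "paradoxes of Kappa" mentioned in the abstract concrete, here is a minimal Python sketch of a multi-annotator Kappa computation. It rests on several assumptions: the abstract does not say which Kappa variant the authors used, so Fleiss' kappa is chosen here only because it handles more than two annotators. The function name `fleiss_kappa`, the category labels `c1` to `c4`, and the toy ratings are all hypothetical and are not taken from the paper's corpus. The toy data mimics what is often reported for non-acted corpora, where one category dominates.

```python
from collections import Counter

# Hypothetical category labels; the abstract only says there are four.
CATEGORIES = ["c1", "c2", "c3", "c4"]


def fleiss_kappa(ratings, categories):
    """Return (observed agreement, chance agreement, kappa).

    ratings: list of items, each a list of labels, one per annotator.
    Every item must be labelled by the same number of annotators.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    # counts[i][c] = number of annotators who chose category c for item i
    counts = [Counter(item) for item in ratings]

    # Per-item agreement: proportion of agreeing annotator pairs
    per_item = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_obs = sum(per_item) / n_items

    # Chance agreement from the overall share of each category
    shares = [
        sum(c[cat] for c in counts) / (n_items * n_raters)
        for cat in categories
    ]
    p_exp = sum(s ** 2 for s in shares)

    return p_obs, p_exp, (p_obs - p_exp) / (1 - p_exp)


# Toy data (invented): 9 annotators, 10 utterances, one dominant category.
# Nine utterances are labelled unanimously; one is split 5 versus 4.
skewed = [["c1"] * 9 for _ in range(9)] + [["c1"] * 5 + ["c2"] * 4]
p_obs, p_exp, kappa = fleiss_kappa(skewed, CATEGORIES)

# Contrast case (invented): unanimous labels spread evenly over categories.
balanced = [[cat] * 9 for cat in CATEGORIES]
b_obs, b_exp, b_kappa = fleiss_kappa(balanced, CATEGORIES)
```

On the skewed toy data the annotators agree on roughly 94% of pairs, yet kappa comes out near 0.35 because the chance-agreement term is itself about 0.92. On the balanced toy data, the same unanimous behaviour yields a kappa of 1. This gap between raw agreement and Kappa under skewed category prevalence is one common reading of the paradox; the paper itself may frame it differently.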