Contrasting emotion-bearing laughter types in multiparticipant vocal activity detection for meetings

  • Authors:
  • Kornel Laskowski

  • Affiliations:
  • Language Technologies Institute, Carnegie Mellon University, Pittsburgh PA, USA

  • Venue:
  • ICASSP '09: Proceedings of the 2009 IEEE International Conference on Acoustics, Speech and Signal Processing
  • Year:
  • 2009


Abstract

The detection of laughter in conversational interaction presents an important challenge in meeting understanding, primarily because laughter is predictive of the emotional state of participants. We present evidence suggesting that ignoring unvoiced laughter improves the prediction of emotional involvement in collocated speech, making a case for distinguishing between voiced and unvoiced laughter during laughter detection. Our experiments show that both excluding unvoiced laughter from laughter model training and modeling it explicitly lead to detection scores for voiced laughter which are much higher than those otherwise obtained for all laughter. Furthermore, duration modeling is shown to be a more effective means of improving precision than interaction modeling through joint-participant decoding. Taken together, the final detection F-scores we present for voiced laughter on our development set represent a 20% relative reduction of error with respect to F-scores for all laughter reported in previous work, and 6% and 22% relative error reductions on two larger datasets unseen during development.
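The relative error reductions quoted above can be interpreted, under one common convention, by treating 1 − F as the residual error of a detector. The sketch below illustrates that arithmetic with placeholder F-score values; it is not drawn from the paper's actual results.

```python
# Minimal sketch of relative error reduction between two F-scores, under the
# common convention that treats (1 - F) as the residual error.
# The F values below are hypothetical placeholders, not the paper's numbers.

def f_score(precision: float, recall: float) -> float:
    """F1: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

def relative_error_reduction(f_baseline: float, f_new: float) -> float:
    """Fraction of the baseline's residual error (1 - F) removed by the new system."""
    return (f_new - f_baseline) / (1.0 - f_baseline)

if __name__ == "__main__":
    f_all = 0.60      # hypothetical F-score when detecting all laughter
    f_voiced = 0.68   # hypothetical F-score for voiced laughter only
    print(f"relative error reduction: {relative_error_reduction(f_all, f_voiced):.0%}")
    # -> 20%: moving from F = 0.60 to F = 0.68 removes 20% of the 0.40 error
```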