Contrasting emotion-bearing laughter types in multiparticipant vocal activity detection for meetings

  • Authors:
  • Kornel Laskowski

  • Affiliations:
  • Language Technologies Institute, Carnegie Mellon University, Pittsburgh PA, USA

  • Venue:
  • ICASSP '09: Proceedings of the 2009 IEEE International Conference on Acoustics, Speech and Signal Processing
  • Year:
  • 2009


Abstract

The detection of laughter in conversational interaction presents an important challenge in meeting understanding, primarily because laughter is predictive of the emotional state of participants. We present evidence suggesting that ignoring unvoiced laughter improves the prediction of emotional involvement in collocated speech, making a case for distinguishing between voiced and unvoiced laughter during laughter detection. Our experiments show that both excluding unvoiced laughter from laughter model training and modeling it explicitly lead to detection scores for voiced laughter which are much higher than those otherwise obtained for all laughter. Furthermore, duration modeling is shown to be a more effective means of improving precision than interaction modeling through joint-participant decoding. Taken together, the final detection F-scores we present for voiced laughter on our development set represent a 20% relative reduction of error with respect to F-scores for all laughter reported in previous work, and 6% and 22% relative error reductions on two larger datasets unseen during development.
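The relative error reductions quoted above can be interpreted, under one common convention, by treating 1 − F as the residual error of a detector. The sketch below illustrates that arithmetic with placeholder F-score values; it is not drawn from the paper's actual results.

```python
# Minimal sketch of relative error reduction between two F-scores, under the
# common convention that treats (1 - F) as the residual error.
# The F values below are hypothetical placeholders, not the paper's numbers.

def f_score(precision: float, recall: float) -> float:
    """F1: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

def relative_error_reduction(f_baseline: float, f_new: float) -> float:
    """Fraction of the baseline's residual error (1 - F) removed by the new system."""
    return (f_new - f_baseline) / (1.0 - f_baseline)

if __name__ == "__main__":
    f_all = 0.60      # hypothetical F-score when detecting all laughter
    f_voiced = 0.68   # hypothetical F-score for voiced laughter only
    print(f"relative error reduction: {relative_error_reduction(f_all, f_voiced):.0%}")
    # -> 20%: moving from F = 0.60 to F = 0.68 removes 20% of the 0.40 error
```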