Combining acoustic features for improved emotion recognition in Mandarin speech

  • Authors:
  • Tsang-Long Pao, Yu-Te Chen, Jun-Heng Yeh, Wen-Yuan Liao

  • Affiliations:
  • Department of Computer Science and Engineering, Tatung University (all authors)

  • Venue:
  • ACII'05: Proceedings of the First International Conference on Affective Computing and Intelligent Interaction
  • Year:
  • 2005


Abstract

Combining different feature streams to obtain more accurate results is a well-known technique. The basic argument is that if the recognition errors of systems using the individual streams occur at different points, a combined system has at least a chance of correcting some of these errors by reference to the other streams. In an emotional speech recognition system, there are many ways in which this general principle can be applied. In this paper, we propose using feature selection and feature combination to improve speaker-dependent emotion recognition in Mandarin speech. Five basic emotions are investigated: anger, boredom, happiness, neutral and sadness. Combining multiple feature streams is clearly highly beneficial in our system. The best accuracy in recognizing the five emotions, 99.44%, is achieved by combining the MFCC, LPCC, RastaPLP and LFPC feature streams with the nearest class mean classifier.
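The paper itself gives no code, but the classification step it names can be sketched. The following is a minimal illustration, assuming (as one plausible scheme, not necessarily the authors' exact method) that each utterance is summarized as one fixed-length vector per feature stream, that streams are combined by simple concatenation, and that the nearest class mean classifier assigns an utterance to the emotion whose training-set mean vector is closest in Euclidean distance. The stream contents here are synthetic stand-ins, not real MFCC/LFPC features.

```python
import numpy as np

def nearest_class_mean_fit(X, y):
    """Compute one mean feature vector per class from training data."""
    classes = np.unique(y)
    means = np.stack([X[y == c].mean(axis=0) for c in classes])
    return classes, means

def nearest_class_mean_predict(X, classes, means):
    """Assign each sample to the class whose mean is closest (Euclidean)."""
    # Pairwise distances, shape (n_samples, n_classes)
    d = np.linalg.norm(X[:, None, :] - means[None, :, :], axis=2)
    return classes[np.argmin(d, axis=1)]

def combine_streams(*streams):
    """Combine per-utterance feature streams by concatenation (one option)."""
    return np.concatenate(streams, axis=1)

# Synthetic example: six utterances, two feature streams, two emotions.
rng = np.random.default_rng(0)
stream_a = rng.normal(size=(6, 4))   # stand-in for e.g. MFCC statistics
stream_b = rng.normal(size=(6, 3))   # stand-in for e.g. LFPC statistics
labels = np.array([0, 0, 0, 1, 1, 1])
stream_a[labels == 1] += 3.0         # make the two classes separable

X = combine_streams(stream_a, stream_b)
classes, means = nearest_class_mean_fit(X, labels)
pred = nearest_class_mean_predict(X, classes, means)
```

The appeal of the nearest class mean classifier in this setting is that it needs only one stored vector per emotion and no iterative training, so adding or dropping feature streams just changes the vector length.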