Two stage emotion recognition based on speaking rate

  • Authors:
  • Shashidhar G. Koolagudi;Rao Sreenivasa Krothapalli

  • Affiliations:
  • School of Information Technology, Indian Institute of Technology Kharagpur, Kharagpur, India 721302;School of Information Technology, Indian Institute of Technology Kharagpur, Kharagpur, India 721302

  • Venue:
  • International Journal of Speech Technology
  • Year:
  • 2011

Quantified Score

Hi-index 0.01

Abstract

This paper proposes a two-stage speech emotion recognition approach based on speaking rate. The emotions considered in this study are anger, disgust, fear, happiness, neutral, sadness, sarcasm and surprise. In the first stage, the eight emotions are categorized by speaking rate into three broad groups: active (fast), normal and passive (slow). In the second stage, the emotions within each broad group are classified individually using vocal tract characteristics. Gaussian mixture models (GMMs) are used to develop the emotion models. Emotion classification performance at the broad-group level, based on speaking rate, is found to be around 99% for speaker- and text-dependent cases. Overall emotion classification performance is observed to improve with the proposed two-stage approach. Along with spectral features, formant features are explored in the second stage to achieve robust emotion recognition in speaker-, gender- and text-independent cases.
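
The two-stage idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract gives no thresholds, no assignment of emotions to groups and no model parameters, so the rate thresholds (`SLOW_MAX`, `FAST_MIN`), the grouping in `MODELS` and all means and variances below are hypothetical placeholders. For brevity, each emotion is modeled by a single diagonal-covariance Gaussian (a one-component stand-in for a GMM), and the two-dimensional feature vectors stand in for the vocal tract features.

```python
import math

# Hypothetical speaking-rate thresholds (e.g. syllables per second).
# The abstract does not state any values; these are placeholders.
SLOW_MAX = 3.5
FAST_MIN = 5.5


def rate_group(speaking_rate):
    """Stage 1: map a speaking rate to one of three broad groups."""
    if speaking_rate >= FAST_MIN:
        return "active"
    if speaking_rate <= SLOW_MAX:
        return "passive"
    return "normal"


def log_gaussian(x, mean, var):
    """Log-density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def classify(speaking_rate, features, models):
    """Stage 2: pick the most likely emotion within the stage-1 group.

    models maps group -> {emotion: (mean, var)}.
    """
    group = rate_group(speaking_rate)
    candidates = models[group]
    best = max(
        candidates,
        key=lambda emotion: log_gaussian(features, *candidates[emotion]),
    )
    return group, best


# Placeholder models: the grouping of emotions and every parameter here
# are assumptions made for illustration, not values from the paper.
MODELS = {
    "active": {
        "anger": ([1.0, 0.0], [1.0, 1.0]),
        "fear": ([-1.0, 0.0], [1.0, 1.0]),
    },
    "normal": {
        "neutral": ([0.0, 0.0], [1.0, 1.0]),
    },
    "passive": {
        "sadness": ([0.0, 1.0], [1.0, 1.0]),
        "disgust": ([0.0, -1.0], [1.0, 1.0]),
    },
}

print(classify(6.0, [0.9, 0.1], MODELS))
```

Restricting the second stage to the emotions of one group means each utterance is compared against fewer competing models, which is one plausible reading of why the abstract reports improved overall performance.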