Toward A Speaker-Independent Real-Time Affect Detection System

  • Authors:
  • Rongqing Huang; Changxue Ma

  • Affiliations:
  • Motorola Labs, Schaumburg, IL, USA (both authors)

  • Venue:
  • ICPR '06 Proceedings of the 18th International Conference on Pattern Recognition - Volume 01
  • Year:
  • 2006

Abstract

The ability to detect human affective states is rapidly gaining interest among researchers and industrial developers because it has a broad range of applications. This paper reports advances in human affect detection from acoustic signals at Motorola Labs. We focus on two parts of affect detection: emotion detection and conversational engagement detection. The emotion detection part is the major component of our system. The system is based only on acoustic information; that is, no speech recognizer and no linguistic or semantic information are available. Given that speech is a short-time stationary signal, we employ a Hidden Markov Model (HMM) to capture the variation and trend of acoustic signal structures caused by affective states. Affect-sensitive segmental features such as pitch, energy, zero crossing rate, and energy slope are extracted to capture the finer structures of the acoustic signals. Each state of the HMM is modeled by a Gaussian Mixture Model (GMM), which captures the range, mean, median, and variability of the above affect-sensitive measures. Besides testing the algorithm on the LDC databases, we implement a real-time conversation monitor that can recognize and express the eight basic human emotions and can detect the conversational engagement level.
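To make the front end of such a pipeline concrete, the sketch below computes per-frame segmental features of the kind the abstract names (short-time energy, zero crossing rate, and energy slope; pitch extraction is omitted for brevity). This is a hypothetical NumPy illustration, not the authors' implementation: the frame length, hop size, and sample rate are assumptions, and the resulting feature matrix would be the input to HMM/GMM training, which is not shown.

```python
import numpy as np

def segmental_features(signal, frame_len=200, hop=80):
    """Per-frame affect-sensitive features: energy, zero crossing rate,
    and energy slope. Illustrative only; parameters are assumed, and
    pitch (named in the paper) is omitted."""
    rows = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        # short-time energy: mean squared amplitude of the frame
        energy = float(np.mean(frame ** 2))
        # zero crossing rate: fraction of adjacent samples whose sign flips
        zcr = float(np.mean(np.abs(np.diff(np.sign(frame))) > 0))
        rows.append((energy, zcr))
    feats = np.asarray(rows)
    # energy slope: first difference of log-energy across frames,
    # padded so every frame gets a value
    log_e = np.log(feats[:, 0] + 1e-12)
    slope = np.diff(log_e, prepend=log_e[0])
    return np.column_stack([feats, slope])

# Example: 1 s of a 440 Hz tone at an assumed 8 kHz sample rate
sr = 8000
t = np.arange(sr) / sr
feats = segmental_features(np.sin(2 * np.pi * 440 * t))
print(feats.shape)  # (n_frames, 3): energy, ZCR, energy slope
```

In a full system, sequences of such feature vectors would be scored against one trained HMM per emotion class, with the most likely model giving the detected affective state.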