Characterization and recognition of emotions from speech using excitation source information

Authors:
Sreenivasa Rao Krothapalli;Shashidhar G. Koolagudi
Affiliations:
School of Information Technology, Indian Institute of Technology Kharagpur, Kharagpur, India 721302;School of Information Technology, Indian Institute of Technology Kharagpur, Kharagpur, India 721302
Venue:
International Journal of Speech Technology
Year:
2013

Abstract

This paper explores excitation source features of the speech production mechanism for characterizing and recognizing emotions from the speech signal. The excitation source signal is obtained from the speech signal using linear prediction (LP) analysis and is also known as the LP residual. The glottal volume velocity (GVV) signal, which is derived from the LP residual, is also used to represent the excitation source. The speech signal has a high signal-to-noise ratio around the instants of glottal closure (GC), which are also known as epochs. The following excitation source features are proposed for characterizing and recognizing emotions: the sequence of LP residual samples and their phase information, the parameters of epochs and their dynamics at the syllable and utterance levels, and the samples of the GVV signal and its parameters. Auto-associative neural networks (AANN) and support vector machines (SVM) are used to develop the emotion recognition models, which are evaluated on the Telugu and Berlin emotion speech corpora. Six emotions are considered in this study: anger, disgust, fear, happy, neutral and sadness. An average emotion recognition performance of about 42% to 63% is observed across the different excitation source features. Further, combining excitation source and spectral features is shown to improve the emotion recognition performance to as much as 84%.
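
As a rough illustration of how an LP residual can be obtained, the sketch below estimates predictor coefficients with the autocorrelation method and passes the signal through the resulting inverse filter. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `lp_residual`, the default order of 10, and the omission of framing, windowing and pre-emphasis are all illustrative choices that do not come from the paper.

```python
import numpy as np

def lp_residual(signal, order=10):
    """Return (predictor coefficients, LP residual) for a 1-D signal.

    Hypothetical helper: autocorrelation-method linear prediction applied
    to the whole signal at once. The order is an assumed default.
    """
    s = np.asarray(signal, dtype=float)
    n = len(s)
    # Autocorrelation values r[0..order]
    r = np.array([np.dot(s[:n - k], s[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:], with R[i, j] = r[|i - j|]
    lags = np.abs(np.subtract.outer(np.arange(order), np.arange(order)))
    a = np.linalg.solve(r[lags], r[1:])
    # Inverse filter A(z) = 1 - sum_k a[k] z^{-k}; its output is the residual
    inverse_filter = np.concatenate(([1.0], -a))
    residual = np.convolve(s, inverse_filter)[:n]
    return a, residual

# Synthetic check: a second-order autoregressive signal driven by white noise
rng = np.random.default_rng(0)
excitation = rng.standard_normal(4000)
speech_like = np.zeros_like(excitation)
for t in range(2, len(excitation)):
    speech_like[t] = (1.3 * speech_like[t - 1]
                      - 0.6 * speech_like[t - 2]
                      + excitation[t])

coeffs, residual = lp_residual(speech_like, order=2)
```

On the synthetic signal, the estimated coefficients should land close to the generating values (1.3 and -0.6), and the residual should carry much less energy than the input. In practice, LP analysis is usually applied to short overlapping frames of real speech, not to a whole recording at once.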