Spoken emotion recognition through optimum-path forest classification using glottal features
Computer Speech and Language
Speech variability in real-world situations makes spoken emotion recognition a challenging task. While a variety of temporal and spectral speech features have been proposed, this paper investigates the effectiveness of using the glottal airflow signal in recognizing emotions. The speech used in this investigation is from a classical recording of the theatrical play "Waiting for Godot" by Samuel Beckett. Six emotions were investigated: happy, angry, sad, fear, surprise, and neutral. The proposed method was tested on the original recording and under simulated distortion conditions. In clean signal conditions, the proposed method achieved average recognition rates of 76% for four emotions and 66.5% for all six emotions. Furthermore, it proved fairly robust under signal distortion and noisy conditions, achieving recognition rates of 60% for four and 51.6% for six emotions on severely low-pass filtered speech, while with additive white Gaussian noise at SNR = 10 dB recognition rates were 53% and 47% for the four- and six-emotion tasks, respectively. Results indicate that glottal signal features provide good separation of spoken emotions and achieve enhanced classification performance compared to other approaches.
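One of the simulated distortion conditions mentioned above is additive white Gaussian noise at a fixed SNR (10 dB). A minimal sketch of how such a condition can be generated is shown below; the helper name `add_awgn` and the synthetic test tone are illustrative assumptions, not taken from the paper itself.

```python
import numpy as np

def add_awgn(signal, snr_db, rng=None):
    """Corrupt a 1-D signal with white Gaussian noise at a target SNR in dB."""
    rng = rng or np.random.default_rng()
    sig_power = np.mean(signal ** 2)
    # SNR (dB) = 10 * log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR/10)
    noise_power = sig_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

# Illustrative example: a 1-second synthetic tone standing in for speech,
# corrupted at the paper's 10 dB SNR condition.
fs = 16000
t = np.arange(fs) / fs
clean = np.sin(2 * np.pi * 220.0 * t)
noisy = add_awgn(clean, snr_db=10.0)

# Empirical SNR of the result; should land close to the 10 dB target.
est_snr = 10.0 * np.log10(np.mean(clean ** 2) / np.mean((noisy - clean) ** 2))
```

With a power-matched noise realization of this length, `est_snr` sits within a small fraction of a dB of the 10 dB target.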