A comparison of neural networks for real-time emotion recognition from speech signals

  • Authors:
  • Mehmet S. Unluturk, Kaya Oguz, Coskun Atay

  • Affiliations:
  • Department of Software Engineering, Izmir University of Economics, Balcova, Izmir, Turkey (all authors)

  • Venue:
  • WSEAS Transactions on Signal Processing
  • Year:
  • 2009

Abstract

Speech and emotion recognition improve the quality of human-computer interaction and enable easier-to-use software interfaces for users at every level of expertise. In this study, we developed two neural networks, the emotion recognition neural network (ERNN) and the Gram-Charlier emotion recognition neural network (GERNN), to classify voice signals for emotion recognition. The ERNN has 128 input nodes, 20 hidden neurons, and three summing output nodes. A set of 97920 samples was used to train the ERNN, and a separate set of 24480 samples was used to test its performance. The voice samples were acquired from the movies "Anger Management" and "Pick of Destiny". The ERNN achieves an average recognition performance of 100%. This high level of recognition suggests that the ERNN is a promising method for emotion recognition in computer applications. The GERNN, by contrast, has four input nodes, 20 hidden neurons, and three output nodes, and achieves an average recognition performance of only 33%. This indicates that Gram-Charlier coefficients cannot be used to discriminate between emotion signals. In addition, Hinton diagrams were used to display the optimality of the ERNN weights.
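The abstract specifies only the ERNN's topology (128 inputs, 20 hidden neurons, 3 outputs, one per emotion class). The sketch below illustrates what a forward pass through such a network could look like; the sigmoid activation, random weight initialization, and the `forward` helper are assumptions for illustration, not details taken from the paper.

```python
import math
import random

# Hypothetical sketch of the 128-20-3 feed-forward topology described in the
# abstract. Activation choice and weights are illustrative assumptions; the
# trained weights and training procedure are not given in the abstract.

random.seed(0)
N_IN, N_HID, N_OUT = 128, 20, 3  # input features, hidden neurons, emotion classes

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Random weights stand in for the trained values.
w_hidden = [[random.uniform(-0.1, 0.1) for _ in range(N_IN)] for _ in range(N_HID)]
w_output = [[random.uniform(-0.1, 0.1) for _ in range(N_HID)] for _ in range(N_OUT)]

def forward(features):
    """One forward pass: 128 speech features -> 3 emotion scores."""
    hidden = [sigmoid(sum(w * x for w, x in zip(row, features)))
              for row in w_hidden]
    # Summing output nodes: a weighted sum of the hidden activations.
    return [sum(w * h for w, h in zip(row, hidden)) for row in w_output]

# A dummy 128-point feature vector standing in for one speech frame.
frame = [random.random() for _ in range(N_IN)]
scores = forward(frame)
predicted_class = max(range(N_OUT), key=lambda i: scores[i])
```

The GERNN described in the abstract would differ only in its input layer (4 Gram-Charlier coefficients instead of 128 features), i.e. `N_IN = 4` in this sketch.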