Classification of multivariate time series of varying length has applications in many domains of science and technology. There are two paradigms for modeling such data: modeling the sequence of feature vectors, and modeling the set of feature vectors in the time series. In tasks such as text-independent speaker recognition, audio clip classification, and speech emotion recognition, modeling temporal dynamics is not critical, and there may be no underlying constraint in the time series. Gaussian mixture models (GMMs) are commonly used for these tasks. In this paper, we propose a method based on descriptive statistical features for classifying multivariate time series of varying length. The proposed method significantly reduces the dimensionality of the representation and is less sensitive to missing samples. We apply it to speech emotion recognition and audio clip classification. Its performance is compared with that of GMM-based approaches that use the maximum likelihood and variational Bayes methods for parameter estimation, and with two approaches that combine GMMs and SVMs: a score vector based approach and a segment modeling based approach. The proposed method is shown to perform better than all of the other methods.
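As a rough illustration of the set-based view described above, the sketch below maps a varying-length series of feature vectors to a single fixed-length vector of per-dimension statistics. The abstract does not say which descriptive statistics are used, so the choice here (mean, standard deviation, minimum, maximum, median), the function name `descriptive_features`, and the 13-dimensional random frames are all assumptions made for illustration only, not the paper's method.

```python
import numpy as np

def descriptive_features(frames):
    """Map a (T, d) array of frame-level vectors to one fixed-length vector.

    The five statistics below are an assumed choice; the abstract does not
    list the statistics actually used.
    """
    frames = np.asarray(frames, dtype=float)
    if frames.ndim != 2 or frames.shape[0] == 0:
        raise ValueError("expected a non-empty (T, d) array")
    stats = [
        frames.mean(axis=0),        # per-dimension mean
        frames.std(axis=0),         # per-dimension standard deviation
        frames.min(axis=0),         # per-dimension minimum
        frames.max(axis=0),         # per-dimension maximum
        np.median(frames, axis=0),  # per-dimension median
    ]
    return np.concatenate(stats)    # length 5 * d, independent of T

# Hypothetical data: two clips of different length, 13 features per frame.
rng = np.random.default_rng(0)
short_clip = rng.normal(size=(40, 13))
long_clip = rng.normal(size=(250, 13))

f_short = descriptive_features(short_clip)
f_long = descriptive_features(long_clip)

# The frames are treated as a set, so their order does not matter.
f_reversed = descriptive_features(short_clip[::-1])
```

Both clips yield vectors of the same length, so they could be passed to any fixed-dimension classifier. Because the statistics ignore frame order and are defined for any number of frames, dropping some frames changes the values but not the shape of the representation.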