In this paper we evaluate the relevance of a perceptual spectral model for automatic motherese detection. We investigated several classification techniques often used in emotion recognition: Gaussian Mixture Models (GMM), Support Vector Machines, neural networks, and k-nearest neighbors. Classification experiments were carried out on short, manually pre-segmented speech and motherese segments (mean duration of approximately 3 s) extracted from family home movies. Accuracies of around 86% were obtained on speaker-independent speech data, and 87.5% in the final, speaker-dependent study. We found that a GMM trained on MFCC spectral features gives the best score, outperforming all the other single classifiers. We also found that fusing classifiers that use spectral features with classifiers that use prosodic information usually improves discrimination between motherese and normal-directed speech (around 86% accuracy).
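The fusion of a spectral classifier with a prosodic classifier can be illustrated with a minimal sketch. The abstract does not state which fusion rule was used, so the weighted sum below is an assumption, and the function names, the weight, the threshold, and the posterior values are all hypothetical and chosen only for illustration.

```python
def fuse_scores(spectral_post, prosodic_post, weight=0.5):
    """Weighted-sum late fusion of two motherese posteriors in [0, 1].

    `weight` is the share given to the spectral classifier (assumed value,
    not taken from the paper).
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * spectral_post + (1.0 - weight) * prosodic_post


def classify(spectral_post, prosodic_post, weight=0.5, threshold=0.5):
    """Label a segment from the fused posterior (hypothetical threshold)."""
    fused = fuse_scores(spectral_post, prosodic_post, weight)
    return "motherese" if fused >= threshold else "normal"


# Hypothetical (spectral, prosodic) posteriors for three segments.
segments = [(0.9, 0.6), (0.3, 0.4), (0.55, 0.2)]
labels = [classify(s, p, weight=0.6) for s, p in segments]
```

In practice the weight would be tuned on held-out data; a value above 0.5 simply trusts the spectral classifier more than the prosodic one.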