Audio features selection for automatic height estimation from speech

Authors:
Todor Ganchev;Iosif Mporas;Nikos Fakotakis
Affiliations:
Artificial Intelligence Group, Wire Communications Laboratory, Dept of Electrical and Computer Engineering, University of Patras, Rion-Patras, Greece (all authors)
Venue:
SETN'10 Proceedings of the 6th Hellenic conference on Artificial Intelligence: theories, models and applications
Year:
2010

Abstract

Aiming at the automatic estimation of a person's height from speech, we investigate the applicability of various subsets of speech features, formed by ranking the relevance and the individual quality of numerous audio features. Specifically, based on the relevance ranking of the large set of openSMILE audio descriptors, we selected subsets of different sizes and evaluated them on the height estimation task. In brief, during speech parameterization, every input utterance is converted to a single feature vector consisting of 6552 parameters. Next, a subset of this feature vector is fed to a support vector machine (SVM)-based regression model, which directly estimates the height of an unknown speaker. The experimental evaluation performed on the TIMIT database demonstrated that (i) the feature vector composed of the top-50 ranked parameters provides a good trade-off between computational demands and accuracy, and (ii) the best accuracy, in terms of mean absolute error and root mean square error, is observed for the top-200 subset.
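
The ranking-based subset selection and the two error measures named in the abstract can be illustrated with a minimal sketch. It rests on several assumptions: the absolute Pearson correlation is only a stand-in for the relevance ranking (the abstract does not specify the ranking criterion), the toy data and the function names are hypothetical, and the SVM regression step is omitted, so the error functions are shown on hand-written predictions.

```python
import math
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_features(rows, heights):
    """Return feature indices sorted from most to least relevant.

    Relevance is |Pearson correlation| with height: an assumed stand-in,
    not necessarily the criterion used in the paper.
    """
    n_features = len(rows[0])
    scores = [abs(pearson([r[j] for r in rows], heights)) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


def select_top_k(rows, ranking, k):
    """Keep only the k best-ranked features of every feature vector."""
    keep = ranking[:k]
    return [[r[j] for j in keep] for r in rows]


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Hypothetical toy data: 4 utterances, 3 features each, heights in cm.
rows = [[1.0, 5.0, 0.3], [2.0, 5.0, 0.1], [3.0, 5.0, 0.4], [4.0, 5.0, 0.5]]
heights = [160.0, 170.0, 180.0, 190.0]

ranking = rank_features(rows, heights)      # best feature first
subset = select_top_k(rows, ranking, k=2)   # analogous to a "top-k" subset
```

In a setup like the one described, `subset` would be passed to a regression model, and `mae` and `rmse` would be computed on its predictions for held-out speakers. Varying `k` (for example 50 versus 200 out of 6552) is what exposes the trade-off between computational cost and accuracy.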