Age and gender detection in the I-DASH project

Authors:
Hugo Meinedo;Isabel Trancoso
Affiliations:
L2F - Spoken Language Systems Lab, INESC-ID, Portugal;L2F - Spoken Language Systems Lab, INESC-ID and Instituto Superior Técnico, Portugal
Venue:
ACM Transactions on Speech and Language Processing (TSLP)
Year:
2011

Abstract

This article describes the INESC-ID Age and Gender classification systems, which were developed to aid the detection of child abuse material within the scope of the European project I-DASH. The Age and Gender classification systems are built by fusing four and six individual subsystems, respectively. The subsystems are trained with short- and long-term acoustic and prosodic features and use different classification strategies: Gaussian Mixture Models-Universal Background Model (GMM-UBM), Multi-Layer Perceptrons (MLP), and Support Vector Machines (SVM), trained over five different speech corpora. The best results obtained with the calibration and linear logistic regression fusion back-end show an absolute improvement in unweighted accuracy of 2% for Age and 1% for Gender over the best individual front-end systems on the development set. The final age/gender detection system, evaluated on a six-hour child abuse (CA) test set, achieved promising results given the extremely difficult conditions of this type of video material. To further improve performance in the CA domain, the classification modules were adapted using unsupervised selection of training data, for which an automatic data selection algorithm based on frame-level posterior probabilities was developed. After adaptation, the classification modules improved by around 10% relative to the baseline classifiers.
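Two ideas from the abstract can be illustrated with a minimal sketch: the unweighted accuracy metric, and confidence-based selection of unlabeled data from frame-level posteriors. The first function computes unweighted accuracy in its usual sense, the mean of the per-class recalls. The second is only an assumption about how such a selection could work: the abstract does not give the selection rule, so the mean-posterior criterion, the `threshold` value, the function names, and the input layout are all hypothetical and should not be read as the authors' algorithm.

```python
from collections import defaultdict


def unweighted_accuracy(y_true, y_pred):
    """Mean of the per-class recalls, so every class counts equally
    regardless of how many examples it has."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


def select_segments(frame_posteriors, threshold=0.9):
    """Hypothetical selection rule (an assumption, not the paper's method):
    keep a segment when the mean frame-level posterior of its most likely
    class reaches `threshold`, and label it with that class.

    frame_posteriors maps a segment id to a list of per-frame posterior
    vectors, one probability per class (assumed layout).
    """
    selected = []
    for seg_id, frames in frame_posteriors.items():
        if not frames:
            continue  # nothing to average for an empty segment
        n_classes = len(frames[0])
        means = [sum(f[k] for f in frames) / len(frames)
                 for k in range(n_classes)]
        best = max(range(n_classes), key=means.__getitem__)
        if means[best] >= threshold:
            selected.append((seg_id, best))
    return selected
```

With three "child" examples and one "adult" example, where one "child" example is misclassified, plain accuracy is 3/4 while unweighted accuracy is (2/3 + 1)/2, which is why the metric suits corpora with unbalanced classes. In the selection sketch, segments whose posteriors hover near chance are discarded, and the remaining automatically labeled segments could then serve as adaptation data.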