In content-based multimedia indexing, gender identification from the speech signal is an important task. This paper compares a set of acoustic and pitch features, along with different classifiers, on the problem of gender identification. We show that fusing features and classifiers performs better than any individual classifier. Based on these conclusions, we built a gender identification system for multimedia applications. The system uses a set of neural networks with acoustic and pitch-related features. It obtains 90% classification accuracy on 1-second segments, independent of the language and the channel of the speech. Practical considerations, such as exploiting the continuity of speech and using a mixture of experts instead of a single expert, are shown to improve the classification accuracy to 93%. On a subset of the Switchboard database, the classification accuracy reaches 98.5% for 5-second segments.
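The two practical ideas mentioned above, combining several experts and exploiting the continuity of speech, can be illustrated with a minimal sketch. The abstract does not state which fusion rule or smoothing scheme the system uses, so everything here is an assumption: the experts are averaged (a simple sum-rule fusion), continuity is modeled as a moving average over neighboring 1-second segments, and the names `fuse_experts`, `smooth`, and `decide` as well as the numeric scores are hypothetical.

```python
from typing import List, Sequence


def fuse_experts(expert_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average P(male) over experts for each segment (assumed sum-rule fusion).

    expert_probs[e][t] is expert e's estimate of P(male) for segment t.
    """
    n_experts = len(expert_probs)
    n_segments = len(expert_probs[0])
    return [
        sum(expert_probs[e][t] for e in range(n_experts)) / n_experts
        for t in range(n_segments)
    ]


def smooth(probs: Sequence[float], window: int = 3) -> List[float]:
    """Moving average over neighboring segments (assumed continuity model).

    The window is truncated at the start and end of the sequence.
    """
    half = window // 2
    smoothed = []
    for t in range(len(probs)):
        lo = max(0, t - half)
        hi = min(len(probs), t + half + 1)
        smoothed.append(sum(probs[lo:hi]) / (hi - lo))
    return smoothed


def decide(probs: Sequence[float], threshold: float = 0.5) -> List[str]:
    """Turn per-segment P(male) values into hard labels."""
    return ["male" if p >= threshold else "female" for p in probs]


# Hypothetical scores: three experts, five consecutive 1-second segments
# from a single male speaker. All experts are unsure about segment 2.
expert_probs = [
    [0.9, 0.8, 0.4, 0.7, 0.9],
    [0.8, 0.7, 0.3, 0.8, 0.6],
    [0.7, 0.9, 0.5, 0.6, 0.9],
]

fused = fuse_experts(expert_probs)
labels_per_segment = decide(fused)          # segment 2 is labeled "female"
labels_smoothed = decide(smooth(fused))     # neighbors outvote the isolated flip
```

In this toy input, fusion alone still mislabels the one ambiguous segment, while smoothing across neighbors recovers a consistent label. That is the intuition behind using speech continuity; the accuracy figures reported in the abstract come from the authors' own system, not from this sketch.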