Voice-based gender identification in multimedia applications

Authors:
Hadi Harb;Liming Chen
Affiliations:
LIRIS CNRS FRE, Dept. Mathématiques Informatique, Ecole Centrale de Lyon, Ecully Cedex, France;LIRIS CNRS FRE, Dept. Mathématiques Informatique, Ecole Centrale de Lyon, Ecully Cedex, France
Venue:
Journal of Intelligent Information Systems - Special issue: Intelligent multimedia applications
Year:
2005

Abstract

In the context of content-based multimedia indexing, gender identification based on the speech signal is an important task. In this paper, a set of acoustic and pitch features, along with different classifiers, is compared for the problem of gender identification. We show that the fusion of features and classifiers performs better than any individual classifier. Based on these conclusions, we built a system for gender identification in multimedia applications. The system uses a set of neural networks with acoustic and pitch-related features. A classification accuracy of 90% is obtained for 1-second segments, independent of the language and the channel of the speech. Practical considerations, such as the continuity of speech and the use of a mixture of experts instead of a single expert, are shown to improve the classification accuracy to 93%. When used on a subset of the Switchboard database, the classification accuracy reaches 98.5% for 5-second segments.
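The two practical ideas mentioned in the abstract, combining several experts and exploiting the continuity of speech, can be illustrated with a minimal sketch. The abstract does not specify the fusion rule or the smoothing scheme, so everything here is an assumption for illustration: the experts' scores are averaged (a sum-rule style fusion), the per-segment scores are averaged over a trailing window of consecutive segments, and the function names, the window length, and the `("female", "male")` label order are all hypothetical.

```python
from typing import List, Sequence

# Hypothetical label order; the abstract does not define one.
LABELS = ("female", "male")


def fuse_experts(expert_scores: Sequence[Sequence[float]]) -> List[float]:
    """Average the per-class scores of several experts for one segment.

    Assumption: simple averaging stands in for whatever fusion the
    system actually uses.
    """
    n_experts = len(expert_scores)
    n_classes = len(expert_scores[0])
    return [
        sum(scores[c] for scores in expert_scores) / n_experts
        for c in range(n_classes)
    ]


def smooth_over_time(
    segment_scores: Sequence[Sequence[float]], window: int = 3
) -> List[List[float]]:
    """Average each segment's scores with those of the preceding segments.

    Assumption: a trailing window models the continuity of speech, i.e.
    the speaker's gender rarely changes from one segment to the next.
    """
    smoothed = []
    for i in range(len(segment_scores)):
        chunk = segment_scores[max(0, i - window + 1): i + 1]
        n_classes = len(chunk[0])
        smoothed.append(
            [sum(s[c] for s in chunk) / len(chunk) for c in range(n_classes)]
        )
    return smoothed


def decide(scores: Sequence[float]) -> str:
    """Return the label with the highest score."""
    best = max(range(len(scores)), key=lambda c: scores[c])
    return LABELS[best]
```

For instance, with made-up fused scores for four consecutive 1-second segments, an isolated low-confidence segment is overruled by its neighbours once the scores are smoothed:

```python
fused = [[0.9, 0.1], [0.8, 0.2], [0.4, 0.6], [0.9, 0.1]]
raw_labels = [decide(s) for s in fused]
smoothed_labels = [decide(s) for s in smooth_over_time(fused, window=3)]
```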