In this paper, auditory-inspired modulation spectral features are used to improve automatic speaker identification (ASI) performance in the presence of room reverberation. The modulation spectral signal representation is obtained by first filtering the speech signal with a 23-channel gammatone filterbank. An eight-channel modulation filterbank is then applied to the temporal envelope of each gammatone filter output. Features are extracted from modulation frequency bands ranging from 3 to 15 Hz and are shown to be robust to mismatch between training and testing conditions and to increasing reverberation levels. To demonstrate the gains obtained with the proposed features, experiments are performed with clean speech, artificially generated reverberant speech, and reverberant speech recorded in a meeting room. Simulation results show that a Gaussian mixture model-based ASI system trained on the proposed features consistently outperforms a baseline system trained on mel-frequency cepstral coefficients. For multimicrophone ASI applications, three multichannel score combination and adaptive channel selection techniques are investigated and shown to further improve ASI performance.
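The two-stage analysis described above (gammatone filtering, then modulation analysis of each temporal envelope) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the function names, the fourth-order gammatone impulse response with Glasberg-Moore ERB bandwidths, the log-spaced centre frequencies between 125 and 3500 Hz, and the placement of all eight modulation bands inside 3 to 15 Hz (the abstract gives no band edges). The modulation filterbank is also replaced here by summing envelope-spectrum power in each band over the whole signal, a simpler stand-in for true bandpass filtering.

```python
import numpy as np


def erb(fc):
    """Equivalent rectangular bandwidth in Hz (Glasberg-Moore formula)."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)


def gammatone_ir(fc, fs, duration=0.05, order=4):
    """Unit-energy gammatone impulse response centred at fc (assumed design)."""
    t = np.arange(int(duration * fs)) / fs
    b = 1.019 * erb(fc)
    g = t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)
    return g / np.sqrt(np.sum(g ** 2))


def envelope(x):
    """Temporal envelope: magnitude of the FFT-based analytic signal."""
    n = len(x)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(np.fft.fft(x) * h))


def modulation_energies(x, fs, n_acoustic=23, n_mod=8,
                        f_lo=125.0, f_hi=3500.0, mod_lo=3.0, mod_hi=15.0):
    """Return an (n_acoustic, n_mod) array of envelope modulation energies.

    All frequency ranges and spacings are illustrative assumptions.
    """
    centers = np.geomspace(f_lo, f_hi, n_acoustic)   # acoustic centre freqs
    edges = np.geomspace(mod_lo, mod_hi, n_mod + 1)  # modulation band edges
    out = np.zeros((n_acoustic, n_mod))
    for i, fc in enumerate(centers):
        band = np.convolve(x, gammatone_ir(fc, fs), mode="same")
        env = envelope(band)
        env = env - env.mean()                        # drop the DC term
        power = np.abs(np.fft.rfft(env)) ** 2
        freqs = np.fft.rfftfreq(len(env), 1.0 / fs)
        for j in range(n_mod):
            mask = (freqs >= edges[j]) & (freqs < edges[j + 1])
            out[i, j] = power[mask].sum()
    return out
```

Calling `modulation_energies(x, fs)` on a one-dimensional signal `x` sampled at `fs` Hz (with `fs` above 7 kHz for the assumed centre frequencies) yields a 23 by 8 matrix. A practical system would compute such energies per short frame rather than once per recording before passing them to a classifier.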