This paper presents an empirical study of the effects of handset variability on text-independent speaker recognition performance using the Switchboard corpus. Handset variability occurs when training speech is collected with one type of handset but test speech is collected with a different one. For the Switchboard corpus, the calling telephone number associated with a file is used to infer the handset used. Analysis of experiments designed to focus on handset variability, run on the SPIDRE database and the May95 NIST speaker recognition evaluation database, shows that a performance gap between matched and mismatched handset tests persists even after several standard channel compensation techniques are applied. Error rates for the mismatched tests are more than 4 times those for the matched tests. Lastly, a new energy-dependent cepstral mean subtraction technique is proposed to compensate for nonlinear distortions, but it is not found to improve performance on the databases used.
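The abstract does not spell out how the energy-dependent cepstral mean subtraction is computed, so the following is only a minimal sketch under stated assumptions, not the paper's method. It shows standard cepstral mean subtraction (subtracting the per-coefficient mean over all frames of an utterance) and one plausible energy-dependent reading in which frames are grouped by energy rank and the mean is subtracted within each group. The function names, the `n_bins` parameter, and the equal-size binning rule are all hypothetical.

```python
def cepstral_mean_subtraction(frames):
    """Standard CMS: subtract the per-coefficient mean taken over all frames."""
    if not frames:
        return []
    dims = len(frames[0])
    means = [sum(f[d] for f in frames) / len(frames) for d in range(dims)]
    return [[f[d] - means[d] for d in range(dims)] for f in frames]


def energy_dependent_cms(frames, energies, n_bins=2):
    """Hypothetical variant: apply CMS separately within energy-ranked bins.

    Assumption (not from the source): frames are sorted by energy and split
    into n_bins groups of roughly equal size; each group gets its own mean.
    """
    if len(frames) != len(energies):
        raise ValueError("frames and energies must have the same length")
    n = len(frames)
    # Frame indices ordered from lowest to highest energy.
    order = sorted(range(n), key=lambda i: energies[i])
    out = [None] * n
    for b in range(n_bins):
        idx = order[b * n // n_bins:(b + 1) * n // n_bins]
        if not idx:
            continue  # more bins than frames: skip empty groups
        normalized = cepstral_mean_subtraction([frames[i] for i in idx])
        for i, row in zip(idx, normalized):
            out[i] = row  # restore the original frame order
    return out


# Toy example: four 2-dimensional cepstral frames with per-frame energies.
frames = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0], [14.0, 24.0]]
energies = [0.1, 0.2, 5.0, 6.0]
global_cms = cepstral_mean_subtraction(frames)
binned_cms = energy_dependent_cms(frames, energies, n_bins=2)
```

A single global mean removes a fixed linear channel effect, while a per-bin mean lets the correction differ between low-energy and high-energy frames, which is one way a nonlinear distortion could be approximated.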