This paper concerns effective speaker adaptation methods for solving the over-training problem in speaker verification, which frequently occurs when a speaker is modeled from sparse training data. While various speaker adaptation techniques have already been applied to speech recognition, they have not yet been formally considered for speaker verification. This paper proposes speaker adaptation methods that combine maximum a posteriori (MAP) and maximum likelihood linear regression (MLLR) adaptation, both of which are used successfully in speech recognition, and applies them to speaker verification. Our aim is to remedy the small-training-data problem by investigating effective speaker adaptations for speaker modeling. Experimental results show that a speaker verification system using weighted MAP and MLLR adaptation outperforms conventional speaker models without adaptation by a factor of up to 5. These results show that the proposed speaker adaptation achieves significantly better performance even when only a small amount of training data is available for speaker verification.
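To make the idea of a weighted MAP and MLLR combination concrete, the following is a minimal sketch and not the method evaluated in the paper. It assumes a diagonal-covariance Gaussian mixture background model whose means are adapted to a speaker, relevance-factor MAP adaptation of the means, a single global MLLR mean transform, and a simple linear interpolation between the two adapted mean sets. The abstract does not state how the weighting is defined, so the interpolation, the function names, and the parameters `relevance`, `ridge`, and `weight` are all hypothetical.

```python
import numpy as np

def posteriors(X, weights, means, variances):
    """Responsibilities gamma[t, k] of each diagonal-Gaussian component for each frame."""
    diff = X[:, None, :] - means[None, :, :]                      # (T, K, D)
    log_p = -0.5 * np.sum(diff ** 2 / variances
                          + np.log(2.0 * np.pi * variances), axis=2)
    log_p = log_p + np.log(weights)
    log_p -= log_p.max(axis=1, keepdims=True)                     # numerical stability
    gamma = np.exp(log_p)
    return gamma / gamma.sum(axis=1, keepdims=True)

def sufficient_stats(X, gamma):
    """Zeroth-order (n_k) and first-order (F_k) statistics of the adaptation data."""
    return gamma.sum(axis=0), gamma.T @ X

def map_means(means, n, F, relevance=16.0):
    """Relevance-factor MAP: interpolate each prior mean with the data mean."""
    data_mean = F / np.maximum(n, 1e-10)[:, None]
    alpha = (n / (n + relevance))[:, None]                        # more data -> trust data more
    return alpha * data_mean + (1.0 - alpha) * means

def mllr_means(means, variances, n, F, ridge=1e-8):
    """Global MLLR mean transform mu_k -> A mu_k + b, solved row by row
    (closed form for diagonal covariances). `ridge` is an assumed stabiliser."""
    K, D = means.shape
    xi = np.hstack([np.ones((K, 1)), means])                      # extended means [1, mu_k]
    W = np.zeros((D, D + 1))
    for i in range(D):
        inv_var = 1.0 / variances[:, i]
        G = (xi * (n * inv_var)[:, None]).T @ xi                  # (D+1, D+1)
        k = (F[:, i] * inv_var) @ xi                              # (D+1,)
        W[i] = np.linalg.solve(G + ridge * np.eye(D + 1), k)
    return xi @ W.T

def weighted_map_mllr(means, variances, n, F, weight=0.5, relevance=16.0):
    """Hypothetical combination: linear interpolation of MAP and MLLR means."""
    return (weight * map_means(means, n, F, relevance)
            + (1.0 - weight) * mllr_means(means, variances, n, F))
```

In this sketch, MAP moves only the components that actually observe adaptation data, while the global MLLR transform shifts every component at once, so it still has an effect when a speaker contributes very few frames. The interpolation weight trades off the two behaviours and would need to be tuned on held-out data.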