Speaker verification systems must reduce the influence of varying environmental conditions. This paper examines two stages of normalization, feature normalization and score normalization, for decreasing the mismatch between training and testing acoustic conditions. In the first stage, cepstral mean and variance normalization (CMVN) is modified to normalize the cepstral coefficients using similar segmental parameter statistics. In the second stage, to address score variability between verification trials, test-dependent zero-score normalization (TZnorm) and zero-dependent test-score normalization (ZTnorm) are compared as ways to transform the output scores and make the speaker-independent decision threshold more robust under adverse conditions. Experiments on the NIST 2002 SRE corpus show that combining CMVN at the feature stage with ZTnorm at the score stage achieves a 20.3% relative reduction in equal error rate (EER) and an 18.1% relative reduction in the minimum detection cost function (DCF), compared with a baseline system using cepstral mean normalization (CMN) and zero normalization.
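The two normalization stages can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify how CMVN is modified or how TZnorm and ZTnorm are defined, so the code below assumes a common sliding-window (segmental) CMVN and the textbook forms of zero normalization and test normalization. The function names, the window length, and the cohort arguments are all hypothetical.

```python
import numpy as np


def segmental_cmvn(features, window=301, eps=1e-8):
    """Sliding-window CMVN (an assumed variant, not the paper's exact method).

    features: array of shape (num_frames, num_coeffs) holding cepstral vectors.
    Each frame is normalized by the mean and standard deviation computed over
    a window of up to `window` frames centred on it.
    """
    features = np.asarray(features, dtype=float)
    num_frames = features.shape[0]
    half = window // 2
    out = np.empty_like(features)
    for t in range(num_frames):
        lo = max(0, t - half)
        hi = min(num_frames, t + half + 1)
        segment = features[lo:hi]
        out[t] = (features[t] - segment.mean(axis=0)) / (segment.std(axis=0) + eps)
    return out


def z_norm(raw_score, impostor_scores, eps=1e-8):
    """Zero normalization: standardize a trial score with the statistics of
    scores obtained by testing the claimed speaker's model against a set of
    impostor utterances (hypothetical cohort)."""
    impostor_scores = np.asarray(impostor_scores, dtype=float)
    return (raw_score - impostor_scores.mean()) / (impostor_scores.std() + eps)


def t_norm(raw_score, cohort_model_scores, eps=1e-8):
    """Test normalization: standardize a trial score with the statistics of
    scores obtained by testing the same utterance against a set of cohort
    speaker models (hypothetical cohort)."""
    cohort_model_scores = np.asarray(cohort_model_scores, dtype=float)
    return (raw_score - cohort_model_scores.mean()) / (cohort_model_scores.std() + eps)
```

Under these assumptions, a combined scheme would apply one normalization to the raw score and then the other, with the second cohort's scores passed through the first normalization as well. Which order corresponds to the paper's TZnorm and which to ZTnorm is not stated in the abstract, so the composition is left to the caller.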