PCM '02 Proceedings of the Third IEEE Pacific Rim Conference on Multimedia: Advances in Multimedia Information Processing
HTIMIT and LLHDB: Speech Corpora for the Study of Handset Transducer Effects
ICASSP '97 Proceedings of the 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '97), Volume 2
This paper presents a cluster-based feature transformation technique for telephone-based speaker verification when handset-type labels are not available during the training phase. The technique combines a cluster selector with cluster-dependent feature transformations to reduce the acoustic mismatch among different handsets. Specifically, a GMM-based cluster selector is trained to identify the cluster that best represents the handset used by a claimant. Handset-distorted features are then transformed by a cluster-specific feature transformation to remove the acoustic distortion before being presented to the clean speaker models. Experimental results show that cluster-dependent feature transformation, with more clusters than the actual number of handsets, can achieve a performance level very close to that achievable by handset-based transformation approaches.
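The two-stage pipeline described above (select a cluster, then apply that cluster's transformation) can be illustrated with a minimal sketch. It is not the paper's implementation: the diagonal-covariance GMMs, the per-dimension affine form `scale * x + bias`, the toy parameter values, and all function names (`select_cluster`, `compensate`, and so on) are assumptions made for illustration only.

```python
import math

def diag_gauss_logpdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at frame x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def gmm_loglik(frames, gmm):
    """Total log-likelihood of all frames under one GMM.

    gmm is a list of (weight, mean, var) components (assumed layout).
    """
    total = 0.0
    for x in frames:
        comps = [math.log(w) + diag_gauss_logpdf(x, m, v) for w, m, v in gmm]
        mx = max(comps)  # log-sum-exp for numerical stability
        total += mx + math.log(sum(math.exp(c - mx) for c in comps))
    return total

def select_cluster(frames, cluster_gmms):
    """Return the index of the cluster GMM that best explains the frames."""
    scores = [gmm_loglik(frames, g) for g in cluster_gmms]
    return max(range(len(scores)), key=scores.__getitem__)

def transform(frames, scale, bias):
    """Apply a per-dimension affine map (assumed transformation form)."""
    return [[a * xi + b for xi, a, b in zip(x, scale, bias)] for x in frames]

def compensate(frames, cluster_gmms, transforms):
    """Pick the best cluster, then apply its cluster-specific transformation."""
    k = select_cluster(frames, cluster_gmms)
    scale, bias = transforms[k]
    return k, transform(frames, scale, bias)

# Toy setup: two clusters, each modelled by a one-component GMM (hypothetical values).
cluster_gmms = [
    [(1.0, [0.0, 0.0], [1.0, 1.0])],   # cluster 0: close to "clean" features
    [(1.0, [3.0, 3.0], [1.0, 1.0])],   # cluster 1: features shifted by the handset
]
transforms = [
    ([1.0, 1.0], [0.0, 0.0]),          # cluster 0: identity
    ([1.0, 1.0], [-3.0, -3.0]),        # cluster 1: undo the shift
]
frames = [[2.9, 3.2], [3.1, 2.8]]      # distorted claimant features

k, clean = compensate(frames, cluster_gmms, transforms)
```

Here the frames lie near the mean of cluster 1, so that cluster is selected and its bias moves the features back toward the clean region before they would be scored against the speaker models. In practice the cluster GMMs and transformation parameters would be estimated from unlabeled training speech rather than set by hand.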