Speaker verification is usually formulated as a statistical hypothesis testing problem and solved by a log-likelihood ratio (LLR) test. A speaker verification system's performance depends heavily on how well the target speaker's voice (the null hypothesis) is modeled and how well non-target speakers' voices (the alternative hypothesis) are characterized. However, since the alternative hypothesis involves unknown impostors, it is usually difficult to characterize a priori. In this paper, we propose a framework to better characterize the alternative hypothesis with the goal of optimally distinguishing the target speaker from impostors. The proposed framework is built on a weighted arithmetic combination (WAC) or a weighted geometric combination (WGC) of useful information extracted from a set of pre-trained background models. The parameters associated with WAC or WGC are then optimized using two discriminative training methods, namely, the minimum verification error (MVE) training method and the proposed evolutionary MVE (EMVE) training method, such that both the false acceptance probability and the false rejection probability are minimized. Our experimental results show that the proposed framework outperforms conventional LLR-based approaches.
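To make the scoring scheme concrete, the following is a minimal sketch of an LLR test in which the alternative-hypothesis likelihood is formed by a WAC or WGC over a set of background-model likelihoods. The function names, the uniform example weights, and the zero decision threshold are illustrative assumptions, not part of the paper; in the proposed framework the weights would be learned by MVE or EMVE training rather than fixed by hand.

```python
import math

def wac(bg_likelihoods, weights):
    """Weighted arithmetic combination of background-model likelihoods."""
    return sum(w * p for w, p in zip(weights, bg_likelihoods))

def wgc(bg_likelihoods, weights):
    """Weighted geometric combination of background-model likelihoods."""
    return math.prod(p ** w for w, p in zip(weights, bg_likelihoods))

def llr_score(target_likelihood, bg_likelihoods, weights, combine=wgc):
    """Log-likelihood ratio: target model vs. combined alternative model."""
    return math.log(target_likelihood) - math.log(combine(bg_likelihoods, weights))

def verify(score, threshold=0.0):
    """Accept the claimed identity if the LLR exceeds the threshold."""
    return "accept" if score >= threshold else "reject"

# Hypothetical utterance likelihoods under the target and two background models,
# with uniform weights standing in for the discriminatively trained ones.
target_p = 0.6
bg_p = [0.2, 0.4]
w = [0.5, 0.5]
decision = verify(llr_score(target_p, bg_p, w, combine=wgc))
```

Note the design difference between the two combinations: WGC is an average in the log domain, so a single background model with near-zero likelihood can dominate the alternative score, whereas WAC is dominated by the best-matching background model.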