The Gaussian mixture model (GMM) is commonly used for text-independent speaker recognition. Its parameters are generally estimated under the maximum likelihood (ML) criterion. However, this criterion uses only the labeled utterances of each speaker to train that speaker's model, and it is likely to converge to a locally optimal solution. To address this problem, this paper proposes a discriminative training approach based on the maximum model distance (MMD) criterion. We investigate the characteristics of speaker recognition and further propose a novel strategy for selecting the competing speakers used in training. Experimental results on the KING and TIMIT databases demonstrate that our training approach is effective in improving the performance of both speaker identification and speaker verification. With three training sentences per speaker, the verification equal error rate (EER) for 168 speakers in TIMIT was reduced by 30.4% compared with the conventional method.
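For readers unfamiliar with the verification metric quoted above, the following is a minimal sketch of how an equal error rate can be approximated from trial scores. It is not the paper's evaluation code: the function name `equal_error_rate`, the threshold sweep over observed scores, and the toy scores are all assumptions made for illustration.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER by sweeping a threshold over all observed scores.

    genuine  -- scores of trials where the claimed speaker is the true speaker
    impostor -- scores of trials where the claimed speaker is someone else
    A trial is accepted when its score is >= the threshold (an assumption).
    """
    best_gap, eer = float("inf"), None
    for t in sorted(set(genuine) | set(impostor)):
        # False rejection rate: genuine trials that fall below the threshold.
        frr = sum(s < t for s in genuine) / len(genuine)
        # False acceptance rate: impostor trials at or above the threshold.
        far = sum(s >= t for s in impostor) / len(impostor)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the operating point where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores (hypothetical), e.g. log-likelihood ratios from speaker models.
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.6, 0.3, 0.2, 0.1]
print(equal_error_rate(genuine_scores, impostor_scores))
```

With a finite set of trials the two error curves rarely cross exactly, so this sketch reports the average of the two rates at the threshold where they are closest; interpolating between neighboring thresholds is a common alternative.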