Improving a GMM speaker verification system by phonetic weighting

Authors:
R. Auckenthaler;E. S. Parris;M. J. Carey
Affiliations:
Ensigma Ltd., Chepstow, UK;-;-
Venue:
ICASSP '99: Proceedings of the 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing - Volume 01
Year:
1999

Cited by (9):

Visual Speech: A Physiological or Behavioural Biometric?
AVBPA '01 Proceedings of the Third International Conference on Audio- and Video-Based Biometric Person Authentication

Audio-visual multimodal fusion for biometric person authentication and liveness verification
MMUI '05 Proceedings of the 2005 NICTA-HCSNet Multimodal User Interaction Workshop - Volume 57

Multifactor fusion for audio-visual speaker recognition
SSIP'07 Proceedings of the 7th WSEAS International Conference on Signal, Speech and Image Processing

On the use of complementary spectral features for speaker recognition
EURASIP Journal on Advances in Signal Processing

Text-independent speaker verification: state of the art and challenges
Progress in nonlinear speech processing

A segment selection technique for speaker verification
Speech Communication

Improving speaker verification using ALISP-Based specific GMMs
AVBPA'05 Proceedings of the 5th international conference on Audio- and Video-Based Biometric Person Authentication

Segmental scores fusion for ALISP-Based GMM text-independent speaker verification
Nonlinear Speech Modeling and Applications

VoCMex: a voice corpus in Mexican Spanish for research in speaker recognition
International Journal of Speech Technology

Abstract

This paper compares two approaches to speaker verification: Gaussian mixture models (GMMs) and hidden Markov models (HMMs). The GMM-based system outperformed the HMM system, mainly because the GMM made better use of the training data. The best-scoring GMM frames were strongly correlated with particular phonemes, e.g. vowels and nasals. Two techniques were used to try to exploit the different amounts of discrimination provided by the phonemes and so improve the performance of the GMM-based system. Applying linear weighting to the phonemes showed that fewer than half of the phonemes contributed to the overall system performance. Using a multilayer perceptron (MLP) to weight the phonemes provided a significant improvement in performance for male speakers, but no improvement has yet been achieved for female speakers.
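The idea of linearly weighting phonemes can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the scoring formula, so the sketch assumes that each frame already has a log-likelihood ratio (speaker GMM against a background model) and a phoneme label, and that the utterance score is a weighted sum of per-phoneme mean scores. The function name `phoneme_weighted_score`, the phoneme labels, and all numeric values are hypothetical.

```python
import numpy as np


def phoneme_weighted_score(frame_llr, frame_phonemes, weights):
    """Combine per-frame scores into one utterance score, weighted by phoneme.

    frame_llr      : per-frame log-likelihood ratios (assumed precomputed)
    frame_phonemes : phoneme label for each frame (assumed from an alignment)
    weights        : dict mapping phoneme label -> linear weight (hypothetical)
    """
    frame_llr = np.asarray(frame_llr, dtype=float)
    frame_phonemes = np.asarray(frame_phonemes)
    if frame_llr.shape != frame_phonemes.shape:
        raise ValueError("need exactly one phoneme label per frame")

    total = 0.0
    for phoneme in np.unique(frame_phonemes):
        mask = frame_phonemes == phoneme
        # Mean score of the frames assigned to this phoneme class.
        class_score = frame_llr[mask].mean()
        # Phonemes missing from `weights` get weight 0 and are ignored.
        total += weights.get(str(phoneme), 0.0) * class_score
    return total


# Illustrative values only: two vowel frames, two fricative frames, one nasal.
frame_llr = [0.9, 1.1, -0.2, 0.1, 0.6]
frame_phonemes = ["aa", "aa", "s", "s", "n"]
weights = {"aa": 0.5, "n": 0.5, "s": 0.0}

score = phoneme_weighted_score(frame_llr, frame_phonemes, weights)
# 0.5 * mean(0.9, 1.1) + 0.5 * 0.6 + 0.0 * mean(-0.2, 0.1) = 0.8
```

Setting a phoneme's weight to zero removes it from the decision. Under the assumptions above, this is one way to picture the finding that fewer than half of the phonemes contributed to overall performance.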