Durations of Context-Dependent Phonemes: A New Feature in Speaker Verification

Authors:
Charl Johannes Heerden;Etienne Barnard
Affiliations:
University of Pretoria, Pretoria, Gauteng, South Africa and Human Language Technology Group, Meraka Institute, CSIR, Meiring Naude Rd, Brummeria, Pretoria, Gauteng, South Africa;University of Pretoria, Pretoria, Gauteng, South Africa and Human Language Technology Group, Meraka Institute, CSIR, Meiring Naude Rd, Brummeria, Pretoria, Gauteng, South Africa
Venue:
Speaker Classification II
Year:
2007

Related publications:
Speaker identification and verification using Gaussian mixture speaker models. Speech Communication.
Corpora for the evaluation of speaker recognition systems. ICASSP '99: Proceedings of the 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing, Volume 02.
Enhancing Speaker Discrimination at the Feature Level. Speaker Classification I.
Evaluations of Automatic Speaker Classification Systems. Speaker Classification I.
Frame Based Features. Speaker Classification I.


Abstract

We present a text-dependent speaker verification system based on hidden Markov models (HMMs). A set of features based on the temporal duration of context-dependent phonemes is used to distinguish among speakers. We tested our approach on the YOHO corpus: the HMM-based system achieved an equal error rate (EER) of 0.68% using conventional (acoustic) features alone, and an EER of 0.32% when the duration features were combined with the acoustic features. This compares well with state-of-the-art results on the same test and shows the value of temporal features for speaker verification. These features may also be useful for other purposes, such as detecting replay attacks or improving the robustness of speaker verification systems to channel or speaker variations. Our results confirm earlier findings on text-independent speaker recognition [1] and text-dependent speaker verification [2] tasks, and we offer a number of suggestions for further improvements.
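
To make the quantities in the abstract concrete, the following is a minimal sketch of how duration features and an EER might be computed. It is not the authors' implementation. It assumes that a forced alignment has already produced (label, duration) pairs for each utterance, that each context-dependent phoneme's duration is modelled by a single Gaussian per speaker, and that the duration score is combined with the acoustic score by a weighted sum. The function names, the label format "s-ih+k", the variance floor, and the fusion weight are all hypothetical.

```python
import math
from collections import defaultdict


def train_duration_model(enrollment, min_count=2, std_floor=1e-3):
    """Estimate a per-label (mean, std) of durations for one speaker.

    enrollment: list of utterances, each a list of (label, duration_seconds)
    pairs, assumed to come from a forced alignment (not shown here).
    """
    by_label = defaultdict(list)
    for utterance in enrollment:
        for label, duration in utterance:
            by_label[label].append(duration)

    model = {}
    for label, durations in by_label.items():
        if len(durations) < min_count:
            continue  # too few examples to estimate a variance
        mean = sum(durations) / len(durations)
        var = sum((d - mean) ** 2 for d in durations) / len(durations)
        # Assumption: floor the std so rare labels do not dominate the score.
        model[label] = (mean, max(math.sqrt(var), std_floor))
    return model


def duration_score(model, utterance):
    """Average Gaussian log-likelihood of the observed durations.

    Labels that were not seen during enrollment are skipped.
    """
    total, count = 0.0, 0
    for label, duration in utterance:
        if label not in model:
            continue
        mean, std = model[label]
        total += (-0.5 * math.log(2 * math.pi * std ** 2)
                  - (duration - mean) ** 2 / (2 * std ** 2))
        count += 1
    return total / count if count else float("-inf")


def fuse(acoustic_score, dur_score, weight=0.5):
    """Hypothetical linear fusion of acoustic and duration scores."""
    return (1.0 - weight) * acoustic_score + weight * dur_score


def equal_error_rate(genuine, impostor):
    """Approximate the EER by sweeping thresholds over the observed scores.

    A trial is accepted when its score is >= the threshold. The result is
    the average of FAR and FRR at the threshold where they are closest.
    """
    best_gap, best_eer = None, None
    for threshold in sorted(set(genuine) | set(impostor)):
        frr = sum(s < threshold for s in genuine) / len(genuine)
        far = sum(s >= threshold for s in impostor) / len(impostor)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer
```

In use, `train_duration_model` would be run once per enrolled speaker, `duration_score` once per verification trial, and `equal_error_rate` over the resulting lists of genuine and impostor scores (fused or not). Comparing the EER with and without fusion mirrors the comparison reported in the abstract, although the numbers there come from the authors' own system on YOHO.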