International Journal of Biometrics
This paper introduces a novel method for extracting robust features for text-independent speaker identification from short utterances. The method is perceptually motivated and inspired by the perceptual linear prediction (PLP) technique. The new feature is called the perceptual log area ratio (PLAR). It is perceptual in that it draws on principles from psychoacoustics, which are the source of its robustness. The log area ratio is itself an effective feature for recognizing speakers because it reflects the geometry and dynamics of the vocal tract, which are highly speaker-dependent. The research therefore aims to provide a reliable vocal biometric that can be used effectively with full-band and telephone-band speech in noisy environments. An extensive performance analysis benchmarks the proposed method against commonly used features on different databases and in different noisy environments. In almost all usable cases, PLAR outperformed commonly used features such as MFCC and LPCC.
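As a rough illustration of the log area ratio itself, the sketch below derives reflection coefficients from an autocorrelation sequence with the Levinson-Durbin recursion and maps them to log area ratios. It is not the paper's PLAR pipeline: the perceptual front end (the PLP-style spectral processing that would produce the autocorrelation values) is omitted, the function names are hypothetical, and the sign convention LAR_i = log((1 - k_i) / (1 + k_i)) is an assumption, since conventions differ between texts.

```python
import math


def reflection_coefficients(r, order):
    """Levinson-Durbin recursion on autocorrelation values r[0..order].

    Returns the reflection (PARCOR) coefficients k_1..k_order.
    Hypothetical helper for illustration only.
    """
    a = [1.0] + [0.0] * order   # prediction polynomial, a[0] is fixed at 1
    err = r[0]                  # prediction error energy
    ks = []
    for i in range(1, order + 1):
        # Correlation between the current residual and the next lag.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        ks.append(k)
        # Update the polynomial coefficients for order i.
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return ks


def log_area_ratios(ks):
    """Map reflection coefficients (|k| < 1) to log area ratios.

    Assumed convention: LAR_i = log((1 - k_i) / (1 + k_i)).
    """
    return [math.log((1.0 - k) / (1.0 + k)) for k in ks]


# Toy autocorrelation of a first-order process (made-up values, not speech data).
r = [1.0, 0.5, 0.25]
ks = reflection_coefficients(r, order=2)
lars = log_area_ratios(ks)
```

In a PLAR-like feature, the autocorrelation values would presumably come from a perceptually warped and compressed spectrum rather than directly from the waveform; that step is left out here because the abstract does not specify it.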