Speaker verification using excitation source information

Authors:
Debadatta Pati; S. R. Mahadeva Prasanna
Affiliations:
Department of Electronics and Electrical Engineering, Indian Institute of Technology Guwahati, Guwahati, India 781039 (both authors)
Venue:
International Journal of Speech Technology
Year:
2012

Abstract

In this work we develop a speaker recognition system based on excitation source information and demonstrate its significance by comparing it with a system based on vocal tract information. The speaker-specific excitation information is extracted by subsegmental, segmental and suprasegmental processing of the linear prediction (LP) residual. The speaker-specific information from each level is modeled independently using the Gaussian mixture model-universal background model (GMM-UBM) framework, and the resulting evidence is combined at the score level. The significance of the proposed system is demonstrated by speaker verification experiments on the NIST-03 database. Two tests are conducted: a Clean test and a Noisy test. In the Clean test, the test speech signal is used as is for verification. In the Noisy test, the test speech is corrupted by factory noise (9 dB) before verification. In the Clean test, the proposed source-based system performs worse than the vocal tract based system, but it performs better in the Noisy test. Finally, in both the clean and noisy cases, the proposed system provides different and robust speaker-specific evidence that helps the vocal tract system further improve the overall performance.
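
The three signal-level steps named in the abstract (computing an LP residual, corrupting test speech with noise, and combining scores) can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not specify any of the details assumed here: the LP order of 10, the autocorrelation method with the Levinson-Durbin recursion, the reading of "9 dB" as a signal-to-noise ratio, and a weighted sum as the score-level combination rule are all assumptions. The function names are hypothetical.

```python
import math


def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one frame (unnormalized)."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]


def lp_coefficients(frame, order):
    """Levinson-Durbin recursion. Returns a[1..order] for the predictor
    s_hat[n] = sum_k a[k] * s[n - k]."""
    r = autocorrelation(frame, order)
    a = [0.0] * (order + 1)
    err = r[0]  # prediction error energy of the order-0 predictor
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # silent or perfectly predictable frame
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]


def lp_residual(frame, order=10):
    """LP residual e[n] = s[n] - s_hat[n]; samples before the frame
    start are treated as zero. The order of 10 is an assumed default."""
    a = lp_coefficients(frame, order)
    residual = []
    for n in range(len(frame)):
        pred = sum(a[k - 1] * frame[n - k]
                   for k in range(1, order + 1) if n - k >= 0)
        residual.append(frame[n] - pred)
    return residual


def mix_at_snr(speech, noise, snr_db):
    """Scale the noise so that speech power / noise power equals snr_db,
    then add it to the speech. Assumes "9 dB" in the text means SNR."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    p_speech = sum(x * x for x in speech) / len(speech)
    p_noise = sum(x * x for x in noise) / len(noise)
    if p_noise == 0.0:
        raise ValueError("noise has zero power")
    gain = math.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return [s + gain * v for s, v in zip(speech, noise)]


def fuse_scores(scores, weights):
    """Weighted-sum score-level fusion (an assumed combination rule)."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have equal length")
    return sum(w * s for w, s in zip(weights, scores))
```

In such a setup, `lp_residual` would be applied frame by frame to the speech signal, `mix_at_snr(test_speech, factory_noise, 9.0)` would produce the Noisy test input, and `fuse_scores` would combine the per-level (or source and vocal tract) scores with weights chosen on held-out data.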