Is masking a relevant aspect lacking in MFCC? A speaker verification perspective

Authors:
Jugurta Montalvão;Marcos Renato Rodrigues Araujo
Affiliations:
Universidade Federal de Sergipe, 49100-000 São Cristóvão, Brazil;Griaule Biometrics, Campinas, São Paulo, Brazil
Venue:
Pattern Recognition Letters
Year:
2012




Abstract

We hypothesize that spectral masking may account for most of the gains in noise robustness obtained with the ensemble interval histogram (EIH) and zero crossing with peak amplitude (ZCPA) methods compared to Mel-frequency cepstral coefficients (MFCCs). To test this hypothesis, we compare two MFCC implementations whose only difference is spectral masking. The comparison involves biometric speaker verification tasks using two publicly available databases. The results confirm the superiority of MFCC with masking, thus corroborating our hypothesis that masking is a key aspect of improved robustness in feature extraction.