In this letter we propose a piece-wise linear (PL) classifier for use as the decision stage in a two-modal verification system comprising a face expert and a speech expert. The classifier uses a fixed decision boundary specifically designed to account for the effects of noisy audio conditions. Experimental results on the VidTIMIT database show that in clean conditions the proposed classifier is outperformed by a traditional weighted summation decision stage, whether its weights are fixed or adaptive. When the audio data is corrupted with white Gaussian noise, the PL classifier performs better than the fixed-weight approach and similarly to the adaptive-weight approach. With a more realistic noise type, namely "operations room" noise from the NOISEX-92 corpus, the PL classifier performs better than both the fixed and adaptive approaches. The better results in this case stem from the PL classifier making no direct assumption about the type of noise that causes the mismatch between training and testing conditions, unlike the adaptive approach. Moreover, the PL classifier has the advantage of a fixed (non-adaptive, and thus simpler) structure.
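To make the contrast between the two decision stages concrete, the following is a minimal sketch only, not the letter's actual classifier. It assumes each expert emits a scalar opinion where larger values favour the claimed identity, and it represents the PL boundary as a polyline in the (speech opinion, face opinion) plane. The vertex values, weight, threshold, and all function names (`weighted_sum_accept`, `pl_boundary`, `pl_accept`) are hypothetical and chosen purely for illustration.

```python
from bisect import bisect_right

# Hypothetical boundary vertices as (speech_opinion, face_opinion) pairs,
# sorted by speech opinion. Illustrative values only, not from the letter.
PL_VERTICES = [(-2.0, 1.2), (-0.2, 1.2), (1.0, 0.0)]


def weighted_sum_accept(face, speech, w_speech=0.5, threshold=0.5):
    """Fixed weighted summation: fuse the two opinions, then threshold."""
    fused = (1.0 - w_speech) * face + w_speech * speech
    return fused >= threshold


def pl_boundary(speech, vertices=PL_VERTICES):
    """Face opinion on the piece-wise linear boundary at a given speech opinion.

    Outside the vertex range, the nearest end segment is extended.
    """
    xs = [x for x, _ in vertices]
    # Pick the segment containing `speech`, clamped to the first/last segment.
    i = min(max(bisect_right(xs, speech) - 1, 0), len(vertices) - 2)
    (x0, y0), (x1, y1) = vertices[i], vertices[i + 1]
    slope = (y1 - y0) / (x1 - x0)
    return y0 + slope * (speech - x0)


def pl_accept(face, speech, vertices=PL_VERTICES):
    """Accept the claim when the face opinion lies on or above the boundary."""
    return face >= pl_boundary(speech, vertices)


if __name__ == "__main__":
    # (face opinion, speech opinion); made-up points for illustration.
    cases = {
        "claimant, clean audio": (1.5, 1.0),
        "claimant, noisy audio": (1.5, -1.0),
        "impostor": (-0.5, -1.0),
    }
    for name, (face, speech) in cases.items():
        print(name, weighted_sum_accept(face, speech), pl_accept(face, speech))
```

With these made-up numbers, the flat left-hand segment of the boundary keeps accepting a claimant whose speech opinion has been degraded, while the fixed weighted sum rejects that same point. This outcome follows from the chosen vertices and does not reproduce the letter's experimental results.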