A novel approach based on robust regression with normalized score fusion, termed Normalized Scores following Robust Regression Fusion (NSRRF), is proposed to enhance speaker recognition over IP networks. It can be used in both Network Speaker Recognition (NSR) and Distributed Speaker Recognition (DSR) systems. The framework assumes that speech is encoded by the G.729 coder on the client side and then transmitted to the server side, where the ASR systems are located. Speaker recognition uses a Gaussian Mixture Model-Universal Background Model (GMM-UBM) system and a Gaussian supervector SVM (GMM-SVM) system, both with normalized scores. The feature vectors consist of Mel Frequency Cepstral Coefficients (MFCC) and Linear Prediction Cepstral Coefficients (LPCC), both derived from the Line Spectral Pairs (LSP) extracted from the G.729 bit-stream over IP. Experiments conducted with the LIA SpkDet system, based on the ALIZE platform, on the ARADIGITS database show first that the proposed method, which uses features extracted directly from the G.729 bit-stream, significantly reduces the error rate and outperforms the baseline ASR-over-IP system, which relies on the resynthesized (reconstructed) speech obtained from the G.729 decoder. In addition, the results show that the proposed approach, based on score normalization following robust regression fusion, achieves the best result and outperforms conventional ASR over IP networks.
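To illustrate how LPCC features can be obtained from LSP parameters without resynthesizing the speech, the following is a minimal sketch of one standard route: LSP to LPC polynomial, then LPC to cepstrum by the usual recursion. It is not taken from the paper. The function names `lsf_to_lpc` and `lpc_to_cepstrum` are hypothetical, and the sketch assumes the LSPs have already been dequantized from the bit-stream and are expressed as line spectral frequencies in radians, sorted in increasing order, with an even prediction order (G.729 uses order 10).

```python
import numpy as np


def lsf_to_lpc(lsf):
    """Convert sorted line spectral frequencies (radians) to LPC coefficients.

    Returns a = [1, a_1, ..., a_p] for A(z) = 1 + sum_k a_k z^-k.
    Assumes an even order p (hypothetical helper, not from the paper).
    """
    lsf = np.asarray(lsf, dtype=float)
    p = lsf.size
    if p % 2 != 0:
        raise ValueError("this sketch only handles an even LPC order")
    # Symmetric polynomial P(z) has a fixed root at z = -1,
    # antisymmetric polynomial Q(z) has a fixed root at z = +1.
    P = np.array([1.0, 1.0])
    Q = np.array([1.0, -1.0])
    for i, w in enumerate(lsf):
        factor = np.array([1.0, -2.0 * np.cos(w), 1.0])
        if i % 2 == 0:          # 1st, 3rd, ... frequencies belong to P
            P = np.convolve(P, factor)
        else:                   # 2nd, 4th, ... frequencies belong to Q
            Q = np.convolve(Q, factor)
    a = 0.5 * (P + Q)           # A(z) = (P(z) + Q(z)) / 2
    return a[: p + 1]           # the z^-(p+1) term cancels to zero


def lpc_to_cepstrum(a, n_ceps):
    """Cepstrum c_1..c_n_ceps of the all-pole filter 1/A(z)."""
    a = np.asarray(a, dtype=float)
    p = a.size - 1
    c = np.zeros(n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = 0.0
        for k in range(max(1, n - p), n):
            acc += k * c[k] * a[n - k]
        a_n = a[n] if n <= p else 0.0
        c[n] = -a_n - acc / n
    return c[1:]


# Toy second-order example: A(z) = 1 - 0.9 z^-1 + 0.2 z^-2
lsf = np.arccos([0.85, 0.05])
a = lsf_to_lpc(lsf)
lpcc = lpc_to_cepstrum(a, n_ceps=4)
```

In a real front end the same two calls would run once per frame on the decoded LSP vector, and the number of cepstral coefficients kept is a design choice not specified here.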