Reliable speaker recognition in real environments is a key challenge for ubiquitous services in human-computer interfaces. In this paper, we present a robust multimodal speaker identification system that optimizes the reliability of its different modalities. We propose an extension of the modified convection function's optimizing factors that accounts for optimum reliability simultaneously in audio, face, and lip information. The proposed reliability measure is applied within a multimodal speaker identification framework, and the particle swarm optimization (PSO) algorithm is employed to optimize the modified convection function's optimizing factors. For the face-based expert, image quality is degraded with JPEG compression in both the enrollment and test sessions. Similarly, image quality for the lip-based expert is degraded to create a mismatch between enrollment and test images. Finally, artificial illumination from the opposite direction is added to the test face and lip images at different intensities. The VidTimit audio data, collected in an office environment, has a high level of signal distortion. We apply local principal component analysis (Local PCA) to both the face and lip modalities to reduce the dimension of the feature vectors. All speaker identification experiments are performed on the VidTimit DB. Experimental results show that the proposed optimum reliability measures improve the identification rate (IR) by 8.67% compared with the best classifier system, i.e., the audio classifier, and, most notably, retain the consistency of the multimodal integration framework.
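To make the role of PSO in the fusion stage more concrete, the following is a minimal sketch, not the authors' method. The abstract does not give the form of the modified convection function, so a plain normalized weighted sum of expert scores stands in for it here. The names `fuse`, `error_rate`, and `pso_minimize`, the toy `trials` scores, and all PSO parameter values are assumptions made for illustration only.

```python
import random

def fuse(weights, scores):
    """Normalized weighted sum of per-expert scores; returns the winning speaker index.
    ASSUMPTION: stands in for the paper's reliability function, which is not given."""
    total = sum(weights) or 1.0
    n_speakers = len(scores[0])
    fused = [sum(w / total * s[k] for w, s in zip(weights, scores))
             for k in range(n_speakers)]
    return max(range(n_speakers), key=lambda k: fused[k])

def error_rate(weights, trials):
    """Fraction of trials in which the fused decision differs from the true label."""
    wrong = sum(1 for scores, label in trials if fuse(weights, scores) != label)
    return wrong / len(trials)

def pso_minimize(objective, dim, n_particles=20, n_iters=60,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic global-best PSO over the box [0, 1]^dim (parameter values are illustrative)."""
    rng = random.Random(seed)
    pos = [[rng.random() for _ in range(dim)] for _ in range(n_particles)]
    pos[0] = [0.5] * dim  # start one particle at equal weights as a baseline
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # Clamp each weight to the search box.
                pos[i][d] = min(1.0, max(0.0, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val

# Hypothetical scores: each trial holds [audio, face, lip] score lists over
# three speakers, followed by the true speaker index.
trials = [
    ([[0.6, 0.3, 0.1], [0.5, 0.4, 0.1], [0.5, 0.3, 0.2]], 0),
    ([[0.5, 0.4, 0.1], [0.2, 0.7, 0.1], [0.3, 0.6, 0.1]], 1),
    ([[0.1, 0.2, 0.7], [0.3, 0.4, 0.3], [0.2, 0.3, 0.5]], 2),
    ([[0.2, 0.5, 0.3], [0.1, 0.3, 0.6], [0.3, 0.3, 0.4]], 2),
]

best_w, best_err = pso_minimize(lambda w: error_rate(w, trials), dim=3)
print("weights (audio, face, lip):", best_w, "error:", best_err)
```

Because one particle starts at equal weights, the returned error can never be worse than simple score averaging on the same data. In a real system the objective would be evaluated on held-out development scores rather than on a handful of toy trials.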