Audio-visual speaker verification using continuous fused HMMs
VisHCI '06 Proceedings of the HCSNet workshop on Use of vision in human-computer interaction - Volume 56
This paper presents a novel fused hidden Markov model (fused HMM) for integrating tightly coupled time series, such as the audio and visual features of speech. In this model, the two time series are first modeled separately by two conventional HMMs. The resulting HMMs are then fused together using a probabilistic fusion model, which is optimal according to the maximum entropy principle and a maximum mutual information criterion. Simulations and bimodal speaker verification experiments show that the proposed model significantly reduces recognition errors in both noiseless and noisy environments.
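The abstract's fusion idea can be illustrated with a rough sketch, not the paper's actual implementation: assuming discrete observations, known HMM parameters, and a given coupling distribution `C[s, v]` approximating P(video symbol | audio state), one simplified fused score combines the audio HMM's likelihood with the video observations evaluated along the audio HMM's best state path. All parameter values and function names below are illustrative assumptions.

```python
import numpy as np

def forward_loglik(pi, A, B, obs):
    # Scaled forward algorithm: log-likelihood of obs under a discrete HMM.
    alpha = pi * B[:, obs[0]]
    ll = np.log(alpha.sum())
    alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        ll += np.log(alpha.sum())
        alpha /= alpha.sum()
    return ll

def viterbi(pi, A, B, obs):
    # Most likely state path (log domain) for a discrete HMM.
    logA, logB = np.log(A), np.log(B)
    delta = np.log(pi) + logB[:, obs[0]]
    back = []
    for o in obs[1:]:
        trans = delta[:, None] + logA
        back.append(trans.argmax(axis=0))
        delta = trans.max(axis=0) + logB[:, o]
    path = [int(delta.argmax())]
    for bp in reversed(back):
        path.append(int(bp[path[-1]]))
    return path[::-1]

def fused_score(pi, A, B, C, audio_obs, video_obs):
    # Simplified fused-HMM-style score: audio HMM likelihood plus the
    # video stream scored against the coupling distribution C[s, v]
    # along the audio HMM's Viterbi path (an assumed simplification
    # of the paper's probabilistic fusion model).
    ll = forward_loglik(pi, A, B, audio_obs)
    path = viterbi(pi, A, B, audio_obs)
    return ll + sum(np.log(C[s, v]) for s, v in zip(path, video_obs))

# Toy 2-state example (all numbers are made up for illustration).
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])        # state transitions
B = np.array([[0.9, 0.1], [0.2, 0.8]])        # audio emissions
C = np.array([[0.8, 0.2], [0.3, 0.7]])        # P(video symbol | state)
audio = [0, 0, 1]
matched = fused_score(pi, A, B, C, audio, [0, 0, 1])
mismatched = fused_score(pi, A, B, C, audio, [1, 1, 0])
```

With the coupling term included, video observations consistent with the audio-inferred states score higher than inconsistent ones, which is the intuition behind fusing the two single-stream HMMs rather than scoring them independently.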