Audio-to-video synchronization (AV-sync) may drift and is difficult to recover without time-consuming manual effort. Based on an analysis of audiovisual correlations, we developed a method for recovering drifted AV-sync in a video clip with only minor human interaction: the user only needs to specify a time window containing a stationary speaker. Within this time window, we search for the optimum drift, that is, the drift that maximizes the average audiovisual correlation inside the speaker region, by shifting the audio and computing the correlation for each drift hypothesis. AV-sync is then recovered from the refined optimum drift. The audiovisual correlation is analyzed using Quadratic Mutual Information with Kernel Density Estimation, which is robust against changes in audiovisual scale and independent of the spoken language. Experimental results demonstrate that the method can effectively recover audio-to-video synchronization. A preliminary version of this work was reported at the 2008 IAPR Conference on Pattern Recognition (Liu and Sato, 2008) and won the Best Industry Related Paper Award (BIRPA).
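The drift search described above can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the audio and the speaker region are each reduced to one feature value per video frame (for example, audio energy and mean motion magnitude), drift hypotheses are whole frames, and plain Pearson correlation stands in for Quadratic Mutual Information with Kernel Density Estimation. The names `pearson`, `find_drift`, and `max_drift` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    Stand-in score only (assumption): the paper uses Quadratic Mutual
    Information with Kernel Density Estimation instead.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant signal carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def find_drift(audio, visual, max_drift, score=pearson):
    """Return (best_drift, best_score) over integer drifts in [-max_drift, max_drift].

    A drift hypothesis d pairs visual[t] with audio[t + d]; only the
    overlapping part of the two sequences is scored.
    """
    n = len(visual)
    best_drift, best_score = 0, float("-inf")
    for d in range(-max_drift, max_drift + 1):
        lo = max(0, -d)               # first visual frame with a partner
        hi = min(n, len(audio) - d)   # one past the last such frame
        if hi - lo < 2:
            continue                  # too little overlap to score
        s = score(audio[lo + d:hi + d], visual[lo:hi])
        if s > best_score:
            best_drift, best_score = d, s
    return best_drift, best_score
```

Calling `find_drift(audio, visual, max_drift=5)` on the features extracted from the user-specified time window returns the frame offset with the highest score, which would then be applied to the audio track. Because `score` is a parameter, a mutual-information estimator could be substituted for the stand-in without changing the search loop.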