Assessing the quality of a speaker localization or tracking algorithm on a few short examples is difficult, especially when the ground truth is absent or not well defined. One step towards systematic performance evaluation of such algorithms is to provide time-continuous speaker location annotation over a series of real recordings covering various test cases. Areas of interest include audio, video and audio-visual speaker localization and tracking. The desired location annotation can be either 2-dimensional (image plane) or 3-dimensional (physical space). This paper motivates and describes a corpus of audio-visual data called “AV16.3”, along with a method for 3-D location annotation based on calibrated cameras. “16.3” stands for 16 microphones and 3 cameras, recorded in a fully synchronized manner in a meeting room. Part of this corpus has already been used successfully to report research results.