Cost-Effective Solution to Synchronized Audio-Visual Capture Using Multiple Sensors

Authors:
Jeroen Lichtenauer; Michel Valstar; Jie Shen; Maja Pantic
Venue:
AVSS '09 Proceedings of the 2009 Sixth IEEE International Conference on Advanced Video and Signal Based Surveillance
Year:
2009

Citing 7

ViRoom - Low Cost Synchronized Multicamera System and Its Self-calibration
Proceedings of the 24th DAGM Symposium on Pattern Recognition

Universal synchronization scheme for distributed audio-video capture on heterogeneous computing platforms
MULTIMEDIA '03 Proceedings of the eleventh ACM international conference on Multimedia

High-quality video view interpolation using a layered representation
ACM SIGGRAPH 2004 Papers

High performance imaging using large camera arrays
ACM SIGGRAPH 2005 Papers

AHA: An Easily Extendible High-Resolution Camera Array
DMAMH '07 Proceedings of the Second Workshop on Digital Media and its Application in Museum & Heritage

A flexible client-driven 3DTV system for real-time acquisition, transmission, and display of dynamic scenes
EURASIP Journal on Applied Signal Processing - 3DTV: Capture, Transmission, and Display of 3D Video

Audiovisual Synchronization and Fusion Using Canonical Correlation Analysis
IEEE Transactions on Multimedia

Cited 3

A multimodal database for mimicry analysis
ACII'11 Proceedings of the 4th international conference on Affective computing and intelligent interaction - Volume Part I

AVEC 2011: the first international audio/visual emotion challenge
ACII'11 Proceedings of the 4th international conference on Affective computing and intelligent interaction - Volume Part II

AVEC 2012: the continuous audio/visual emotion challenge
Proceedings of the 14th ACM international conference on Multimodal interaction

Abstract

Applications such as surveillance and human motion capture require high-bandwidth recording from multiple cameras. Furthermore, the recent increase in research on sensor fusion has raised the demand for accurate synchronization between video, audio and other sensor modalities. Previously, capturing synchronized, high-resolution video from multiple cameras required complex, inflexible and expensive solutions. Our experiments show that a single PC, built from contemporary low-cost computer hardware, can currently handle up to 470 MB/s of input data. This allows capturing from 18 cameras of 780x580 pixels at 60 fps each, or 36 cameras at 30 fps. Furthermore, we achieve accurate synchronization between audio, video and additional sensors by recording audio together with sensor trigger or timestamp signals, using a multi-channel audio input. In this way, each sensor modality can be captured with separate software and hardware, allowing maximal flexibility with minimal cost.
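
The figures in the abstract can be checked with a short calculation, and the idea of recording sensor triggers on a spare audio channel can be illustrated with a toy edge detector. The sketch below is only an illustration, not the authors' implementation. It assumes one byte per pixel (raw 8-bit sensor data) and reads "MB" as 2^20 bytes; neither is stated in the abstract. The function names, the threshold value and the clean pulse-shaped trigger signal are likewise hypothetical.

```python
def data_rate_mib_per_s(cameras, width, height, fps, bytes_per_pixel=1):
    """Aggregate raw video data rate in MiB/s (1 MiB = 2**20 bytes).

    bytes_per_pixel=1 is an assumption (raw 8-bit pixels), not a figure
    taken from the abstract.
    """
    bytes_per_second = cameras * width * height * bytes_per_pixel * fps
    return bytes_per_second / 2**20


def trigger_times(trigger_channel, sample_rate, threshold=0.5):
    """Return the time in seconds of each rising edge in a trigger channel.

    trigger_channel is a sequence of audio samples recorded from the input
    that carries the sensor trigger signal. A rising edge is a sample at or
    above `threshold` whose predecessor is below it. Because the trigger is
    sampled by the same audio device as the microphone channels, each edge
    time is expressed directly on the audio timeline.
    """
    times = []
    for i in range(1, len(trigger_channel)):
        if trigger_channel[i - 1] < threshold <= trigger_channel[i]:
            times.append(i / sample_rate)
    return times


if __name__ == "__main__":
    # Both camera configurations from the abstract give the same rate.
    print(round(data_rate_mib_per_s(18, 780, 580, 60), 1))
    print(round(data_rate_mib_per_s(36, 780, 580, 30), 1))

    # Toy trigger channel sampled at 4 Hz, with pulses starting at samples 2 and 6.
    print(trigger_times([0, 0, 1, 1, 0, 0, 1, 0], sample_rate=4))
```

Under these assumptions both configurations come out just below the 470 MB/s limit reported in the abstract. With real recordings, the edge times would be matched to frame indices so that every video frame receives a timestamp on the audio clock.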