The DIRAC AWEAR audio-visual platform for detection of unexpected and incongruent events

Authors:
Jörn Anemüller;Jörg-Hendrik Bach;Barbara Caputo;Michal Havlena;Luo Jie;Hendrik Kayser;Bastian Leibe;Petr Motlicek;Tomas Pajdla;Misha Pavel;Akihiko Torii;Luc Van Gool;Alon Zweig;Hynek Hermansky
Affiliations:
University of Oldenburg, Oldenburg, Germany;University of Oldenburg, Oldenburg, Germany;IDIAP Research Institute, Martigny, Switzerland;Czech Technical University in Prague, Prague, Czech Republic;IDIAP Research Institute, Martigny, Switzerland;University of Oldenburg, Oldenburg, Germany;ETH Zurich, Zurich, Switzerland;IDIAP Research Institute, Martigny, Switzerland;Czech Technical University in Prague, Prague, Czech Republic;Oregon Health & Science University, Portland, OR, USA;Czech Technical University in Prague, Prague, Czech Republic;KU Leuven, Leuven, Belgium;Hebrew University of Jerusalem, Jerusalem, Israel;IDIAP Research Institute, Martigny, Switzerland
Venue:
ICMI '08 Proceedings of the 10th international conference on Multimodal interfaces
Year:
2008

Abstract

Coping with and responding appropriately to events that prior experience does not foresee is of prime importance in everyday human life. Machines largely lack this ability. An important class of unexpected events is defined by incongruent combinations of inputs from different modalities, so multimodal information provides a crucial cue for identifying them; for example, a voice is heard while the person in the field of view does not move her lips. In the DIRAC project ("Detection and Identification of Rare Audio-visual Cues") we have been developing algorithmic approaches to the detection of such events, as well as an experimental hardware platform to test them. An audio-visual platform ("AWEAR", audio-visual wearable device) has been constructed with the goal of helping users with disabilities or a high cognitive load to deal with unexpected events. Key hardware components include stereo panoramic vision sensors and 6-channel worn-behind-the-ear (hearing aid) microphone arrays. Data have been recorded to study audio-visual tracking, audio-visual scene and object classification, and audio-visual detection of incongruencies.