A Multi-modal Attention System for Smart Environments
ICVS '09 Proceedings of the 7th International Conference on Computer Vision Systems: Computer Vision Systems
This paper considers the problem of multi-modal saliency and attention. Saliency is a cue that is often used to direct the attention of a computer vision system, e.g., in smart environments or on robots. Unlike the majority of recent publications on visual/audio saliency, we aim at a well-grounded integration of several modalities. The proposed framework is based on fuzzy aggregations and offers a flexible, plausible, and efficient way of combining multi-modal saliency information. Besides incorporating different modalities, we extend classical 2D saliency maps to multi-camera, multi-modal 3D saliency spaces. For experimental validation, we implemented the proposed system within a smart environment. The evaluation was conducted in a demanding setup under real-life conditions, including focus-of-attention selection for multiple subjects with concurrently active modalities.
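The fuzzy aggregation of per-modality saliency described above can be sketched in Python. The paper does not publish its exact operator, so the snippet below uses a standard compensative aggregation (a convex combination of fuzzy AND = min and fuzzy OR = max) over normalized saliency values on a shared 3D grid; the modality names, grid shape, and the `gamma` parameter are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def fuzzy_aggregate(saliency_maps, gamma=0.5):
    """Combine normalized saliency maps with a compensative fuzzy operator.

    gamma=0.0 acts as a fuzzy AND (min: all modalities must agree),
    gamma=1.0 acts as a fuzzy OR (max: any single modality suffices),
    intermediate values trade off between the two.
    """
    # Clip each map into [0, 1] so min/max behave as fuzzy set operations.
    stack = np.stack([np.clip(m, 0.0, 1.0) for m in saliency_maps])
    return (1.0 - gamma) * stack.min(axis=0) + gamma * stack.max(axis=0)

# Two hypothetical modality saliency volumes over a coarse 3D grid of
# candidate attention locations (e.g., visual and auditory cues).
rng = np.random.default_rng(0)
visual = rng.random((4, 4, 4))
audio = rng.random((4, 4, 4))

combined = fuzzy_aggregate([visual, audio], gamma=0.7)

# The focus of attention is the most salient cell in the combined volume.
focus = np.unravel_index(combined.argmax(), combined.shape)
```

With `gamma` above 0.5 the aggregation leans toward an OR, so a strong cue in a single modality (e.g., a loud sound with no visual motion) can still attract attention, which matches the multi-person, concurrently-active-modality setting described in the abstract.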