Just-in-time multimodal association and fusion from home entertainment

Authors:
Danil Korchagin;Petr Motlicek;Stefan Duffner;Herve Bourlard
Affiliations:
Idiap Res. Inst., Martigny, Switzerland;Idiap Res. Inst., Martigny, Switzerland;Idiap Res. Inst., Martigny, Switzerland;Idiap Res. Inst., Martigny, Switzerland
Venue:
ICME '11 Proceedings of the 2011 IEEE International Conference on Multimedia and Expo
Year:
2011

Citing 0
Cited 4

Multimodal cue detection engine for orchestrated entertainment
MMM'12 Proceedings of the 18th international conference on Advances in Multimedia Modeling

Automatic orchestration of video streams to enhance group communication
Proceedings of the 2012 international workshop on Socially-aware multimedia

Enabling 'togetherness' in high-quality domestic video
Proceedings of the 20th ACM international conference on Multimedia

Real-time audio-visual analysis for multiperson videoconferencing
Advances in Multimedia

Abstract

In this paper, we describe a real-time multimodal analysis system with just-in-time multimodal association and fusion for a living room environment, where multiple people may enter, interact and leave the observable world with no constraints. It comprises detection and tracking of up to 4 faces, detection and localisation of verbal and paralinguistic events, and their association and fusion. The system is designed for open, unconstrained environments, such as next-generation video conferencing systems that automatically "orchestrate" the transmitted video streams to improve the overall experience of interaction between spatially separated families and friends. Performance levels achieved to date on a hand-labelled dataset show sufficient reliability while fulfilling real-time processing requirements.