Group dynamics and multimodal interaction modeling using a smart digital signage
ECCV '12: Proceedings of the 12th European Conference on Computer Vision - Volume Part I
In this paper, we present a novel multimodal system designed for smooth multi-party human-machine interaction. HCI with multiple users is challenging because the system must respond consistently to simultaneous actions and reactions from different users. The proposed system consists of a digital signage panel (a large display) equipped with multiple sensing devices: a 19-channel microphone array, six HD video cameras (three mounted on top of the display and three on the bottom), and two depth sensors. The display can show various content, such as a poster presentation or multiple windows (e.g., web browsers and photos). Multiple users standing in front of the panel can interact freely by voice or gesture while looking at the displayed content, without wearing any dedicated device (such as motion-capture sensors or head-mounted equipment). Acoustic and visual information are processed jointly using state-of-the-art techniques to estimate each user's speech activity and gaze direction, so that the displayed content can be adapted to the users' interests.
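The abstract does not detail how the per-user estimates drive content adaptation. The following Python sketch illustrates one plausible final step, assuming the pipeline already produces per-user speech activity (e.g., from microphone-array source localization) and gaze direction (e.g., from depth-based head-pose estimation): gaze rays are projected onto the display plane, mapped to content regions, and the region attended by the most users, with speakers weighted higher, is selected for adaptation. All names, the region layout, and the speaker weighting are hypothetical illustrations, not taken from the paper.

```python
import math
from dataclasses import dataclass

@dataclass
class UserState:
    # Hypothetical per-user estimates from the audio-visual pipeline.
    user_id: int
    is_speaking: bool      # from microphone-array source localization
    gaze_yaw_deg: float    # gaze yaw relative to the display normal
    position_m: tuple      # (x, z) position in front of the panel, metres

def gaze_hit_x(user: UserState) -> float:
    """Project the user's gaze ray onto the display plane (z = 0).

    Returns the horizontal hit point in metres along the display,
    assuming the user faces the panel (gazes in the -z direction).
    """
    x, z = user.position_m
    # Horizontal offset grows with distance to the panel and yaw angle.
    return x + z * math.tan(math.radians(user.gaze_yaw_deg))

def attended_region(user: UserState, region_edges: list) -> int:
    """Map the gaze hit point to one of the content regions on the panel."""
    hit = gaze_hit_x(user)
    for i, (left, right) in enumerate(region_edges):
        if left <= hit < right:
            return i
    return -1  # gaze falls outside the display

def select_content_to_adapt(users: list, region_edges: list) -> int:
    """Pick the region attended by the most users, weighting speakers higher.

    The 2:1 speaker weighting is an arbitrary choice for illustration.
    """
    votes = {}
    for u in users:
        r = attended_region(u, region_edges)
        if r >= 0:
            votes[r] = votes.get(r, 0) + (2 if u.is_speaking else 1)
    return max(votes, key=votes.get) if votes else -1

if __name__ == "__main__":
    # Two users 1.5 m from a panel split into three 1 m-wide regions.
    regions = [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0)]
    users = [
        UserState(0, True, 15.0, (0.8, 1.5)),    # speaking, looking right
        UserState(1, False, -5.0, (1.4, 1.5)),   # silent, looking slightly left
    ]
    print("Adapt region:", select_content_to_adapt(users, regions))
```

In this toy configuration both gaze rays land in the middle region, which is therefore selected; a real system would of course smooth these estimates over time rather than react to instantaneous gaze.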