Head pose and gesture offer several conversational grounding cues and are used extensively in face-to-face interaction among people. To recognize visual feedback accurately, humans often use contextual knowledge from previous and current events to anticipate when feedback is most likely to occur. In this paper we describe how contextual information from other participants can be used to predict visual feedback and improve recognition of head gestures in multiparty interactions (e.g., meetings). An important contribution of this paper is our data-driven representation, called co-occurrence graphs, which models the co-occurrence between contextual cues, such as spoken words and pauses, and visual head gestures. By analyzing these co-occurrence patterns we can automatically select relevant contextual features and predict when visual gestures are most likely. Using a discriminative approach to multimodal integration, our co-occurrence graph representation improves head gesture recognition performance on a publicly available dataset of multiparty interactions.
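The co-occurrence idea described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: it assumes contextual cues (e.g., words or pauses) and head gestures arrive as timestamped event lists, and it counts how often each cue precedes a gesture within a fixed time window. The function names, the window size, and the threshold-based feature selection are all hypothetical choices for illustration.

```python
from collections import defaultdict

def build_cooccurrence_graph(context_events, gesture_events, window=1.0):
    """Count how often each contextual cue (e.g., a spoken word or a pause)
    occurs within `window` seconds before a visual head gesture.

    context_events: list of (timestamp, cue) pairs
    gesture_events: list of (timestamp, gesture_label) pairs
    Returns edge weights: (cue, gesture_label) -> co-occurrence count.
    """
    graph = defaultdict(int)
    for g_time, gesture in gesture_events:
        for c_time, cue in context_events:
            if 0.0 <= g_time - c_time <= window:
                graph[(cue, gesture)] += 1
    return graph

def select_contextual_features(graph, min_count=2):
    """Keep only cues whose total co-occurrence with gestures meets a
    threshold -- a simple stand-in for data-driven feature selection."""
    totals = defaultdict(int)
    for (cue, _gesture), count in graph.items():
        totals[cue] += count
    return {cue for cue, total in totals.items() if total >= min_count}

# Toy usage: "yeah" precedes both nods, so it survives selection.
context = [(0.5, "yeah"), (1.0, "<pause>"), (3.0, "so"), (5.2, "yeah")]
gestures = [(1.2, "nod"), (5.8, "nod")]
graph = build_cooccurrence_graph(context, gestures, window=1.0)
features = select_contextual_features(graph, min_count=2)  # {"yeah"}
```

In the paper's setting, the selected cues would then serve as contextual inputs to a discriminative multimodal classifier; here the thresholded counts merely demonstrate the selection step.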