Head pose and gesture offer several conversational grounding cues and are used extensively in face-to-face interaction among people. To recognize visual feedback accurately, humans often use contextual knowledge from previous and current events to anticipate when feedback is most likely to occur. In this paper we describe how contextual information from other participants can be used to predict visual feedback and improve recognition of head gestures in multiparty interactions (e.g., meetings). An important contribution of this paper is our data-driven representation, called co-occurrence graphs, which models the co-occurrence between contextual cues, such as spoken words and pauses, and visual head gestures. By analyzing these co-occurrence patterns we can automatically select relevant contextual features and predict when visual gestures are most likely. Using a discriminative approach to multimodal integration, our co-occurrence graph representation improves head gesture recognition performance on a publicly available dataset of multiparty interactions.
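The co-occurrence idea described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: it assumes contextual cues (e.g., words or pauses) and head gestures arrive as timestamped event lists, and it counts how often each cue precedes a gesture within a fixed time window. The function names, the window size, and the threshold-based feature selection are all hypothetical choices for illustration.

```python
from collections import defaultdict

def build_cooccurrence_graph(context_events, gesture_events, window=1.0):
    """Count how often each contextual cue (e.g., a spoken word or a pause)
    occurs within `window` seconds before a visual head gesture.

    context_events: list of (timestamp, cue) pairs
    gesture_events: list of (timestamp, gesture_label) pairs
    Returns edge weights: (cue, gesture_label) -> co-occurrence count.
    """
    graph = defaultdict(int)
    for g_time, gesture in gesture_events:
        for c_time, cue in context_events:
            if 0.0 <= g_time - c_time <= window:
                graph[(cue, gesture)] += 1
    return graph

def select_contextual_features(graph, min_count=2):
    """Keep only cues whose total co-occurrence with gestures meets a
    threshold -- a simple stand-in for data-driven feature selection."""
    totals = defaultdict(int)
    for (cue, _gesture), count in graph.items():
        totals[cue] += count
    return {cue for cue, total in totals.items() if total >= min_count}

# Toy usage: "yeah" precedes both nods, so it survives selection.
context = [(0.5, "yeah"), (1.0, "<pause>"), (3.0, "so"), (5.2, "yeah")]
gestures = [(1.2, "nod"), (5.8, "nod")]
graph = build_cooccurrence_graph(context, gestures, window=1.0)
features = select_contextual_features(graph, min_count=2)  # {"yeah"}
```

In the paper's setting, the selected cues would then serve as contextual inputs to a discriminative multimodal classifier; here the thresholded counts merely demonstrate the selection step.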