Attentional Object Spotting by Integrating Multimodal Input

  • Authors:
  • Chen Yu;Dana H. Ballard;Shenghuo Zhu

  • Affiliations:
  • University of Rochester;University of Rochester;University of Rochester

  • Venue:
  • ICMI '02 Proceedings of the 4th IEEE International Conference on Multimodal Interfaces
  • Year:
  • 2002


Abstract

An intelligent human-computer interface is expected to allow computers to work with users in a cooperative manner. To achieve this goal, computers need to be aware of user attention and provide assistance without explicit user requests. Cognitive studies of eye movements suggest that in accomplishing well-learned tasks, the performer's focus of attention is locked to the ongoing work, and more than 90% of eye movements are closely related to the objects being manipulated in the task. In light of this, we have developed an attentional object spotting system that integrates multimodal data consisting of eye positions, head positions, and video from the "first-person" perspective. To detect the user's focus of attention, we modeled eye gaze and head movements using a hidden Markov model (HMM) representation. For each attentional point in time, the object of user interest is automatically extracted and recognized. We report the results of experiments on finding attentional objects in the natural task of "making a peanut-butter sandwich".
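The core idea of the abstract, using an HMM over gaze dynamics to segment attentional points from transitions, can be illustrated with a minimal sketch. The example below is not the authors' implementation: it assumes a simplified two-state HMM (fixation vs. saccade) over gaze-speed observations only, with illustrative transition and Gaussian emission parameters, and decodes the state sequence with the Viterbi algorithm. The paper itself integrates eye positions, head positions, and first-person video.

```python
import numpy as np

# A minimal sketch (assumed parameters, not from the paper): a two-state HMM
# over gaze-speed observations. "Fixation" states mark attentional points;
# "saccade" states mark transitions between attended objects.

STATES = ["fixation", "saccade"]

# Transition log-probabilities: both states tend to persist (assumed values).
log_trans = np.log(np.array([[0.95, 0.05],
                             [0.20, 0.80]]))
log_start = np.log(np.array([0.5, 0.5]))

# Gaussian emission models over gaze speed (deg/s): slow during fixations,
# fast during saccades. Means and standard deviations are illustrative.
means = np.array([5.0, 150.0])
stds = np.array([5.0, 80.0])


def log_emission(speed):
    """Log-likelihood of an observed gaze speed under each state's Gaussian."""
    return -0.5 * ((speed - means) / stds) ** 2 - np.log(stds * np.sqrt(2 * np.pi))


def viterbi(speeds):
    """Decode the most likely fixation/saccade sequence for a gaze-speed signal."""
    n = len(speeds)
    delta = np.zeros((n, 2))            # best log-probability ending in each state
    back = np.zeros((n, 2), dtype=int)  # backpointers
    delta[0] = log_start + log_emission(speeds[0])
    for t in range(1, n):
        scores = delta[t - 1][:, None] + log_trans     # (from_state, to_state)
        back[t] = np.argmax(scores, axis=0)
        delta[t] = scores[back[t], np.arange(2)] + log_emission(speeds[t])
    path = [int(np.argmax(delta[-1]))]
    for t in range(n - 1, 0, -1):
        path.append(back[t, path[-1]])
    return [STATES[s] for s in reversed(path)]


# Example: a fixation, a rapid saccade, then another fixation.
speeds = np.array([4.0, 6.0, 3.0, 120.0, 200.0, 5.0, 4.5])
print(viterbi(speeds))
# ['fixation', 'fixation', 'fixation', 'saccade', 'saccade', 'fixation', 'fixation']
```

In a pipeline like the one described, each decoded fixation segment would index a video frame from which the attended object is cropped and recognized.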