Attentional Object Spotting by Integrating Multimodal Input

  • Authors:
  • Chen Yu;Dana H. Ballard;Shenghuo Zhu

  • Affiliations:
  • University of Rochester;University of Rochester;University of Rochester

  • Venue:
  • ICMI '02 Proceedings of the 4th IEEE International Conference on Multimodal Interfaces
  • Year:
  • 2002


Abstract

An intelligent human-computer interface is expected to allow computers to work with users in a cooperative manner. To achieve this goal, computers need to be aware of user attention and provide assistance without explicit user requests. Cognitive studies of eye movements suggest that in accomplishing well-learned tasks, the performer's focus of attention is locked to the ongoing work, and more than 90% of eye movements are closely related to the objects being manipulated in the task. In light of this, we have developed an attentional object spotting system that integrates multimodal data consisting of eye positions, head positions, and video from the "first-person" perspective. To detect the user's focus of attention, we modeled eye gaze and head movements using a hidden Markov model (HMM) representation. For each attentional point in time, the object of user interest is automatically extracted and recognized. We report the results of experiments on finding attentional objects in the natural task of "making a peanut-butter sandwich".
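The core idea of the abstract, using an HMM over gaze dynamics to segment attentional points from transitions, can be illustrated with a minimal sketch. The example below is not the authors' implementation: it assumes a simplified two-state HMM (fixation vs. saccade) over gaze-speed observations only, with illustrative transition and Gaussian emission parameters, and decodes the state sequence with the Viterbi algorithm. The paper itself integrates eye positions, head positions, and first-person video.

```python
import numpy as np

# A minimal sketch (assumed parameters, not from the paper): a two-state HMM
# over gaze-speed observations. "Fixation" states mark attentional points;
# "saccade" states mark transitions between attended objects.

STATES = ["fixation", "saccade"]

# Transition log-probabilities: both states tend to persist (assumed values).
log_trans = np.log(np.array([[0.95, 0.05],
                             [0.20, 0.80]]))
log_start = np.log(np.array([0.5, 0.5]))

# Gaussian emission models over gaze speed (deg/s): slow during fixations,
# fast during saccades. Means and standard deviations are illustrative.
means = np.array([5.0, 150.0])
stds = np.array([5.0, 80.0])


def log_emission(speed):
    """Log-likelihood of an observed gaze speed under each state's Gaussian."""
    return -0.5 * ((speed - means) / stds) ** 2 - np.log(stds * np.sqrt(2 * np.pi))


def viterbi(speeds):
    """Decode the most likely fixation/saccade sequence for a gaze-speed signal."""
    n = len(speeds)
    delta = np.zeros((n, 2))            # best log-probability ending in each state
    back = np.zeros((n, 2), dtype=int)  # backpointers
    delta[0] = log_start + log_emission(speeds[0])
    for t in range(1, n):
        scores = delta[t - 1][:, None] + log_trans     # (from_state, to_state)
        back[t] = np.argmax(scores, axis=0)
        delta[t] = scores[back[t], np.arange(2)] + log_emission(speeds[t])
    path = [int(np.argmax(delta[-1]))]
    for t in range(n - 1, 0, -1):
        path.append(back[t, path[-1]])
    return [STATES[s] for s in reversed(path)]


# Example: a fixation, a rapid saccade, then another fixation.
speeds = np.array([4.0, 6.0, 3.0, 120.0, 200.0, 5.0, 4.5])
print(viterbi(speeds))
# ['fixation', 'fixation', 'fixation', 'saccade', 'saccade', 'fixation', 'fixation']
```

In a pipeline like the one described, each decoded fixation segment would index a video frame from which the attended object is cropped and recognized.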