Where is "it"? Event Synchronization in Gaze-Speech Input Systems

  • Authors:
  • Manpreet Kaur; Marilyn Tremaine; Ning Huang; Joseph Wilder; Zoran Gacovski; Frans Flippo; Chandra Sekhar Mantravadi

  • Affiliations:
  • Rutgers University, Piscataway, NJ (Kaur, Huang, Wilder, Gacovski, Flippo, Mantravadi); New Jersey Institute of Technology, Newark, NJ (Tremaine)

  • Venue:
  • Proceedings of the 5th international conference on Multimodal interfaces
  • Year:
  • 2003

Abstract

The relationship between gaze and speech is explored for the simple task of moving an object from one location to another on a computer screen. The subject moves a designated object from a group of objects to a new location on the screen by stating, "Move it there". Gaze and speech data are captured to determine if we can robustly predict the selected object and destination position. We have found that the source fixation closest to the desired object begins, with high probability, before the beginning of the word "Move". An analysis of all fixations before and after speech onset time shows that the fixation that best identifies the object to be moved occurs, on average, 630 milliseconds before speech onset with a range of 150 to 1200 milliseconds for individual subjects. The variance in these times for individuals is relatively small although the variance across subjects is large. Selecting a fixation closest to the onset of the word "Move" as the designator of the object to be moved gives a system accuracy close to 95% for all subjects. Thus, although significant differences exist between subjects, we believe that the speech and gaze integration patterns can be modeled reliably for individual users and therefore be used to improve the performance of multimodal systems.
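The selection rule reported in the abstract (pick the fixation closest in time to the onset of the word "Move") can be expressed compactly. Below is a minimal Python sketch of that rule, assuming fixations and speech onsets are timestamped on a shared clock; the Fixation class, its field names, and the example timings are illustrative assumptions, not the authors' implementation.

```python
from dataclasses import dataclass

@dataclass
class Fixation:
    x: float          # fixation centroid x on screen (pixels)
    y: float          # fixation centroid y on screen (pixels)
    onset_ms: float   # fixation start time (ms, same clock as speech)

def fixation_nearest_speech_onset(fixations, speech_onset_ms):
    """Return the fixation whose onset is closest in time to speech onset.

    Mirrors the rule described in the abstract: the fixation nearest the
    onset of the word "Move" designates the object to be moved.
    """
    if not fixations:
        return None
    return min(fixations, key=lambda f: abs(f.onset_ms - speech_onset_ms))

# Hypothetical example: three fixations; "Move" onset at t = 2000 ms.
fixations = [
    Fixation(x=120, y=340, onset_ms=900),
    Fixation(x=415, y=220, onset_ms=1370),   # ~630 ms before speech onset
    Fixation(x=610, y=480, onset_ms=2800),
]
selected = fixation_nearest_speech_onset(fixations, speech_onset_ms=2000)
print(selected)  # Fixation(x=415, y=220, onset_ms=1370)
```

In this sketch the middle fixation wins because its onset lies about 630 ms before speech onset, consistent with the average lead time the abstract reports; in a real system the chosen fixation would then be mapped to the nearest on-screen object.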