“GazeToTalk”: a nonverbal interface with meta-communication facility (Poster Session)

  • Authors:
  • Tetsuro Chino; Kazuhiro Fukui; Kaoru Suzuki

  • Affiliations:
  • Toshiba Corporate Research and Development Center, 1, Komukai Toshiba-Cho, Saiwai-Ku, Kawasaki, 212-8582, Japan (all authors)

  • Venue:
  • ETRA '00: Proceedings of the 2000 Symposium on Eye Tracking Research & Applications
  • Year:
  • 2000


Abstract

We propose a new human interface (HI) system named “GazeToTalk” that combines vision-based gaze detection, acoustic speech recognition (ASR), and an animated human-like CG agent with facial expressions and gestures. The “GazeToTalk” system demonstrates that eye-tracking technologies can be used effectively to improve HI by working together with other non-verbal messages such as facial expressions and gestures.

Conventional voice interface systems have two serious drawbacks: (1) they cannot distinguish intended voice input from other sounds, and (2) they cannot determine who the intended hearer of each utterance is. A “push-to-talk” mechanism can ease these problems, but it spoils the advantages of voice interfaces (e.g., contact-free operation and suitability in hands-busy situations).

In real human dialogues, besides exchanging content messages, people use non-verbal messages such as gaze, facial expressions, and gestures to establish and maintain conversations, or to recover from problems that arise in them.

The “GazeToTalk” system simulates this kind of “meta-communication” facility by using vision-based gaze detection, ASR, and the human-like CG agent. When the user intends to input voice commands, he or she gazes at the agent on the display to request a turn to talk, just as in everyday human-human dialogue. The gaze detection module recognizes this gaze, and the agent shows a particular facial expression and gesture as feedback to establish “eye contact.” The system then accepts or rejects speech input from the user depending on the state of this “eye contact.”

This mechanism allows the “GazeToTalk” system to accept only intended voice input and to ignore other voices and environmental noise, without forcing any extra operation on the user. We also demonstrate an extended mechanism that handles more flexible “eye contact” variations.

Preliminary experiments suggest that, in the context of meta-communication, non-verbal messages can be used to improve HI in terms of naturalness, friendliness, and tactfulness.
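The eye-contact gating described in the abstract can be summarized as a small state machine: speech reaches the recognizer only while eye contact with the agent is established. The Python sketch below illustrates that control flow under stated assumptions; the callables gaze_on_agent, show_agent_feedback, speech_available, and recognize are hypothetical stand-ins for the system's gaze-detection, agent CG, and ASR modules, not the authors' actual implementation.

  import time

  # Two states of the meta-communication protocol described in the abstract.
  IDLE, EYE_CONTACT = "IDLE", "EYE_CONTACT"

  def gaze_to_talk_loop(gaze_on_agent, show_agent_feedback,
                        speech_available, recognize):
      """Accept speech only while the user holds eye contact with the agent."""
      state = IDLE
      while True:
          if state == IDLE:
              if gaze_on_agent():
                  # The user gazes at the agent to request a turn; the agent
                  # responds with a facial expression/gesture as feedback,
                  # establishing "eye contact".
                  show_agent_feedback("attentive")
                  state = EYE_CONTACT
          elif state == EYE_CONTACT:
              if not gaze_on_agent():
                  # Eye contact broken: stop accepting voice input, so
                  # unrelated speech and environmental noise are ignored.
                  show_agent_feedback("idle")
                  state = IDLE
              elif speech_available():
                  # Speech arriving during eye contact is treated as an
                  # intended command and passed to the recognizer.
                  print("command:", recognize())
          time.sleep(0.05)  # poll the gaze detector at ~20 Hz

Because the gate is the gaze state rather than a button, this sketch preserves the contact-free property that a push-to-talk mechanism would sacrifice: rejection of unintended audio falls out of the IDLE state with no extra user operation.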