A Fast and Accurate Face Detector Based on Neural Networks
IEEE Transactions on Pattern Analysis and Machine Intelligence
Detection and Estimation of Pointing Gestures in Dense Disparity Maps
FG '00 Proceedings of the Fourth IEEE International Conference on Automatic Face and Gesture Recognition
“Put-that-there”: Voice and gesture at the graphics interface
SIGGRAPH '80 Proceedings of the 7th annual conference on Computer graphics and interactive techniques
VizWear-Active: Distributed Monte Carlo Face Tracking for Wearable Active Cameras
ICPR '02 Proceedings of the 16th International Conference on Pattern Recognition (ICPR'02), Volume 1
3-D Articulated Pose Tracking for Untethered Deictic Reference
ICMI '02 Proceedings of the 4th IEEE International Conference on Multimodal Interfaces
Mutual disambiguation of 3D multimodal interaction in augmented and virtual reality
ICMI '03 Proceedings of the 5th International Conference on Multimodal Interfaces
Interactive skills using active gaze tracking
ICMI '03 Proceedings of the 5th International Conference on Multimodal Interfaces
Robust Real-Time Face Detection
International Journal of Computer Vision
Arm-Pointing Gesture Interface Using Surrounded Stereo Cameras System
ICPR '04 Proceedings of the 17th International Conference on Pattern Recognition (ICPR'04), Volume 4
Convolutional Face Finder: A Neural Architecture for Fast and Robust Face Detection
IEEE Transactions on Pattern Analysis and Machine Intelligence
Inferring body pose using speech content
ICMI '05 Proceedings of the 7th International Conference on Multimodal Interfaces
A study of manual gesture-based selection for the PEMMI multimodal transport management interface
ICMI '05 Proceedings of the 7th International Conference on Multimodal Interfaces
3D-tracking of head and hands for pointing gesture recognition in a human-robot interaction scenario
FGR '04 Proceedings of the Sixth IEEE International Conference on Automatic Face and Gesture Recognition
Tracking body parts of multiple people for multi-person multimodal interface
ICCV '05 Proceedings of the 2005 International Conference on Computer Vision in Human-Computer Interaction
Testing the performance of spoken dialogue systems by means of an artificially simulated user
Artificial Intelligence Review
HCI Beyond the GUI: Design for Haptic, Speech, Olfactory, and Other Nontraditional Interfaces
Evaluation of contactless multimodal pointing devices
IASTED-HCI '07 Proceedings of the Second IASTED International Conference on Human Computer Interaction
What gestures to perform a collaborative storytelling?
ICVS '07 Proceedings of the 4th International Conference on Virtual Storytelling: Using Virtual Reality Technologies for Storytelling
VIRSTORY: a collaborative virtual storytelling
ICEC '06 Proceedings of the 5th International Conference on Entertainment Computing
This paper describes a Wizard of Oz cooperative storytelling experiment named VIRSTORY, in which a user's combined speech and gesture actions are interpreted in order to build a story cooperatively with another person, the interpreter's partner. The gesture, speech and multimodal behaviours of 20 subjects are detailed. The multimodal oral with gesture large display interface (MOWGLI) is then described: an oral and gestural multimodal human-computer interface that allows users to interact remotely in real time. The continuous pointing direction and the discrete selection gestures of the other hand are recognized by computer-vision tracking of the user's head and hands. By associating gesture recognition with speech recognition of oral selection and deselection commands, MOWGLI behaves as a virtual, contactless, application-independent multimodal mouse. Discrete pointing locations corresponding to discrete speech or gesture selection time events are extracted from the continuous pointing process. A large vocabulary related to a chess game application allows shorter, more specific multimodal commands, such as pointing at the desired location 〈there〉 while uttering a piece-move oral command, without a prior pointing gesture to the piece's location; generic "Put that there" commands, by contrast, require two successive pointing locations (〈that〉 and 〈there〉). Contextual constraints, such as the displacement rules of the pieces and the current game position, allow ambiguous commands to be interpreted and lead to even shorter multimodal commands.
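To make the fusion scheme in the abstract concrete, the sketch below shows one plausible way to (a) extract a discrete pointing location from a continuous pointing track at the timestamp of a speech selection event, and (b) disambiguate a "move 〈piece〉 there" command using chess displacement rules and the current game position. This is a minimal illustration under stated assumptions, not the authors' implementation; all names (PointingSample, location_at, resolve_move) and the example data are hypothetical.

```python
# Hypothetical sketch of MOWGLI-style multimodal fusion; not the authors' code.
from bisect import bisect_left
from dataclasses import dataclass

@dataclass
class PointingSample:
    t: float  # timestamp in seconds
    x: float  # pointed board column
    y: float  # pointed board row

def location_at(track, t_event):
    """Extract the discrete pointing location at a selection time event
    by taking the continuous-track sample nearest in time."""
    i = bisect_left([s.t for s in track], t_event)
    candidates = track[max(0, i - 1):i + 1]  # neighbours around the event
    best = min(candidates, key=lambda s: abs(s.t - t_event))
    return best.x, best.y

def resolve_move(piece_type, target, legal_moves):
    """Contextual disambiguation: among pieces of the uttered type, keep the
    one whose displacement rules allow the pointed target square. Returns a
    unique piece id, or None if the command remains ambiguous."""
    matches = [pid for pid, targets in legal_moves.items()
               if pid.startswith(piece_type) and target in targets]
    return matches[0] if len(matches) == 1 else None

# Usage: "knight <there>" needs one pointing event instead of two
# ("that" + "there"), because piece type plus game context identifies
# which knight can reach the pointed square.
track = [PointingSample(0.0, 3, 2), PointingSample(0.2, 4, 4),
         PointingSample(0.4, 4, 5)]
t_speech = 0.21                                 # time "there" was recognized
tx, ty = location_at(track, t_speech)           # -> (4, 4)
legal = {"knight_b1": {(2, 2), (0, 2)},         # example legal-move sets
         "knight_g1": {(4, 4), (6, 2)}}
print(resolve_move("knight", (int(tx), int(ty)), legal))  # knight_g1
```

Sampling the track at the nearest timestamp is the simplest choice for aligning the recognizer's selection event with the tracked hand; a real system might instead average over a short window to smooth tracking jitter around the utterance.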