As we articulate speech, we usually move our heads and exhibit various facial expressions. This visual aspect of speech aids understanding and helps communicate additional information, such as the speaker's mood. In this paper we quantitatively analyze the head and facial movements that accompany speech and investigate how they relate to the text's prosodic structure.

We recorded several hours of speech and measured the locations of the speakers' main facial features as well as their head poses. The text was evaluated with a prosody prediction tool that identifies phrase boundaries and pitch accents. Characteristic of most speakers are simple motion patterns that are repeatedly applied in synchrony with the main prosodic events. The direction and strength of head movements vary widely from one speaker to another, yet their timing is typically well synchronized with the spoken text.

Understanding quantitatively the correlations between head movements and spoken text is important for synthesizing photo-realistic talking heads. Talking heads appear much more engaging when they exhibit realistic motion patterns.
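The kind of analysis described above can be illustrated with a minimal sketch: given a head-angle track and the frame indices of prosodic events, compare motion strength near the events against the overall average. All names and data here are hypothetical — in the actual study the head poses come from recorded video and the event times from a prosody prediction tool.

```python
# Hedged sketch: does head motion cluster around prosodic events?
# Synthetic data; function names are illustrative, not from the paper.

def motion_magnitude(angles):
    """Frame-to-frame absolute change in head angle (a proxy for motion strength)."""
    return [abs(b - a) for a, b in zip(angles, angles[1:])]

def mean_motion_near_events(motion, event_frames, window=2):
    """Average motion magnitude within +/-window frames of each prosodic event."""
    vals = []
    for e in event_frames:
        lo, hi = max(0, e - window), min(len(motion), e + window + 1)
        vals.extend(motion[lo:hi])
    return sum(vals) / len(vals) if vals else 0.0

# Synthetic head-pitch track: brief nods at frames 10 and 30, which we
# assume coincide with predicted pitch accents.
pitch = [0.0] * 50
for nod in (10, 30):
    pitch[nod] = 5.0

motion = motion_magnitude(pitch)
near = mean_motion_near_events(motion, [10, 30], window=2)
overall = sum(motion) / len(motion)
print(near > overall)  # prints True: motion is stronger around the accents
```

A fuller analysis would use cross-correlation between the motion signal and the event train to estimate the typical lag, but the windowed comparison above already captures the synchrony claim in miniature.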