Face detection and tracking in video sequences using the modified census transformation
Image and Vision Computing
In the emerging field of speech-to-speech translation, emphasis is currently placed on the linguistic content, while the significance of paralinguistic information conveyed by facial expression or tone of voice is typically neglected. We present a prototype system for multimodal speech-to-speech translation that automatically recognizes spoken utterances and translates them from one language into another, with the output rendered by a speech synthesis system. The novelty of our system lies in generating the synthetic speech output in one of several expressive styles, which is selected automatically by using a camera to analyze the user's facial expression during speech.