Computer Networks: The International Journal of Computer and Telecommunications Networking
In some videoconferencing systems, audio is presented ahead of video because audio takes less time to process. Delaying the audio to match the video would achieve lip synchronization, but the overall audio latency might then become unacceptable. We built a videoconferencing system that achieves lip synchronization with minimal perceived audio latency. Instead of adding a fixed audio delay, our system time-stretches the audio at the beginning of each utterance until the audio is synchronized with the video. We conducted user studies and found that (1) audio could lead video by roughly 50 ms and still be perceived as synchronized; (2) audio could lead video by 300 ms and still be perceived as synchronized if the audio was time-stretched to synchronization within a short period; and (3) our algorithm appears to strike a favorable balance between minimizing audio latency and supporting lip synchronization.
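The timing arithmetic behind time-stretching to synchronization can be sketched as follows. This is a minimal illustration, not the system described above: it assumes a constant stretch factor applied from the start of an utterance, and the function names and the factor of 1.5 are hypothetical, since the abstract does not specify the stretch rate or the catch-up period. Only the 300 ms lead comes from the reported findings.

```python
from typing import Tuple


def stretch_plan(initial_lead_ms: float, stretch_factor: float) -> Tuple[float, float]:
    """Return (source_ms, playback_ms) needed to absorb an audio lead.

    Assumption: audio is slowed by a constant factor (> 1), so each
    millisecond of source audio takes `stretch_factor` ms to play and
    therefore delays the audio by (stretch_factor - 1) ms.
    """
    if initial_lead_ms < 0:
        raise ValueError("initial_lead_ms must be non-negative")
    if stretch_factor <= 1.0:
        raise ValueError("stretch_factor must be greater than 1")
    # Amount of source audio that must be stretched to absorb the lead.
    source_ms = initial_lead_ms / (stretch_factor - 1.0)
    # Wall-clock time it takes to play that stretched audio.
    playback_ms = source_ms * stretch_factor
    return source_ms, playback_ms


def remaining_lead_ms(initial_lead_ms: float, stretch_factor: float,
                      playback_elapsed_ms: float) -> float:
    """Audio lead still remaining after some playback time has elapsed."""
    # Delay accumulated so far by playing stretched audio.
    absorbed = playback_elapsed_ms * (stretch_factor - 1.0) / stretch_factor
    return max(0.0, initial_lead_ms - absorbed)


if __name__ == "__main__":
    # 300 ms lead (from the study), hypothetical stretch factor of 1.5.
    source, playback = stretch_plan(300.0, 1.5)
    print(f"stretch {source:.0f} ms of audio over {playback:.0f} ms of playback")
    print(f"lead halfway through: {remaining_lead_ms(300.0, 1.5, playback / 2):.0f} ms")
```

Under these assumptions, a larger stretch factor reaches synchronization sooner but distorts the speech more, which is the trade-off the user studies probe.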