Multimodal collaboration and human-computer interaction

  • Authors:
  • Zhengyou Zhang

  • Affiliations:
  • Microsoft Research, Redmond, WA

  • Venue:
  • ICME '09: Proceedings of the 2009 IEEE International Conference on Multimedia and Expo
  • Year:
  • 2009

Abstract

The research effort at Microsoft Research on multimodal collaboration and human-computer interaction aims at developing tools that allow people across geographically distributed sites to interact collaboratively with an immersive experience. Our prototype systems consist of cameras, displays, speakers, microphones, computer-controllable lights, and/or input devices such as touch-sensitive surfaces, styluses, keyboards, and mice. They require real-time processing of huge amounts of data, including foreground-background subtraction, region-of-interest extraction, color estimation and correction, speaker detection, stereo matching, and 3D reconstruction and rendering, not to mention audio and video encoding and decoding, possibly involving multiple microphones and cameras. Some of this processing is easily parallelizable through general-purpose computation on graphics processing units (GPGPU) or on a multi-core machine, while other parts are not so trivial. In this extended summary, I describe two projects: Visual Echo Cancellation in Shared Tele-collaborative Space, and Distributed Meeting Capture and Broadcasting System. During the talk, I will also present two recent projects: Personal Telepresence Station and Situated Interaction.
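To illustrate the kind of per-frame processing the abstract lists, the sketch below performs foreground-background subtraction on a video stream. This is a minimal sketch, assuming OpenCV's MOG2 background model as the subtraction method and a hypothetical input file "meeting.avi"; the paper does not specify which algorithm or library the prototypes actually use.

```python
import cv2

# Hypothetical input file; the paper's prototypes capture live camera feeds.
cap = cv2.VideoCapture("meeting.avi")

# MOG2 maintains a per-pixel Gaussian-mixture background model;
# parameters here are OpenCV defaults, not values from the paper.
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    # Update the background model and get a per-pixel foreground mask.
    mask = subtractor.apply(frame)
    # Keep only the foreground pixels (the region of interest).
    foreground = cv2.bitwise_and(frame, frame, mask=mask)
    cv2.imshow("foreground", foreground)
    if cv2.waitKey(1) == 27:  # Esc to quit
        break

cap.release()
cv2.destroyAllWindows()
```

Because the mask is computed independently for each pixel, this step maps naturally onto GPGPU or multi-core execution, which is why the abstract singles it out as easily parallelizable.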