In this paper we present our work on audio-visual perception of a lecturer in a smart seminar room equipped with various cameras and microphones. We present a novel approach that tracks the lecturer from visual and acoustic observations in a particle filter framework. Because this approach does not require explicit triangulation of observations to estimate the lecturer's 3D location, it allows fast audio-visual tracking. We also show how the tracked location can be used to improve automatic recognition of the lecturer's speech from far-field microphones. The tracked location further lets us detect the lecturer's face in the various camera views for further analysis, such as estimating head orientation and identity. The paper describes the overall system and its components (tracking, speech recognition, head orientation estimation, identification) in detail and presents results on several multimodal recordings of seminars.
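To illustrate how a particle filter can avoid explicit triangulation, here is a minimal Python sketch. It is not the authors' implementation. The pinhole camera model, the time-delay-of-arrival (TDOA) audio cue, the Gaussian likelihoods, and every name and parameter value (`project`, `tdoa`, `step`, `run_demo`, the noise widths, the room layout) are assumptions made for illustration. Each particle is a 3D position hypothesis that is scored directly against the 2D detection in every camera and against the delay measured at every microphone pair, so no 3D point is ever reconstructed from the observations themselves.

```python
import math
import random

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def project(cam, p):
    """Project 3D point p into a hypothetical pinhole camera looking along +y.

    cam = (cx, cy, cz, focal_length_px). Returns None if p is not in front.
    """
    cx, cy, cz, f = cam
    depth = p[1] - cy
    if depth <= 0.1:
        return None
    return (f * (p[0] - cx) / depth, f * (p[2] - cz) / depth)


def tdoa(mic_a, mic_b, p):
    """Expected time delay of arrival (seconds) between two microphones."""
    return (math.dist(p, mic_a) - math.dist(p, mic_b)) / SPEED_OF_SOUND


def gauss(err, sigma):
    """Unnormalised Gaussian likelihood of an error value."""
    return math.exp(-0.5 * (err / sigma) ** 2)


def step(particles, cams, detections, mic_pairs, delays, rng,
         motion_sigma=0.05, pixel_sigma=40.0, delay_sigma=5e-4):
    """One predict / weight / resample cycle over 3D position particles."""
    # Predict: random-walk motion model (an assumption, not from the paper).
    moved = [tuple(c + rng.gauss(0.0, motion_sigma) for c in p)
             for p in particles]
    weights = []
    for p in moved:
        w = 1.0
        # Visual cue: reprojection error in each camera view.
        for cam, det in zip(cams, detections):
            uv = project(cam, p)
            w *= 1e-6 if uv is None else gauss(math.dist(uv, det), pixel_sigma)
        # Acoustic cue: mismatch between expected and measured delay.
        for (a, b), d in zip(mic_pairs, delays):
            w *= gauss(tdoa(a, b, p) - d, delay_sigma)
        weights.append(w + 1e-300)  # keep the total strictly positive
    total = sum(weights)
    estimate = tuple(sum(w * p[i] for w, p in zip(weights, moved)) / total
                     for i in range(3))
    resampled = rng.choices(moved, weights=weights, k=len(moved))
    return resampled, estimate


def run_demo(seed=0, n_particles=2000, n_steps=25):
    """Track a static, noise-free synthetic target in a made-up room."""
    rng = random.Random(seed)
    truth = (2.0, 3.0, 1.7)
    cams = [(0.0, 0.0, 1.5, 500.0), (4.0, 0.0, 1.5, 500.0)]
    mic_pairs = [((0.0, 0.0, 1.0), (4.0, 0.0, 1.0)),
                 ((0.0, 0.0, 1.0), (0.0, 5.0, 1.0))]
    detections = [project(c, truth) for c in cams]
    delays = [tdoa(a, b, truth) for a, b in mic_pairs]
    particles = [(rng.uniform(0.0, 4.0), rng.uniform(0.5, 6.0),
                  rng.uniform(0.0, 2.5)) for _ in range(n_particles)]
    estimate = None
    for _ in range(n_steps):
        particles, estimate = step(particles, cams, detections,
                                   mic_pairs, delays, rng)
    return estimate, truth


if __name__ == "__main__":
    est, truth = run_demo()
    print("estimate:", est, "truth:", truth)
```

The design point is that the cost per frame grows with the number of particles times the number of sensors, and a missing detection in one view only removes one factor from the weight. A real system would use calibrated camera matrices and noisy detections, and would need a more careful audio likelihood than this single-delay Gaussian.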