This paper presents a novel system for the automatic, unobtrusive tracking and identification of multiple persons in an indoor environment. Information from several fixed cameras is fused in a particle filter framework to track multiple occupants simultaneously. A set of steerable pan-tilt-zoom cameras under fuzzy control smoothly follows persons of interest and opportunistically captures facial close-ups for face identification. In parallel, speech segmentation, sound source localization, and speaker identification are performed using several far-field microphones and microphone arrays. These sources deliver information asynchronously and sporadically, as track updates and as spatio-temporally localized visual and acoustic identification cues. This information is fused at a higher level to gradually refine the global scene model and increase the system's confidence in the set of recognized identities. The system was trained on a small set of users' faces and/or voices. In natural meeting scenarios it performed well, quickly acquiring the users' identities and filling in ID information missing from individual modalities.
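The higher-level fusion of sporadic identification cues can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simple Bayesian update of a per-track identity belief, and the class `TrackIdentity`, the `reliability` weight, and the example names and scores are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TrackIdentity:
    """Identity belief for one tracked person (hypothetical structure)."""
    names: list
    belief: dict = field(default_factory=dict)

    def __post_init__(self):
        # Start from a uniform prior over the enrolled users.
        if not self.belief:
            self.belief = {n: 1.0 / len(self.names) for n in self.names}

    def update(self, scores, reliability=1.0):
        """Fold one face or speaker identification cue into the belief.

        scores: per-name likelihoods from a classifier (assumed input).
        reliability: value in [0, 1] that blends the cue with a flat
        likelihood, so that weak cues shift the belief less.
        """
        flat = 1.0 / len(self.names)
        posterior = {}
        for n in self.names:
            likelihood = reliability * scores.get(n, 0.0) + (1.0 - reliability) * flat
            posterior[n] = self.belief[n] * likelihood
        total = sum(posterior.values())
        if total > 0:  # keep the old belief if the cue carries no mass
            self.belief = {n: p / total for n, p in posterior.items()}

    def best(self):
        """Return the most likely name and its current confidence."""
        name = max(self.belief, key=self.belief.get)
        return name, self.belief[name]


# Cues arrive whenever a modality happens to produce one.
track = TrackIdentity(["alice", "bob", "carol"])
track.update({"alice": 0.6, "bob": 0.3, "carol": 0.1}, reliability=0.8)  # face close-up
track.update({"alice": 0.5, "bob": 0.4, "carol": 0.1}, reliability=0.5)  # far-field voice
print(track.best())
```

Each cue is applied as soon as it arrives, so the modalities need not be synchronized, and consistent cues from different modalities raise the confidence in an identity step by step.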