Probabilistic integration of sparse audio-visual cues for identity tracking

  • Authors:
  • Keni Bernardin; Rainer Stiefelhagen; Alex Waibel

  • Affiliations:
  • Universität Karlsruhe, Karlsruhe, Germany; Universität Karlsruhe, Karlsruhe, Germany; Universität Karlsruhe, Karlsruhe, Germany

  • Venue:
  • MM '08: Proceedings of the 16th ACM international conference on Multimedia
  • Year:
  • 2008

Abstract

In the context of smart environments, the ability to track and identify persons is a key factor determining the scope and flexibility of the analytical components or intelligent services that can be provided. While a considerable amount of work has addressed camera-based tracking of multiple users in a variety of scenarios, technologies for acoustic and visual identification, such as face or voice ID, are unfortunately still subject to severe limitations when distantly placed sensors must be used. As a consequence, reliable cues for identification can be hard to obtain without user cooperation, especially when multiple users are involved. In this paper, we present a novel technique for the tracking and identification of multiple persons in a smart environment using distantly placed audio-visual sensors. The technique builds on the opportunistic integration of tracking cues as well as face and voice identification cues, gained from several cameras and microphones, whenever these cues can be captured with a sufficient degree of confidence. A probabilistic model keeps track of identified persons and updates the belief in their identities whenever new observations become available. The technique has been systematically evaluated on the CLEAR Interactive Seminar database, a large audio-visual corpus of realistic meeting scenarios captured in a variety of smart rooms.
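
The abstract's closing sentences describe an opportunistic, probabilistic fusion of sparse ID cues. As an illustration only (the abstract does not specify the model), the sketch below shows one common way such a belief update can be realized: each tracked person carries a discrete belief over the enrolled identities, updated with Bayes' rule whenever a face or voice ID cue arrives with sufficient confidence. The identity names, the recognizer confusion model, and the confidence values are all assumptions, not details taken from the paper.

```python
# Minimal sketch (not the authors' implementation): per-track Bayesian
# belief over enrolled identities, updated from noisy face/voice ID cues.
import numpy as np

IDENTITIES = ["alice", "bob", "carol"]  # hypothetical enrolled identities

class TrackedPerson:
    def __init__(self, n_ids=len(IDENTITIES)):
        # Start from a uniform prior over all enrolled identities.
        self.belief = np.full(n_ids, 1.0 / n_ids)

    def update(self, observed_id, p_correct=0.8):
        """Fuse one ID cue (e.g. a face or voice recognition result).

        p_correct is the assumed probability that the recognizer reports
        the true identity; the rest is spread over the other identities.
        """
        k = IDENTITIES.index(observed_id)
        likelihood = np.full(len(self.belief),
                             (1.0 - p_correct) / (len(self.belief) - 1))
        likelihood[k] = p_correct
        posterior = likelihood * self.belief       # Bayes' rule (unnormalized)
        self.belief = posterior / posterior.sum()  # renormalize

    def best_identity(self):
        return IDENTITIES[int(np.argmax(self.belief))], float(self.belief.max())

# Example: two face cues and one conflicting voice cue for the same track.
person = TrackedPerson()
for cue, conf in [("alice", 0.7), ("alice", 0.8), ("bob", 0.6)]:
    person.update(cue, p_correct=conf)
print(person.best_identity())  # belief concentrates on "alice"
```

In such a scheme, cues below a confidence threshold are simply discarded rather than fused, which is one plausible reading of the "sufficient degree of confidence" condition mentioned in the abstract.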