Adaptive multi-modal stereo people tracking without background modelling

  • Authors:
  • Rafael Muñoz-Salinas; Miguel García-Silvente; Rafael Medina Carnicer

  • Affiliations:
  • Department of Computing and Numerical Analysis, University of Córdoba, 14071 Córdoba, Spain; Department of Computer Science and Artificial Intelligence, University of Granada, 18071 Granada, Spain; Department of Computing and Numerical Analysis, University of Córdoba, 14071 Córdoba, Spain

  • Venue:
  • Journal of Visual Communication and Image Representation
  • Year:
  • 2008

Abstract

Detecting and tracking people in sequences of monocular images are important and difficult problems in computer vision that have been studied extensively over the past two decades. Recently, methods based on stereo vision have attracted great attention since 3D information can be exploited. This paper presents an approach for multiple-people detection and tracking using stereo vision. Tracking is carried out using a multiple particle filtering approach that combines depth, colour and gradient information. A novel confidence measure adjusts the degree of confidence assigned to depth information according to the amount of valid disparity found in the disparity map. The more disparity information available, the greater the weight given to depth information in the final particle weights. In the worst case (total absence of disparity), the proposed algorithm uses the remaining cues (colour and gradient) to track, thus behaving as a purely colour-based tracking algorithm. People are detected by combining an AdaBoost classifier with stereo information. To test the validity of the proposal, it is evaluated on several sequences of colour and disparity images in which people interact in complex situations: walking at different distances, shaking hands, crossing paths, jumping, running, embracing each other and even swapping positions quickly in an attempt to confuse the system. The experimental results show that the proposal is able to deal with occlusions and to effectively determine both the 3D positions of the people being tracked and their 2D head locations in the camera image, all in real time. Moreover, since the proposed method does not require a background model, it is particularly appropriate for applications that must run on mobile devices.
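
To illustrate the cue-fusion idea described in the abstract, the following is a minimal sketch, assuming a simple blending scheme in which a confidence factor derived from the fraction of valid disparity pixels scales the contribution of the depth cue to each particle's weight. The function names, the invalid-disparity convention and the specific blending formula are assumptions made for illustration, not the paper's exact formulation.

```python
import numpy as np

def disparity_confidence(disparity_patch, invalid=0.0):
    """Assumed confidence measure: fraction of pixels in the particle's
    image region that carry a valid disparity value."""
    if disparity_patch.size == 0:
        return 0.0
    return float(np.count_nonzero(disparity_patch != invalid)) / disparity_patch.size

def particle_weight(colour_lik, gradient_lik, depth_lik, alpha):
    """Blend the three cue likelihoods. With alpha = 0 (no disparity at all)
    the weight depends only on colour and gradient, mirroring the fallback to
    purely colour-based tracking; with alpha = 1 depth contributes fully."""
    appearance = colour_lik * gradient_lik
    return appearance * ((1.0 - alpha) + alpha * depth_lik)

# Hypothetical usage: one particle's cue likelihoods and its disparity patch.
patch = np.array([[1.2, 0.0], [0.9, 1.1]])   # disparities, 0.0 marks invalid pixels
alpha = disparity_confidence(patch)           # 0.75 for this patch
w = particle_weight(colour_lik=0.8, gradient_lik=0.6, depth_lik=0.9, alpha=alpha)
print(alpha, w)
```

Under this sketch, a disparity map with few valid pixels drives alpha towards zero, so the particle weights degrade gracefully to a colour-and-gradient tracker rather than being corrupted by unreliable depth measurements.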