Spatial-temporal consistent labeling of tracked pedestrians across non-overlapping camera views

  • Authors:
  • Guoyun Lian; Jianhuang Lai; Wei-Shi Zheng

  • Affiliations:
  • School of Information Science and Technology, Sun Yat-sen University, 510006, China (Lian; Lai); School of Electronic Engineering and Computer Science, Queen Mary University of London, London E1 4NS, UK and Guangdong Province Key Laboratory of Information Security, Sun Yat-sen University, Chi ...

  • Venue:
  • Pattern Recognition
  • Year:
  • 2011

Abstract

Tracking people across multiple cameras with non-overlapping views is a challenging task, since observations of the same person are separated in time and space and his or her appearance may vary significantly. This paper proposes a Bayesian model for the consistent labeling problem across multiple non-overlapping camera views. Unlike related approaches, our model assumes neither that people are well segmented nor that their trajectories across camera views are estimated. We formulate a spatial-temporal probabilistic model over a hypothesis space consisting of the potentially matched objects between the exit field of view (FOV) of one camera and the entry FOV of another camera. We also propose a competitive major color spectrum histogram representation (CMCSHR) for appearance matching between two objects. The proposed spatial-temporal and appearance models are unified in a maximum-a-posteriori (MAP) Bayesian formulation. Based on this Bayesian model, when a newly detected object corresponds to a group hypothesis (more than one object), we further develop an online correspondence-update method based on an optimal graph matching (OGM) algorithm. Experimental results on three different real scenarios validate the proposed Bayesian model and the CMCSHR method. The results also show that the proposed approach addresses the occlusion/group problem, i.e. finding the corresponding individuals in another camera view for a group of people who walk together into the entry FOV of a camera.
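To illustrate the kind of MAP fusion the abstract describes, the sketch below scores entry-FOV candidates for one departed person by combining a spatio-temporal likelihood with an appearance similarity, and resolves a group hypothesis with a brute-force optimal assignment. This is not the authors' implementation: the Gaussian travel-time model, its parameters, the histogram-intersection stand-in for CMCSHR, and all numbers are illustrative assumptions.

```python
import itertools
import math

def transition_likelihood(dt, mean_dt=8.0, sigma=2.0):
    """Gaussian model of inter-camera travel time (assumed parameters)."""
    return math.exp(-0.5 * ((dt - mean_dt) / sigma) ** 2)

def appearance_similarity(hist_a, hist_b):
    """Histogram intersection of two normalized color histograms,
    a simple stand-in for the paper's major-color-spectrum matching."""
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))

def best_assignment(score):
    """Brute-force optimal one-to-one assignment (stand-in for the
    paper's optimal graph matching), maximizing total similarity."""
    n = len(score)
    return max(itertools.permutations(range(n)),
               key=lambda p: sum(score[i][p[i]] for i in range(n)))

# One person who left camera A's exit FOV vs. three candidates
# appearing in camera B's entry FOV (dt = elapsed time in seconds).
query_hist = [0.5, 0.3, 0.2]
candidates = {
    "c1": {"dt": 7.5, "hist": [0.48, 0.32, 0.20]},
    "c2": {"dt": 3.0, "hist": [0.10, 0.60, 0.30]},
    "c3": {"dt": 14.0, "hist": [0.50, 0.30, 0.20]},
}

# MAP decision with a uniform prior: posterior ∝ P(st|k) * P(app|k).
posterior = {
    k: transition_likelihood(v["dt"]) * appearance_similarity(query_hist, v["hist"])
    for k, v in candidates.items()
}
best = max(posterior, key=posterior.get)
print(best)  # c1: consistent travel time and closest appearance

# Group hypothesis: two people enter together; assign each to one
# of two departed identities via the pairwise similarity matrix.
pair_scores = [[0.9, 0.2],
               [0.3, 0.8]]
print(best_assignment(pair_scores))  # (0, 1)
```

The same structure extends to larger groups, though a real system would use a polynomial-time assignment algorithm (e.g. the Hungarian method) rather than enumerating permutations.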