Cascade of descriptors to detect and track objects across any network of cameras

  • Authors:
  • Alexandre Alahi;Pierre Vandergheynst;Michel Bierlaire;Murat Kunt

  • Affiliations:
  • Ecole Polytechnique Federale de Lausanne, Signal Processing Laboratory, CH-1015 Lausanne, Switzerland and Ecole Polytechnique Federale de Lausanne, Transportation and Mobility Laboratory, CH-1015 ...;Ecole Polytechnique Federale de Lausanne, Signal Processing Laboratory, CH-1015 Lausanne, Switzerland;Ecole Polytechnique Federale de Lausanne, Transportation and Mobility Laboratory, CH-1015 Lausanne, Switzerland;Ecole Polytechnique Federale de Lausanne, Signal Processing Laboratory, CH-1015 Lausanne, Switzerland

  • Venue:
  • Computer Vision and Image Understanding
  • Year:
  • 2010

Abstract

Most multi-camera systems assume a well-structured environment to detect and track objects across cameras: cameras must be fixed and calibrated, or only objects represented in a training set can be detected (e.g. pedestrians only). In this work, a master-slave system is presented to detect and track any object in a network of uncalibrated fixed and mobile cameras. The cameras may have non-overlapping fields of view. Objects are detected with the mobile cameras (the slaves) given only observations from the fixed cameras (the masters). No training stage or training data is used. Detected objects are correctly tracked across cameras, leading to a better understanding of the scene. A cascade of grids of region descriptors is proposed to describe any object of interest. To lend insight into the addressed problem, most state-of-the-art region descriptors are evaluated under various schemes: the covariance matrix of various features, the histogram of colors, the histogram of oriented gradients, the scale-invariant feature transform (SIFT), the speeded-up robust features (SURF) descriptors, and the color interest points [1]. A sparse scan of the cameras' image plane is also presented to reduce the search space of the localization process, approaching real-time performance. The proposed approach outperforms existing approaches such as SIFT and SURF. It is robust to some changes in illumination, viewpoint, color distribution, image quality, and object deformation. Partially occluded objects are also detected and tracked.
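To make the idea of a "cascade of grids of region descriptors" more concrete, the following is a minimal sketch and not the authors' implementation. It assumes grayscale regions stored as nested lists, uses a plain intensity histogram as the region descriptor, and compares a reference region with a candidate over successively finer grids, rejecting the candidate as soon as one level fails. The function names, the grid sizes (1, 2, 4), the bin count, and the threshold are all hypothetical choices made for illustration; the abstract does not specify them.

```python
from typing import List, Sequence

# Assumption: a grayscale region is a list of rows of 0..255 intensities.
Image = Sequence[Sequence[int]]


def cell_histogram(img: Image, top: int, left: int,
                   h: int, w: int, bins: int = 8) -> List[float]:
    """Normalized intensity histogram of one rectangular cell."""
    hist = [0.0] * bins
    for r in range(top, top + h):
        for c in range(left, left + w):
            hist[img[r][c] * bins // 256] += 1.0
    total = float(h * w)
    return [v / total for v in hist]


def grid_descriptor(img: Image, n: int, bins: int = 8) -> List[List[float]]:
    """Describe a region by an n x n grid of cell histograms.

    Rows and columns left over when the size is not divisible by n
    are ignored in this sketch.
    """
    rows, cols = len(img), len(img[0])
    ch, cw = rows // n, cols // n
    return [cell_histogram(img, i * ch, j * cw, ch, cw, bins)
            for i in range(n) for j in range(n)]


def grid_distance(a: List[List[float]], b: List[List[float]]) -> float:
    """Mean L1 distance between corresponding cell histograms."""
    per_cell = [sum(abs(x - y) for x, y in zip(ha, hb))
                for ha, hb in zip(a, b)]
    return sum(per_cell) / len(per_cell)


def cascade_match(reference: Image, candidate: Image,
                  grids: Sequence[int] = (1, 2, 4),
                  threshold: float = 0.5) -> bool:
    """Accept the candidate only if it passes every grid level.

    Levels are tested from coarsest to finest, so poor candidates
    are rejected cheaply before the finer grids are computed.
    """
    for n in grids:
        d = grid_distance(grid_descriptor(reference, n),
                          grid_descriptor(candidate, n))
        if d > threshold:
            return False  # early rejection at this level
    return True
```

As a usage illustration, take an 8 x 8 region whose left half is black and right half is white, and its mirror image. The two regions have the same global histogram, so the 1 x 1 level cannot tell them apart, but the 2 x 2 level can, and `cascade_match` rejects the mirrored candidate there. This shows why a grid adds spatial information that a single region histogram lacks.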