Multimodal people detection and tracking in crowded scenes

  • Authors:
  • Luciano Spinello;Rudolph Triebel;Roland Siegwart

  • Affiliations:
  • Autonomous Systems Lab, ETH Zurich, Zurich, Switzerland;Autonomous Systems Lab, ETH Zurich, Zurich, Switzerland;Autonomous Systems Lab, ETH Zurich, Zurich, Switzerland

  • Venue:
  • AAAI'08 Proceedings of the 23rd national conference on Artificial intelligence - Volume 3
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper presents a novel people detection and tracking method based on a multi-modal sensor fusion approach that utilizes 2D laser range and camera data. The data points in the laser scans are clustered using a novel graph-based method and an SVM based version of the cascaded AdaBoost classifier is trained with a set of geometrical features of these clusters. In the detection phase, the classified laser data is projected into the camera image to define a region of interest for the vision-based people detector. This detector is a fast version of the Implicit Shape Model (ISM) that learns an appearance codebook of local SIFT descriptors from a set of hand-labeled images of pedestrians and uses them in a voting scheme to vote for centers of detected people. The extension consists in a fast and detailed analysis of the spatial distribution of voters per detected person. Each detected person is tracked using a greedy data association method and multiple Extended Kalman Filters that use different motion models. This way, the filter can cope with a variety of different motion patterns. The tracker is asynchronously updated by the detections from the laser and the camera data. Experiments conducted in real-world outdoor scenarios with crowds of pedestrians demonstrate the usefulness of our approach.