Hierarchical model for joint detection and tracking of multi-target

  • Authors:
  • Jianru Xue;Zheng Ma;Nanning Zheng

  • Affiliations:
  • Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, Xi'an, China;Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, Xi'an, China;Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, Xi'an, China

  • Venue:
  • ACCV'09 Proceedings of the 9th Asian conference on Computer Vision - Volume Part II
  • Year:
  • 2009

Abstract

We present a hierarchical and compositional model based on an And-Or graph for joint detection and tracking of multiple targets in video. In the graph, an And-node for the joint state of all targets is decomposed into multiple Or-nodes. Each Or-node represents an individual target's state, which includes the position, appearance, and scale of the target. Leaf nodes are trained detectors. Measurements supplied by the predictions of the tracker and by the leaf nodes are shared among Or-nodes. There are two kinds of production rules, designed respectively for the problems of a varying number of targets and of occlusions. One is association relations, which distribute measurements to targets, and the other is semantic relations, which represent occlusion between targets. The inference algorithm for the graph consists of three processing channels: (1) a bottom-up channel, which provides informative measurements by using learned detectors; (2) a top-down channel, which estimates the individual target state with joint probabilistic data association; (3) a context-sensitive reasoning channel, which finalizes the estimation of the joint state with belief propagation. Additionally, an interaction mechanism between detection and tracking is implemented by a hybrid measurement process. The algorithm is validated extensively by tracking people in several complex scenarios. Empirical results show that our tracker can reliably track multiple targets in all of the test videos without any prior knowledge of the number of targets, even though targets may appear or disappear anywhere in the image frame and at any time.
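As a rough illustration of the structure described in the abstract, the sketch below represents the joint state as an And-node holding one Or-node per target and distributes shared measurements among targets with a gated, normalized Gaussian weight. It is not the authors' implementation: the class names `AndNode` and `OrNode`, the function `association_weights`, and the parameters `sigma` and `gate` are all hypothetical, appearance is omitted, and the soft assignment is a simplified stand-in for joint probabilistic data association, which additionally enumerates joint association events.

```python
import math
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class OrNode:
    """State of one target: position and scale (appearance omitted here)."""
    target_id: int
    position: Tuple[float, float]
    scale: float = 1.0


@dataclass
class AndNode:
    """Joint state of all targets, decomposed into one Or-node per target."""
    targets: List[OrNode] = field(default_factory=list)


def association_weights(joint: AndNode,
                        measurements: List[Tuple[float, float]],
                        sigma: float = 10.0,
                        gate: float = 3.0) -> List[List[float]]:
    """Soft-assign each shared measurement to the targets.

    weights[j][i] is the normalized Gaussian weight of measurement j for
    target i. A measurement farther than gate * sigma from a target gets
    weight 0 for it; a row of all zeros means no target claims the
    measurement (in this sketch, a candidate for a newly appeared target).
    """
    weights = []
    for mx, my in measurements:
        row = []
        for t in joint.targets:
            d = math.hypot(mx - t.position[0], my - t.position[1])
            inside_gate = d <= gate * sigma
            row.append(math.exp(-0.5 * (d / sigma) ** 2) if inside_gate else 0.0)
        total = sum(row)
        weights.append([w / total for w in row] if total > 0 else row)
    return weights


# Hypothetical usage: two targets and three shared measurements.
joint = AndNode([OrNode(0, (0.0, 0.0)), OrNode(1, (20.0, 0.0))])
w = association_weights(joint, [(10.0, 0.0), (-25.0, 0.0), (500.0, 500.0)])
```

In this toy setup the first measurement lies midway between the two targets and is shared equally, the second falls inside the gate of the first target only, and the third falls outside every gate and is left unassigned.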