Coupling-and-Decoupling: a hierarchical model for occlusion-free car detection

  • Authors:
  • Bo Li; Tianfu Wu; Wenze Hu; Mingtao Pei

  • Affiliations:
  • Beijing Lab of Intelligent Inf., School of Computer Science and Techn., Beijing Inst. of Techn., Beijing, P.R.China, BUPT-Seesoft Joint Lab of Visual Computing and Image Comm., Beijing Univ. of Pos ...; BUPT-Seesoft Joint Lab of Visual Computing and Image Communication, Beijing University of Posts and Telecommunications (BUPT), Beijing, P.R.China, Lotus Hill Research Institute, Ezhou, P.R.China; Lotus Hill Research Institute, Ezhou, P.R.China, Department of Statistics, University of California, Los Angeles; Beijing Lab of Intelligent Information, School of Computer Science and Technology, Beijing Institute of Technology, Beijing, P.R.China

  • Venue:
  • ACCV'12: Proceedings of the 11th Asian Conference on Computer Vision - Volume Part I
  • Year:
  • 2012

Abstract

Handling occlusions in object detection is a long-standing problem. This paper addresses X-to-X occlusion-free object detection (e.g., car-to-car occlusions in our experiments) with an intuitive coupling-and-decoupling strategy. In the "coupling" stage, we model pairs of occluding X's (e.g., car pairs) directly to account for their statistically strong co-occurrence (i.e., coupling). We then learn a hierarchical And-Or directed acyclic graph (AOG) model under the latent structural SVM (LSSVM) framework. The learned AOG consists of, from top to bottom, (i) a root Or-node representing different compositions of occluding X pairs, (ii) a set of And-nodes, each of which represents a specific composition of an occluding X pair, (iii) another set of And-nodes representing the single X's decomposed from the occluding X pairs, and (iv) a set of terminal-nodes representing the appearance templates for the X pairs, the single X's, and the latent parts of the single X's, respectively. The part appearance templates can also be shared among different single X's. In detection, a dynamic programming (DP) algorithm is used, and as a natural consequence the two single X's are decoupled from each X-to-X occluding pair. In experiments, we test our method on roadside cars that we collected ourselves from a real traffic video surveillance environment. We compare our model with the state-of-the-art deformable part-based model (DPM) and obtain better detection performance.
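
The four-layer AOG and its DP inference can be illustrated with a toy sketch. This is not the authors' implementation: it assumes made-up scalar template responses in place of learned LSSVM filters, and it ignores image positions, scales, and part deformation. All node names (`pair_A`, `car_front`, `wheel`, and so on) and score values are hypothetical. It shows only the general mechanism: Or-nodes take the best child, And-nodes sum their children, shared parts are scored once through memoization, and the best parse directly names the two single cars.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    name: str
    kind: str  # "or", "and", or "terminal"
    children: List["Node"] = field(default_factory=list)


def score(node: Node, responses: Dict[str, float],
          memo: Dict[str, Tuple[float, List[str]]]) -> Tuple[float, List[str]]:
    """Bottom-up DP over the DAG; returns (best score, names of selected nodes)."""
    if node.name in memo:  # shared nodes (e.g. a shared part) are scored once
        return memo[node.name]
    if node.kind == "terminal":
        result = (responses[node.name], [node.name])
    elif node.kind == "and":  # composition: sum over all children
        total, parse = 0.0, [node.name]
        for child in node.children:
            s, p = score(child, responses, memo)
            total += s
            parse = parse + p
        result = (total, parse)
    else:  # "or": pick the single best-scoring alternative
        s, p = max((score(c, responses, memo) for c in node.children),
                   key=lambda r: r[0])
        result = (s, [node.name] + p)
    memo[node.name] = result
    return result


def terminal(name: str) -> Node:
    return Node(name, "terminal")


# (iv) terminal-nodes: pair templates, single-car templates, one shared part
wheel = terminal("wheel")
# (iii) And-nodes for single cars; the "wheel" part is shared among them
car_front = Node("car_front", "and", [terminal("car_front_tpl"), wheel])
car_rear = Node("car_rear", "and", [terminal("car_rear_tpl"), wheel])
car_side = Node("car_side", "and", [terminal("car_side_tpl"), wheel])
# (ii) And-nodes for specific compositions of an occluding pair
pair_a = Node("pair_A", "and", [terminal("pair_A_tpl"), car_front, car_rear])
pair_b = Node("pair_B", "and", [terminal("pair_B_tpl"), car_side, car_rear])
# (i) root Or-node over the pair compositions
root = Node("root", "or", [pair_a, pair_b])

# Hypothetical template responses at one candidate window
responses = {
    "pair_A_tpl": 0.5, "pair_B_tpl": 1.0,
    "car_front_tpl": 1.0, "car_rear_tpl": 0.5, "car_side_tpl": 0.25,
    "wheel": 0.25,
}

best_score, parse = score(root, responses, {})
single_cars = {"car_front", "car_rear", "car_side"}
# Decoupling: read the single-car nodes off the best parse
decoupled = [name for name in parse if name in single_cars]
print(best_score, decoupled)
```

In a real detector the terminal responses would be filter scores over a feature pyramid and the maximization would also run over positions and deformations; the recursion over the graph would keep the same shape.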