N-Dimensional Tensor Voting and Application to Epipolar Geometry Estimation

  • Authors: Chi-Keung Tang; Gérard Medioni; Mi-Suen Lee

  • Affiliations: Hong Kong Univ. of Science and Technology, Clear Water Bay, Hong Kong; Univ. of Southern California; Philips Research USA, NY

  • Venue: IEEE Transactions on Pattern Analysis and Machine Intelligence
  • Year: 2001

Quantified Score

Hi-index 0.14

Abstract

We address epipolar geometry estimation efficiently and effectively by formulating it as hyperplane inference from a sparse and noisy point set in an 8D space. Given a set of noisy point correspondences in two images of a static scene without correspondences, even in the presence of moving objects, our method extracts good matches and rejects outliers. The methodology is novel and unconventional: unlike most other methods, which optimize a scalar objective function, our approach involves neither initialization nor iterative search in the parameter space. It is therefore free of the problems of local optima and poor convergence. Further, since no search is involved, it is unnecessary to impose simplifying assumptions (such as an affine camera or a local planar homography) on the scene being analyzed to reduce the search complexity. Subject only to the general epipolar constraint, we detect wrong matches by a novel computation scheme, 8D Tensor Voting, which is an instance of the more general N-dimensional Tensor Voting framework. In essence, the input set of matches is first transformed into a sparse 8D point set. Dense 8D tensor kernels are then used to vote for the most salient hyperplane, which captures all inliers inherent in the input. With this filtered set of matches, the normalized Eight-Point Algorithm can be used to estimate the fundamental matrix accurately. By making use of efficient data structures and locality, our method is both time and space efficient despite the higher dimensionality. We demonstrate the general usefulness of our method on example image pairs from aerial image analysis, from widely different views, and from nonstatic 3D scenes (e.g., a basketball game in an indoor stadium). Each example contains a considerable number of wrong matches.
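
To make the first and last steps of the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the usual expansion of the epipolar constraint x'ᵀ F x = 0, under which each match (u, v) ↔ (u', v') maps to the 8D point (u'u, u'v, u', v'u, v'v, v', u, v), and all inliers lie on one hyperplane whose coefficients are the entries of F. The coordinate ordering and the function names `matches_to_8d` and `normalized_eight_point` are illustrative assumptions. The tensor voting step that filters outliers is not shown; the sketch assumes the matches passed in are already inliers.

```python
import numpy as np


def matches_to_8d(x1, x2):
    """Map N matches (x1[i] <-> x2[i], each an (N, 2) array) to N points in 8D.

    Assumed ordering: (u'u, u'v, u', v'u, v'v, v', u, v), chosen so that the
    epipolar constraint becomes a hyperplane equation in these coordinates.
    """
    u, v = x1[:, 0], x1[:, 1]
    up, vp = x2[:, 0], x2[:, 1]
    return np.stack([up * u, up * v, up, vp * u, vp * v, vp, u, v], axis=1)


def _normalize(pts):
    """Translate points to zero mean and scale to mean distance sqrt(2)."""
    centroid = pts.mean(axis=0)
    mean_dist = np.linalg.norm(pts - centroid, axis=1).mean()
    s = np.sqrt(2.0) / mean_dist
    T = np.array([[s, 0.0, -s * centroid[0]],
                  [0.0, s, -s * centroid[1]],
                  [0.0, 0.0, 1.0]])
    homog = np.column_stack([pts, np.ones(len(pts))])
    return (homog @ T.T)[:, :2], T


def normalized_eight_point(x1, x2):
    """Estimate F (with x2^T F x1 = 0) from at least 8 inlier matches."""
    n1, T1 = _normalize(x1)
    n2, T2 = _normalize(x2)
    # Each row is the 8D point plus a constant 1 for the hyperplane offset.
    A = np.column_stack([matches_to_8d(n1, n2), np.ones(len(n1))])
    # The least-squares solution is the right singular vector with the
    # smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # Enforce rank 2 by zeroing the smallest singular value.
    U, S, Vt_f = np.linalg.svd(F)
    S[2] = 0.0
    F = U @ np.diag(S) @ Vt_f
    # Undo the normalization and fix the arbitrary scale.
    F = T2.T @ F @ T1
    return F / np.linalg.norm(F)
```

In a pipeline like the one the abstract describes, `matches_to_8d` would produce the sparse 8D input to the voting stage, and `normalized_eight_point` would be called only on the matches that survive it.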