A Quantitative Analysis of Current Practices in Optical Flow Estimation and the Principles Behind Them

  • Authors: Deqing Sun; Stefan Roth; Michael J. Black
  • Affiliations: School of Engineering and Applied Sciences, Harvard University, Cambridge, USA; Department of Computer Science, TU Darmstadt, Darmstadt, Germany; Max Planck Institute for Intelligent Systems, Tübingen, Germany
  • Venue: International Journal of Computer Vision
  • Year: 2014

Abstract

The accuracy of optical flow estimation algorithms has been improving steadily, as evidenced by results on the Middlebury optical flow benchmark. The typical formulation, however, has changed little since the work of Horn and Schunck. We attempt to uncover what has made recent advances possible through a thorough analysis of how the objective function, the optimization method, and modern implementation practices influence accuracy. We discover that "classical" flow formulations perform surprisingly well when combined with modern optimization and implementation techniques. One key implementation detail is the median filtering of intermediate flow fields during optimization. While this improves the robustness of classical methods, it actually leads to higher-energy solutions, meaning that these methods are not optimizing the original objective function. To understand the principles behind this phenomenon, we derive a new objective function that formalizes the median filtering heuristic. This objective function includes a non-local smoothness term that robustly integrates flow estimates over large spatial neighborhoods. By modifying this new term to include information about flow and image boundaries, we develop a method that can better preserve motion details. To take advantage of the trend towards video in wide-screen format, we further introduce an asymmetric pyramid downsampling scheme that enables the estimation of longer-range horizontal motions. The methods are evaluated on the Middlebury, MPI Sintel, and KITTI datasets using the same parameter settings.
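
The median filtering step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `median_filter_flow`, the 5x5 window, the edge-replicating border handling, and the synthetic flow field are all assumptions made for illustration. It only shows how a per-channel median filter suppresses an isolated outlier in an intermediate flow field.

```python
import numpy as np


def median_filter_flow(flow, size=5):
    """Median-filter each channel of an (H, W, 2) flow field.

    Hypothetical helper: the window size and border handling are
    illustrative assumptions, not values taken from the abstract.
    """
    pad = size // 2
    # Replicate border values so the output keeps the input shape.
    padded = np.pad(flow, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    # Shape (H, W, 2, size, size): one size x size window per pixel and channel.
    windows = np.lib.stride_tricks.sliding_window_view(
        padded, (size, size), axis=(0, 1)
    )
    # Take the median over the two window axes.
    return np.median(windows, axis=(-2, -1))


# Synthetic intermediate flow: uniform motion (u=1, v=0) with one outlier.
flow = np.zeros((8, 8, 2))
flow[..., 0] = 1.0
flow[4, 4] = (50.0, -50.0)

filtered = median_filter_flow(flow)
```

In this toy case every 5x5 window contains at most one outlier, so the filtered field returns to uniform motion. In a coarse-to-fine flow solver such a filter would be applied to the intermediate flow after each update step, which is the heuristic the abstract says the new objective function formalizes.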