This paper studies the use of temporal consistency to match appearance descriptors and to resolve complex ambiguities when computing dynamic depth maps from stereo. Previous attempts designed 3D descriptors over the spatiotemporal volume, but these have mostly been used for monocular action recognition, as they cannot deal with perspective changes. Our approach builds on a state-of-the-art 2D dense appearance descriptor, which we extend in time by means of optical flow priors, and which can be applied to wide-baseline stereo setups. The key idea is to capture how the appearance around a feature point changes over time, rather than attempting to describe the full spatiotemporal volume. We demonstrate the effectiveness of our method on highly ambiguous synthetic video sequences with ground-truth data, as well as on real sequences.
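The core mechanism described above — extending a 2D appearance descriptor in time by following a feature point along optical-flow trajectories and concatenating per-frame descriptors — can be sketched as follows. This is a minimal illustration under loud assumptions: `patch_descriptor` is a toy normalized-intensity patch standing in for a dense descriptor such as DAISY, and the flow fields are assumed to be precomputed per-pixel `(dx, dy)` displacements; it is not the authors' implementation.

```python
import numpy as np

def patch_descriptor(frame, x, y, radius=4):
    """Toy 2D appearance descriptor: an L2-normalized intensity patch.
    Stands in for a real dense descriptor (e.g. DAISY) -- assumption."""
    patch = frame[y - radius:y + radius + 1,
                  x - radius:x + radius + 1].astype(float)
    norm = np.linalg.norm(patch)
    return (patch / norm).ravel() if norm > 0 else patch.ravel()

def temporal_descriptor(frames, flows, x, y, radius=4):
    """Describe the change around (x, y) over time: track the point with
    the optical-flow prior and concatenate the 2D descriptor at each
    tracked location, instead of describing the spatiotemporal volume.

    frames: list of T grayscale images (2D arrays)
    flows:  list of T-1 flow fields, flows[t][y, x] = (dx, dy) from t to t+1
    """
    descs = [patch_descriptor(frames[0], x, y, radius)]
    for t, flow in enumerate(flows):
        dx, dy = flow[y, x]                    # flow prior at the current point
        x, y = int(round(x + dx)), int(round(y + dy))  # advect the point
        descs.append(patch_descriptor(frames[t + 1], x, y, radius))
    return np.concatenate(descs)
```

For matching across a wide-baseline stereo pair, the same temporal descriptor would be computed independently in each view and candidates compared with an ordinary distance (e.g. Euclidean), the temporal dimension serving to disambiguate points that look alike in any single frame.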