Dense sampling and fast encoding for 3D model retrieval using bag-of-visual features

  • Authors:
  • Takahiko Furuya;Ryutarou Ohbuchi

  • Affiliations:
  • University of Yamanashi, Takeda, Kofu-shi, Yamanashi-ken, Japan;University of Yamanashi, Takeda, Kofu-shi, Yamanashi-ken, Japan

  • Venue:
  • Proceedings of the ACM International Conference on Image and Video Retrieval
  • Year:
  • 2009

Abstract

Our previous shape-based 3D model retrieval algorithm compares 3D shapes by using thousands of local visual features per model. A 3D model is rendered into a set of depth images, and local visual features are extracted from each image by using Lowe's Scale Invariant Feature Transform (SIFT) algorithm. To compare large sets of local features efficiently, the algorithm employs a bag-of-features approach that integrates the local features into a single feature vector per model. The algorithm outperformed other methods on a dataset containing highly articulated yet geometrically simple 3D models. On a dataset containing diverse and detailed models, however, it performed only as well as other methods. This paper proposes an improved algorithm that performs as well as or better than our previous method for both articulated models and rigid but geometrically detailed models. The proposed algorithm extracts a much larger number of local visual features by sampling each depth image densely and randomly. To contain the computational cost, the method uses a GPU for SIFT feature extraction and an efficient randomized decision tree for encoding SIFT features into visual words. Empirical evaluation showed that the proposed method is very fast, yet significantly outperforms our previous method for rigid and geometrically detailed models. For the simple yet articulated models, the performance was virtually unchanged.
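
The bag-of-features step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes 128-dimensional descriptors (the usual SIFT length), fills them with random numbers instead of running SIFT, and replaces the paper's randomized decision tree with a brute-force nearest-codeword search. The function names sample_keypoints and encode_bag_of_features, the codebook size, and the image size are all hypothetical.

```python
import numpy as np


def sample_keypoints(height, width, n_samples, rng):
    """Draw random (row, col) sample positions inside a depth image."""
    rows = rng.integers(0, height, size=n_samples)
    cols = rng.integers(0, width, size=n_samples)
    return np.stack([rows, cols], axis=1)


def encode_bag_of_features(descriptors, codebook):
    """Assign each descriptor to its nearest codeword (visual word) and
    return the normalized histogram of visual words.

    Assumption: brute-force nearest-neighbor search stands in for the
    randomized decision tree encoder mentioned in the abstract.
    """
    # Squared Euclidean distance from every descriptor to every codeword.
    diffs = descriptors[:, None, :] - codebook[None, :, :]
    sq_dists = (diffs ** 2).sum(axis=2)
    words = sq_dists.argmin(axis=1)
    hist = np.bincount(words, minlength=codebook.shape[0]).astype(float)
    # Normalize so models with different feature counts stay comparable.
    return hist / max(hist.sum(), 1.0)


rng = np.random.default_rng(0)

# Hypothetical sizes: a 256x256 depth image, 300 samples, 16 visual words.
keypoints = sample_keypoints(256, 256, 300, rng)

# Placeholder descriptors; a real pipeline would compute SIFT at each keypoint.
descriptors = rng.random((len(keypoints), 128))
codebook = rng.random((16, 128))

histogram = encode_bag_of_features(descriptors, codebook)
```

Two models could then be compared by a distance between their histograms, which is what makes thousands of local features per model tractable to match.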