A multi-scale learning framework for visual categorization

  • Authors:
  • Shao-Chuan Wang;Yu-Chiang Frank Wang

  • Affiliations:
  • Research Center for Information Technology Innovation, Academia Sinica, Taipei, Taiwan;Research Center for Information Technology Innovation, Academia Sinica, Taipei, Taiwan

  • Venue:
  • ACCV'10 Proceedings of the 10th Asian conference on Computer vision - Volume Part I
  • Year:
  • 2010

Abstract

Spatial pyramid matching has recently become a promising technique for image classification. Despite its success and popularity, no prior work has tackled the problem of learning the optimal spatial pyramid representation for the given image data and the associated object category. We propose a Multiple Scale Learning (MSL) framework to learn the best weights for each scale in the pyramid. Our MSL algorithm produces class-specific spatial pyramid image representations and thus provides improved recognition performance. We approach the MSL problem as a multiple kernel learning (MKL) task, which determines the optimal combination of base kernels constructed at different pyramid levels. A wide range of experiments is conducted on the Oxford flower and Caltech-101 datasets, including the use of state-of-the-art feature encoding and pooling strategies. Finally, excellent empirical results reported on both datasets validate the feasibility of our proposed method.
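
The idea of combining base kernels built at different pyramid levels can be illustrated with a minimal sketch. It is not the authors' implementation: the histogram intersection kernel, the toy data, the codebook size, and the helper names `intersection_kernel` and `combine_kernels` are all assumptions made for illustration. The per-level weights are fixed by hand here, whereas the abstract describes learning them per class through MKL.

```python
import numpy as np

def intersection_kernel(H):
    """Histogram intersection kernel between all rows of H (n_images x n_bins)."""
    # K[i, j] = sum over bins b of min(H[i, b], H[j, b])
    return np.minimum(H[:, None, :], H[None, :, :]).sum(axis=2)

def combine_kernels(base_kernels, weights):
    """Weighted sum of per-level base kernels, with weights normalised to sum to 1."""
    w = np.asarray(weights, dtype=float)
    if np.any(w < 0):
        raise ValueError("weights must be non-negative")
    w = w / w.sum()
    return sum(wl * K for wl, K in zip(w, base_kernels))

# Toy data (hypothetical): 4 images, 3 pyramid levels with 1, 4 and 16 cells,
# and a 5-word codebook, so level l has cells * 5 histogram bins per image.
rng = np.random.default_rng(0)
level_histograms = [rng.random((4, cells * 5)) for cells in (1, 4, 16)]

# One base kernel per pyramid level.
base_kernels = [intersection_kernel(H) for H in level_histograms]

# Hand-picked weights stand in for the weights an MKL solver would learn.
K = combine_kernels(base_kernels, weights=[0.2, 0.3, 0.5])
```

The combined matrix `K` could then be passed to any classifier that accepts a precomputed kernel. In a class-specific setting, a separate weight vector would be used for each object category.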