Optimal operations for visual categorization

  • Authors: Yanyun Cheng, Yanyun Qu, Jianxin Huang, Tianzhu Fang, Shen Lu, Yi Xie
  • Affiliations: Xiamen University (all authors)
  • Venue: ICIMCS '10: Proceedings of the Second International Conference on Internet Multimedia Computing and Service
  • Year: 2010

Abstract

Bag-of-words is the state-of-the-art method for visual categorization. Its performance depends on four main operations: the detection of interest points, the description of interest points, the design of the classifier, and the construction of the codebook. In this paper, we focus on optimizing the first three operations. First, we compare several popular interest point detectors and propose an optimal detector that combines the MSER detector with the Hessian-Laplace detector to sample key points. This detector combines interest regions with interest points so that the image can be represented hierarchically. Second, we adopt SIFT to describe the sampled regions because our experimental results demonstrate that SIFT is more robust than other popular descriptors. Third, we use an SVM with an RBF kernel for object classification. The proposed classifier outperforms other classifiers in classification accuracy. To verify the three proposed operations, we evaluate them on two image datasets: Caltech and KTH-TIPS. The experimental results show that our optimal operations increase the accuracy of object categorization.
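
As a rough illustration of the bag-of-words representation and the RBF kernel mentioned in the abstract, the following minimal sketch quantizes local descriptors against a codebook and compares two images through their histograms. It is not the authors' implementation: the function names `bow_histogram` and `rbf_kernel`, the `gamma` value, and the toy 2-D descriptors and codebook are all hypothetical, chosen only to keep the example small (real SIFT descriptors are 128-dimensional, and the abstract does not say how the codebook is built).

```python
import math

def bow_histogram(descriptors, codebook):
    """Assign each descriptor to its nearest codeword and return
    an L1-normalized histogram of codeword counts."""
    counts = [0] * len(codebook)
    for d in descriptors:
        # Nearest codeword by squared Euclidean distance.
        nearest = min(
            range(len(codebook)),
            key=lambda k: sum((a - b) ** 2 for a, b in zip(d, codebook[k])),
        )
        counts[nearest] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(codebook)
    return [c / total for c in counts]

def rbf_kernel(x, y, gamma=1.0):
    """RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

# Hypothetical toy data: two codewords and two "images" of 2-D descriptors.
codebook = [(0.0, 0.0), (10.0, 10.0)]
image_a = [(1.0, 0.0), (0.0, 1.0), (9.0, 10.0)]
image_b = [(10.0, 9.0), (11.0, 10.0)]

hist_a = bow_histogram(image_a, codebook)
hist_b = bow_histogram(image_b, codebook)
similarity = rbf_kernel(hist_a, hist_b, gamma=1.0)
```

In a full pipeline, kernel values of this kind between image histograms would be supplied to an SVM trainer; the detector and descriptor stages that produce the descriptors are outside the scope of this sketch.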