Optimal operations for visual categorization

Authors:
Yanyun Cheng;Yanyun Qu;Jianxin Huang;Tianzhu Fang;Shen Lu;Yi Xie
Affiliations:
Xiamen University;Xiamen University;Xiamen University;Xiamen University;Xiamen University;Xiamen University
Venue:
ICIMCS '10 Proceedings of the Second International Conference on Internet Multimedia Computing and Service
Year:
2010

Abstract

Bag-of-words is the state-of-the-art method for visual categorization. Its performance depends on four main operations: interest point detection, interest point description, classifier design, and codebook construction. In this paper, we focus on optimizing the first three operations. First, we compare several popular interest point detectors and propose a detector that combines the MSER detector with the Hessian-Laplace detector to sample key points. Combining interest regions with interest points in this way allows the image to be represented hierarchically. Second, we adopt SIFT to describe the sampled regions, because our experimental results show that SIFT is more robust than other popular descriptors. Third, we use an SVM with an RBF kernel for object classification; this classifier outperforms other classifiers in classification accuracy. To verify the three proposed operations, we evaluate them on two image datasets, Caltech and KTH-TIPS. The experimental results show that the proposed operations increase the accuracy of object categorization.
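As an illustration of the bag-of-words representation the abstract refers to, the following is a minimal sketch, not the authors' implementation. It assumes local descriptors are already available as 128-dimensional vectors (the size of a SIFT descriptor) and uses synthetic data in their place. The codebook here is random rather than learned, the function names `bow_histogram` and `rbf_kernel` are hypothetical, and the `gamma` value is arbitrary. The sketch quantizes descriptors against the codebook to form a histogram, then evaluates the RBF kernel that an SVM would use to compare two such histograms.

```python
import numpy as np


def bow_histogram(descriptors, codebook):
    """Assign each descriptor to its nearest codeword; return an L1-normalized histogram."""
    # Squared Euclidean distance from every descriptor to every codeword.
    diffs = descriptors[:, None, :] - codebook[None, :, :]
    dists = np.sum(diffs ** 2, axis=2)
    nearest = np.argmin(dists, axis=1)
    hist = np.bincount(nearest, minlength=len(codebook)).astype(float)
    total = hist.sum()
    # Guard against an image with no descriptors.
    return hist / total if total > 0 else hist


def rbf_kernel(x, y, gamma=1.0):
    """RBF kernel k(x, y) = exp(-gamma * ||x - y||^2); gamma is an assumed value."""
    return float(np.exp(-gamma * np.sum((x - y) ** 2)))


# Synthetic stand-ins (assumptions): a random 8-word codebook and two "images"
# whose descriptors are noisy copies of selected codewords.
rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 128))
desc_a = codebook[[0, 0, 1, 2]] + 0.01 * rng.normal(size=(4, 128))
desc_b = codebook[[5, 6, 7, 7]] + 0.01 * rng.normal(size=(4, 128))

hist_a = bow_histogram(desc_a, codebook)
hist_b = bow_histogram(desc_b, codebook)

same = rbf_kernel(hist_a, hist_a)
different = rbf_kernel(hist_a, hist_b)
print(hist_a, hist_b, same, different)
```

In a full pipeline, the descriptors would come from a detector and descriptor such as those named in the abstract, the codebook would typically be learned by clustering training descriptors, and the histograms would be passed to an SVM trainer; those steps are outside this sketch.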