In this paper, we investigate a discriminative visual dictionary learning method for improving classification performance. Popular visual dictionary learning algorithms follow the K-means clustering philosophy and therefore cannot guarantee that the normalized visual word frequency vectors of samples from different classes, or with large label distances, are well separated. The rationale of this work is to harness sample label information to learn the visual dictionary in a supervised manner. This goal is formulated as an objective function in which each sample element (e.g., a SIFT descriptor) is expected to be close to its assigned visual word, while the normalized aggregate visual word frequency vectors are expected to be close to each other for samples of the same class and far apart for samples of different classes. By relaxing the hard binary constraints to soft nonnegative ones, a multiplicative nonnegative update procedure is proposed to optimize the objective function, together with a theoretical proof of convergence. Extensive experiments on classification tasks (natural scene and sports event classification) demonstrate the superiority of the proposed framework over conventional clustering-based visual dictionary learning.
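To make the two quantities in the objective concrete, the following is a minimal sketch of the conventional (unsupervised) baseline only: hard assignment of each descriptor to its nearest visual word, and the resulting normalized visual word frequency vector for one image. It is not the supervised method proposed here. The function names `assign_words` and `word_histogram`, the Euclidean distance, the L1 normalization, and the toy 2-D data are all assumptions made for illustration.

```python
import numpy as np

def assign_words(descriptors, dictionary):
    """Return, for each descriptor, the index of its nearest visual word
    and the squared Euclidean distance to that word (hard assignment)."""
    # Pairwise squared distances: descriptors (n, d) vs. words (k, d) -> (n, k).
    diffs = descriptors[:, None, :] - dictionary[None, :, :]
    sq_dists = (diffs ** 2).sum(axis=2)
    return sq_dists.argmin(axis=1), sq_dists.min(axis=1)

def word_histogram(descriptors, dictionary):
    """L1-normalized visual word frequency vector for one image."""
    word_idx, _ = assign_words(descriptors, dictionary)
    counts = np.bincount(word_idx, minlength=len(dictionary)).astype(float)
    return counts / counts.sum()

# Toy example (hypothetical data): two visual words, four 2-D descriptors.
dictionary = np.array([[0.0, 0.0], [10.0, 10.0]])
image_a = np.array([[0.1, -0.2], [0.3, 0.1], [9.8, 10.1], [0.0, 0.2]])

hist_a = word_histogram(image_a, dictionary)
print(hist_a)  # [0.75 0.25]
```

In this baseline the dictionary is fixed by clustering alone, so nothing pushes histograms such as `hist_a` apart across classes; the supervised objective described above adds that label-dependent term and relaxes the hard assignment to soft nonnegative weights.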