Weakly supervised codebook learning by iterative label propagation with graph quantization

Authors:
Liujuan Cao;Rongrong Ji;Wei Liu;Hongxun Yao;Qi Tian
Affiliations:
Harbin Engineering University, Harbin 150001, China;Columbia University, New York City 10027, United States and Harbin Institute of Technology, Harbin 150001, China;Columbia University, New York City 10027, United States;Harbin Institute of Technology, Harbin 150001, China;University of Texas at San Antonio, San Antonio 78249-1644, United States
Venue:
Signal Processing
Year:
2013

Citing 30

Solving the multiple instance problem with axis-parallel rectangles. Artificial Intelligence
Mean Shift Based Clustering in High Dimensions: A Texture Classification Example. ICCV '03 Proceedings of the Ninth IEEE International Conference on Computer Vision - Volume 2
Video Google: A Text Retrieval Approach to Object Matching in Videos. ICCV '03 Proceedings of the Ninth IEEE International Conference on Computer Vision - Volume 2
Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision
Creating Efficient Codebooks for Visual Recognition. ICCV '05 Proceedings of the Tenth IEEE International Conference on Computer Vision - Volume 1
Object Categorization by Learned Universal Visual Dictionary. ICCV '05 Proceedings of the Tenth IEEE International Conference on Computer Vision - Volume 2
Scalable Recognition with a Vocabulary Tree. CVPR '06 Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Volume 2
SVM-KNN: Discriminative Nearest Neighbor Classification for Visual Category Recognition. CVPR '06 Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Volume 2
Local Features and Kernels for Classification of Texture and Object Categories: A Comprehensive Study. International Journal of Computer Vision
Scene Classification Using a Hybrid Generative/Discriminative Approach. IEEE Transactions on Pattern Analysis and Machine Intelligence
Proactive learning: cost-sensitive active learning with multiple imperfect oracles. Proceedings of the 17th ACM conference on Information and knowledge management
Hamming Embedding and Weak Geometric Consistency for Large Scale Image Search. ECCV '08 Proceedings of the 10th European Conference on Computer Vision: Part I
Supervised Learning of Quantizer Codebooks by Information Loss Minimization. IEEE Transactions on Pattern Analysis and Machine Intelligence
Good learners for evil teachers. ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Supervised learning from multiple experts: whom to trust when everyone lies a bit. ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
WordNet: similarity - measuring the relatedness of concepts. AAAI'04 Proceedings of the 19th national conference on Artificial intelligence
A Convex Method for Locating Regions of Interest with Multi-instance Learning. ECML PKDD '09 Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part II
Unified video annotation via multigraph learning. IEEE Transactions on Circuits and Systems for Video Technology
Beyond distance measurement: constructing neighborhood similarity for video annotation. IEEE Transactions on Multimedia - Special section on communities and media computing
Actor-independent action search using spatiotemporal vocabulary with appearance hashing. Pattern Recognition
Towards low bit rate mobile visual search with multiple-channel coding. MM '11 Proceedings of the 19th ACM international conference on Multimedia
Adapted vocabularies for generic visual categorization. ECCV'06 Proceedings of the 9th European conference on Computer Vision - Volume Part IV
Visual cue cluster construction via information bottleneck principle and kernel density estimation. CIVR'05 Proceedings of the 4th international conference on Image and Video Retrieval
Location Discriminative Vocabulary Coding for Mobile Landmark Search. International Journal of Computer Vision
Context-Aware Semi-Local Feature Detector. ACM Transactions on Intelligent Systems and Technology (TIST)
A Multimedia Retrieval Framework Based on Semi-Supervised Ranking and Relevance Feedback. IEEE Transactions on Pattern Analysis and Machine Intelligence
Harmonizing Hierarchical Manifolds for Multimedia Document Semantics Understanding and Cross-Media Retrieval. IEEE Transactions on Multimedia
Towards a Relevant and Diverse Search of Social Images. IEEE Transactions on Multimedia
Learning compact visual descriptor for low bit rate mobile landmark search. IJCAI'11 Proceedings of the Twenty-Second international joint conference on Artificial Intelligence - Volume Three
Towards compact topical descriptors. CVPR '12 Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Abstract

The visual codebook is a fundamental component of many state-of-the-art visual search and object recognition systems. Most existing codebooks are built solely by unsupervised patch quantization, and few works have exploited image labels to supervise their construction. The key challenge is that image labels are global, whereas patch supervision should be local. Such imbalanced supervision is beyond the scope of most existing supervised codebooks [9,10,12-15,29]. In this paper, we propose a weakly supervised codebook learning framework that uses image labels to supervise codebook building in two steps. The Label Propagation step propagates image labels to local patches by multiple instance learning and instance selection [20,21]. The Graph Quantization step uses the resulting patch labels to build the codebook with Mean Shift. The two steps are co-optimized in an Expectation Maximization framework: the E-phase selects the patches that minimize the semantic distortion in quantization and propagates image labels to them, while the M-phase groups similar patches with related labels (modeled by WordNet [18]), which minimizes the visual distortion in quantization. In quantitative experiments on benchmark datasets, our codebook outperforms state-of-the-art unsupervised and supervised codebooks [1,10,11,25,29].
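
To make the alternation described in the abstract concrete, the following is a minimal toy sketch in Python, not the authors' implementation. Everything in it is an assumption made for illustration: the function names, the 2-D "patch descriptors", the `RELATEDNESS` table standing in for a WordNet score, the agreement score used as a proxy for semantic distortion, and the rule of keeping half of each image's patches. The M-step runs a Mean Shift whose kernel weight is visual closeness multiplied by label relatedness; the E-step keeps, per image, the patches whose codeword agrees best with the image label.

```python
import math

# Hypothetical stand-in for a WordNet relatedness score in [0, 1].
RELATEDNESS = {frozenset(("cat", "dog")): 0.6}


def label_sim(a, b):
    """Relatedness of two labels: 1 if identical, otherwise a table lookup."""
    return 1.0 if a == b else RELATEDNESS.get(frozenset((a, b)), 0.0)


def mean_shift(points, labels, bandwidth):
    """M-step (assumed form): Mean Shift weighted by visual and label similarity."""
    codewords = []
    for p, lp in zip(points, labels):
        x = p
        for _ in range(30):  # fixed number of shift iterations
            weights = [
                math.exp(-math.dist(x, q) ** 2 / (2 * bandwidth ** 2)) * label_sim(lp, lq)
                for q, lq in zip(points, labels)
            ]
            total = sum(weights)
            x = tuple(
                sum(w * q[d] for w, q in zip(weights, points)) / total
                for d in range(len(p))
            )
        # Keep the mode only if it is not within one bandwidth of a known codeword.
        if all(math.dist(x, c) >= bandwidth for c in codewords):
            codewords.append(x)
    return codewords


def quantize(p, codewords, bandwidth):
    """Index of the nearest codeword, or None if none lies within one bandwidth."""
    k = min(range(len(codewords)), key=lambda j: math.dist(p, codewords[j]))
    return k if math.dist(p, codewords[k]) < bandwidth else None


def select_patches(points, patch_labels, bags, codewords, bandwidth, keep=0.5):
    """E-step (assumed form): per image, keep the patches with the best label agreement."""
    assign = [quantize(p, codewords, bandwidth) for p in points]

    def agreement(i):
        if assign[i] is None:
            return 0.0
        members = [j for j, a in enumerate(assign) if a == assign[i]]
        return sum(label_sim(patch_labels[i], patch_labels[j]) for j in members) / len(members)

    selected = []
    for b in sorted(set(bags)):
        idx = [i for i, bag in enumerate(bags) if bag == b]
        idx.sort(key=agreement, reverse=True)
        selected.extend(sorted(idx[: max(1, round(keep * len(idx)))]))
    return selected


def learn_codebook(points, bags, bag_labels, bandwidth=1.0, max_rounds=5):
    """Alternate the M-step and E-step until the patch selection stops changing."""
    patch_labels = [bag_labels[b] for b in bags]  # start: every patch inherits its image label
    selected = list(range(len(points)))
    for _ in range(max_rounds):
        codewords = mean_shift(
            [points[i] for i in selected], [patch_labels[i] for i in selected], bandwidth
        )
        new_selected = select_patches(points, patch_labels, bags, codewords, bandwidth)
        if new_selected == selected:
            break
        selected = new_selected
    return codewords, selected


# Toy data: each image has two object patches followed by two background patches.
points = [
    (5.0, 5.0), (5.2, 4.9), (0.0, 0.1), (0.1, 0.0),        # image 0, "cat"
    (4.9, 5.1), (5.1, 5.2), (-0.1, 0.0), (0.0, -0.1),      # image 1, "cat"
    (-5.0, 5.0), (-5.2, 4.9), (0.1, 0.1), (-0.1, -0.1),    # image 2, "car"
    (-4.9, 5.1), (-5.1, 5.2), (0.1, -0.1), (-0.1, 0.1),    # image 3, "car"
]
bags = [0] * 4 + [1] * 4 + [2] * 4 + [3] * 4
bag_labels = {0: "cat", 1: "cat", 2: "car", 3: "car"}

codewords, selected = learn_codebook(points, bags, bag_labels)
print(len(codewords), selected)
```

On this toy data the background patches fall into a codeword shared by both labels, so their agreement is low and the E-step drops them; the second M-step then rebuilds the codebook from the object patches only. A real system would replace the lookup table with an actual WordNet relatedness measure and the per-image top-half rule with a proper multiple instance learning criterion.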