Solving the multiple instance problem with axis-parallel rectangles
Artificial Intelligence
Mean Shift Based Clustering in High Dimensions: A Texture Classification Example
ICCV '03 Proceedings of the Ninth IEEE International Conference on Computer Vision - Volume 2
Video Google: A Text Retrieval Approach to Object Matching in Videos
ICCV '03 Proceedings of the Ninth IEEE International Conference on Computer Vision - Volume 2
Distinctive Image Features from Scale-Invariant Keypoints
International Journal of Computer Vision
Creating Efficient Codebooks for Visual Recognition
ICCV '05 Proceedings of the Tenth IEEE International Conference on Computer Vision - Volume 1
Object Categorization by Learned Universal Visual Dictionary
ICCV '05 Proceedings of the Tenth IEEE International Conference on Computer Vision - Volume 2
Scalable Recognition with a Vocabulary Tree
CVPR '06 Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Volume 2
SVM-KNN: Discriminative Nearest Neighbor Classification for Visual Category Recognition
CVPR '06 Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Volume 2
International Journal of Computer Vision
Scene Classification Using a Hybrid Generative/Discriminative Approach
IEEE Transactions on Pattern Analysis and Machine Intelligence
Proactive learning: cost-sensitive active learning with multiple imperfect oracles
Proceedings of the 17th ACM conference on Information and knowledge management
Hamming Embedding and Weak Geometric Consistency for Large Scale Image Search
ECCV '08 Proceedings of the 10th European Conference on Computer Vision: Part I
Supervised Learning of Quantizer Codebooks by Information Loss Minimization
IEEE Transactions on Pattern Analysis and Machine Intelligence
Good learners for evil teachers
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Supervised learning from multiple experts: whom to trust when everyone lies a bit
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
WordNet: similarity - measuring the relatedness of concepts
AAAI'04 Proceedings of the 19th national conference on Artificial intelligence
A Convex Method for Locating Regions of Interest with Multi-instance Learning
ECML PKDD '09 Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part II
Unified video annotation via multigraph learning
IEEE Transactions on Circuits and Systems for Video Technology
Beyond distance measurement: constructing neighborhood similarity for video annotation
IEEE Transactions on Multimedia - Special section on communities and media computing
Towards low bit rate mobile visual search with multiple-channel coding
MM '11 Proceedings of the 19th ACM international conference on Multimedia
Adapted vocabularies for generic visual categorization
ECCV'06 Proceedings of the 9th European conference on Computer Vision - Volume Part IV
Visual cue cluster construction via information bottleneck principle and kernel density estimation
CIVR'05 Proceedings of the 4th international conference on Image and Video Retrieval
Location Discriminative Vocabulary Coding for Mobile Landmark Search
International Journal of Computer Vision
Context-Aware Semi-Local Feature Detector
ACM Transactions on Intelligent Systems and Technology (TIST)
A Multimedia Retrieval Framework Based on Semi-Supervised Ranking and Relevance Feedback
IEEE Transactions on Pattern Analysis and Machine Intelligence
IEEE Transactions on Multimedia
Towards a Relevant and Diverse Search of Social Images
IEEE Transactions on Multimedia
Learning compact visual descriptor for low bit rate mobile landmark search
IJCAI'11 Proceedings of the Twenty-Second international joint conference on Artificial Intelligence - Volume Three
Towards compact topical descriptors
CVPR '12 Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Visual codebooks serve as a fundamental component in many state-of-the-art visual search and object recognition systems. While most existing codebooks are built solely by unsupervised patch quantization, few works have exploited image labels to supervise codebook construction. The key challenge is that image labels are global, whereas patch supervision should be local. Such imbalanced supervision is beyond the scope of most existing supervised codebooks [9,10,12-15,29]. In this paper, we propose a weakly supervised codebook learning framework that integrates image labels to supervise codebook building in two steps. The Label Propagation step propagates image labels to local patches by multiple instance learning and instance selection [20,21]. The Graph Quantization step uses the patch labels to build the codebook with Mean Shift. Both steps are co-optimized in an Expectation Maximization framework: the E-phase selects the patches that minimize the semantic distortion in quantization to propagate image labels, while the M-phase groups similar patches with related labels (modeled by WordNet [18]), which minimizes the visual distortion in quantization. In quantitative experiments on benchmark datasets, our codebook outperforms state-of-the-art unsupervised and supervised codebooks [1,10,11,25,29].
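The Mean Shift building block mentioned above can be illustrated with a minimal sketch. This is not the paper's Graph Quantization step: it uses no labels, no WordNet similarity, and no EM alternation. It only shows plain flat-kernel mean shift turning patch descriptors into a codebook. The function names (`mean_shift_modes`, `merge_modes`, `quantize`), the `bandwidth` parameter, and the toy 2-D "descriptors" are all assumptions made for illustration.

```python
import numpy as np


def mean_shift_modes(X, bandwidth, n_iter=100, tol=1e-6):
    """Flat-kernel mean shift: move every point toward a local density mode."""
    X = np.asarray(X, dtype=float)
    modes = X.copy()
    for _ in range(n_iter):
        # Distances from each current mode to every original data point.
        dist = np.linalg.norm(modes[:, None, :] - X[None, :, :], axis=2)
        inside = (dist <= bandwidth).astype(float)  # shape (n_modes, n_points)
        # New position = mean of the data points inside the window.
        new_modes = inside @ X / inside.sum(axis=1, keepdims=True)
        shift = np.linalg.norm(new_modes - modes, axis=1).max()
        modes = new_modes
        if shift < tol:
            break
    return modes


def merge_modes(modes, bandwidth):
    """Greedily keep one representative per group of nearby modes."""
    codebook = []
    for m in modes:
        if all(np.linalg.norm(m - c) > bandwidth for c in codebook):
            codebook.append(m)
    return np.array(codebook)


def quantize(X, codebook):
    """Assign each descriptor to the index of its nearest codeword."""
    X = np.asarray(X, dtype=float)
    dist = np.linalg.norm(X[:, None, :] - codebook[None, :, :], axis=2)
    return dist.argmin(axis=1)


# Toy data (hypothetical): two tight groups of 2-D "patch descriptors".
patches = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1],
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1], [5.1, 5.1],
])
codebook = merge_modes(mean_shift_modes(patches, bandwidth=1.0), bandwidth=1.0)
words = quantize(patches, codebook)
print(len(codebook), words.tolist())
```

Unlike k-means, mean shift does not take the codebook size as input; the number of codewords follows from `bandwidth`. How the paper injects the propagated patch labels into this grouping is not specified in the abstract, so the sketch leaves that part out.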