Object Categorization by Learned Universal Visual Dictionary

Authors:
J. Winn; A. Criminisi; T. Minka
Affiliations:
Microsoft Research; Microsoft Research; Microsoft Research
Venue:
ICCV '05 Proceedings of the Tenth IEEE International Conference on Computer Vision - Volume 2
Year:
2005

Abstract

This paper presents a new algorithm for the automatic recognition of object classes from images (categorization). Compact yet discriminative appearance-based object class models are learned automatically from a set of training images. The method is simple and extremely fast, making it suitable for applications such as semantic image retrieval, web search, and interactive image editing. It classifies a region according to the proportions of different visual words (clusters in feature space). The specific visual words and the typical proportions in each object are learned from a segmented training set. The main contribution of this paper is twofold: i) an optimally compact visual dictionary is learned by pair-wise merging of visual words from an initially large dictionary, with the final visual words described by Gaussian mixture models (GMMs); ii) a novel statistical measure of discrimination is proposed, which is optimized by each merge operation. High classification accuracy is demonstrated for nine object classes on photographs of real objects viewed under general lighting conditions, poses and viewpoints. The set of test images used for validation comprises: i) photographs acquired by us, ii) images from the web and iii) images from the recently released Pascal dataset. The proposed algorithm performs well on both texture-rich objects (e.g. grass, sky, trees) and structure-rich ones (e.g. cars, bikes, planes).
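
The pipeline described in the abstract (histograms of visual-word proportions, followed by pair-wise merging of words into a smaller dictionary) can be illustrated with a toy Python sketch. This is not the authors' implementation. The abstract does not define the paper's statistical measure of discrimination, so the sketch substitutes a stand-in score (summed L1 distance between class histograms) and a nearest-histogram classifier, and it does not model the GMM description of the final words. All function names and the toy class histograms are hypothetical.

```python
from itertools import combinations


def histogram(word_ids, word_to_group, n_groups):
    """Proportion of each (possibly merged) visual word in a region."""
    counts = [0] * n_groups
    for w in word_ids:
        counts[word_to_group[w]] += 1
    total = sum(counts)
    return [c / total for c in counts]


def l1(a, b):
    """L1 distance between two histograms of equal length."""
    return sum(abs(x - y) for x, y in zip(a, b))


def separation(class_hists):
    """Stand-in discrimination score (an assumption, not the paper's measure):
    summed L1 distance over all pairs of class histograms."""
    return sum(l1(a, b) for a, b in combinations(class_hists.values(), 2))


def merge_bins(hist, i, j):
    """Return a copy of hist with bin j added into bin i (i < j) and removed."""
    merged = list(hist)
    merged[i] += merged[j]
    del merged[j]
    return merged


def greedy_merge(class_hists, target_size):
    """Repeatedly merge the pair of words whose merge keeps the score highest."""
    n_words = len(next(iter(class_hists.values())))
    groups = [[w] for w in range(n_words)]  # original word ids per merged word
    hists = {name: list(h) for name, h in class_hists.items()}
    while len(groups) > target_size:
        best_pair, best_score = None, float("-inf")
        for i, j in combinations(range(len(groups)), 2):
            trial = {name: merge_bins(h, i, j) for name, h in hists.items()}
            score = separation(trial)
            if score > best_score:
                best_pair, best_score = (i, j), score
        i, j = best_pair
        groups[i] = groups[i] + groups[j]
        del groups[j]
        hists = {name: merge_bins(h, i, j) for name, h in hists.items()}
    return groups, hists


def classify(region_hist, class_hists):
    """Assign the class whose histogram is closest in L1 distance."""
    return min(class_hists, key=lambda name: l1(region_hist, class_hists[name]))


# Hypothetical class histograms over an initial dictionary of four words.
class_hists = {
    "grass": [0.5, 0.4, 0.05, 0.05],
    "car": [0.05, 0.05, 0.5, 0.4],
}
groups, merged_hists = greedy_merge(class_hists, target_size=2)

# Map each original word id to its merged word, then classify a toy region.
word_to_group = {w: g for g, members in enumerate(groups) for w in members}
region = [0, 1, 1, 0, 2]  # word ids observed in a region
label = classify(histogram(region, word_to_group, len(groups)), merged_hists)
```

With the L1 stand-in, merging two bins can never increase the score, so each greedy step selects the merge that loses the least separation between classes. The measure proposed in the paper may behave differently.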