Supervised learning of Gaussian mixture models for visual vocabulary generation

Authors:
Basura Fernando;Elisa Fromont;Damien Muselet;Marc Sebban
Affiliations:
Université de Lyon, F-42023, Saint-Étienne, France and CNRS, UMR 5516, Laboratoire Hubert Curien, F-42000 Saint-Étienne, France and Université de Saint-Étienne, Jean-Monne ...;Université de Lyon, F-42023, Saint-Étienne, France and CNRS, UMR 5516, Laboratoire Hubert Curien, F-42000 Saint-Étienne, France and Université de Saint-Étienne, Jean-Monne ...;Université de Lyon, F-42023, Saint-Étienne, France and CNRS, UMR 5516, Laboratoire Hubert Curien, F-42000 Saint-Étienne, France and Université de Saint-Étienne, Jean-Monne ...;Université de Lyon, F-42023, Saint-Étienne, France and CNRS, UMR 5516, Laboratoire Hubert Curien, F-42000 Saint-Étienne, France and Université de Saint-Étienne, Jean-Monne ...
Venue:
Pattern Recognition
Year:
2012

Abstract

The creation of semantically relevant clusters is vital in bag-of-visual-words models, which are known to be very successful at image classification tasks. Generally, unsupervised clustering algorithms, such as K-means, are employed to create such clusters, from which visual dictionaries are deduced. K-means performs a hard assignment by associating each image descriptor with the cluster that has the nearest mean. In this way, the within-cluster sum of squared distances is minimized. A limitation of this approach in the context of image classification is that it usually does not use any supervision, which limits the discriminative power of the resulting visual words (typically the centroids of the clusters). More recently, some supervised dictionary creation methods based on both supervised information and data fitting were proposed, leading to more discriminative visual words. However, none of them considers the uncertainty present at both the image descriptor and cluster levels. In this paper, we propose a supervised learning algorithm based on a Gaussian mixture model which not only generalizes the K-means algorithm by allowing soft assignments, but also exploits supervised information to improve the discriminative power of the clusters. Technically, our algorithm aims at optimizing, using an EM-based approach, a convex combination of two criteria: the first is unsupervised and based on the likelihood of the training data; the second is supervised and takes into account the purity of the clusters. We show on two well-known datasets that our method is able to create more relevant clusters by comparing its behavior with state-of-the-art dictionary creation methods.
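
The contrast between hard and soft assignments, and the idea of a convex combination of a likelihood term and a purity term, can be illustrated with a small sketch. This is not the algorithm from the paper: the abstract does not give the exact form of the purity criterion, the scaling of the two terms, or the EM update rules. The spherical covariances, the function names, the responsibility-weighted purity and the parameter `alpha` below are all assumptions made for illustration only.

```python
import numpy as np


def hard_assignments(X, means):
    """K-means style assignment: index of the nearest mean for each descriptor."""
    sq_dist = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)  # (n, k)
    return sq_dist.argmin(axis=1)


def soft_assignments(X, means, variances, weights):
    """E-step of a spherical Gaussian mixture (assumed covariance model).

    Returns the (n, k) matrix of posterior responsibilities and the total
    log-likelihood of the data under the mixture.
    """
    d = X.shape[1]
    sq_dist = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)  # (n, k)
    log_pdf = -0.5 * (sq_dist / variances + d * np.log(2 * np.pi * variances))
    log_joint = np.log(weights) + log_pdf
    # Log-sum-exp over components for numerical stability.
    log_norm = np.logaddexp.reduce(log_joint, axis=1, keepdims=True)
    return np.exp(log_joint - log_norm), float(log_norm.sum())


def soft_purity(resp, labels, n_classes):
    """Assumed purity measure: share of soft mass held by each cluster's
    dominant class, summed over clusters and divided by the number of points."""
    onehot = np.eye(n_classes)[labels]   # (n, c)
    class_mass = resp.T @ onehot         # (k, c) soft class counts per cluster
    return float(class_mass.max(axis=1).sum() / resp.shape[0])


def combined_objective(X, labels, means, variances, weights, alpha, n_classes):
    """Convex combination of an unsupervised and a supervised criterion.

    alpha = 1 keeps only the per-point log-likelihood; alpha = 0 keeps only
    the purity. The per-point scaling is an assumption of this sketch.
    """
    resp, loglik = soft_assignments(X, means, variances, weights)
    purity = soft_purity(resp, labels, n_classes)
    return alpha * loglik / X.shape[0] + (1.0 - alpha) * purity
```

With equal mixture weights and equal variances, taking the argmax of the responsibilities reproduces the nearest-mean assignment of K-means, which is one way to see the mixture model as a generalization of K-means. An EM-style procedure would alternate the E-step above with parameter updates that increase the combined objective; those updates are specific to the paper and are not reproduced here.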