Style modeling for tagging personal photo collections

  • Authors: Manni Duan (USTC, Hefei, China), Adrian Ulges (German Research Center for Artificial Intelligence (DFKI), Kaiserslautern, Germany), Thomas M. Breuel (DFKI and TU Kaiserslautern, Kaiserslautern, Germany), Xiu-qing Wu (USTC, Hefei, China)

  • Venue: Proceedings of the ACM International Conference on Image and Video Retrieval (CIVR)
  • Year: 2009

Abstract

While current image annotation methods treat each input image individually, users in practice tend to take multiple pictures at the same location, with the same setup, or over the same trip, so that the images to be labeled come in groups sharing a coherent "style". We present an approach for annotating such style-consistent batches of pictures. The method is inspired by previous work in handwriting recognition and models style as a latent random variable. For each style, a separate image annotation model is learned. When annotating a batch of images, the style is inferred by maximum likelihood over the whole batch, and the corresponding style-specific model is used for accurate tagging. In quantitative experiments on the COREL dataset and on real-world photos downloaded from Flickr, we demonstrate that, by exploiting the fact that images come in style-consistent groups, our approach outperforms several baselines that tag images individually. Relative performance improvements of up to 80% are achieved, and on the COREL-5K benchmark the proposed method yields a mean recall/precision of 39%/25%, the best result reported to date.
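To make the batch-level inference concrete, here is a minimal sketch in Python. It assumes one generative model per style (a diagonal Gaussian over image features, a simplification not specified in the abstract); the names StyleModel and annotate_batch and the toy taggers are illustrative, not the authors' implementation. The style is chosen by maximizing the summed log-likelihood over the whole batch, and that style's tagger then labels every image in it.

```python
import numpy as np

class StyleModel:
    """One style: a feature-likelihood model plus a style-specific tagger.
    The diagonal-Gaussian likelihood is an assumed simplification."""

    def __init__(self, mean, var, tagger):
        self.mean = np.asarray(mean, dtype=float)  # mean feature vector
        self.var = np.asarray(var, dtype=float)    # per-dimension variance
        self.tagger = tagger                       # x -> list of tags

    def log_likelihood(self, x):
        # log p(x | style) for a diagonal Gaussian, dropping constants
        diff = np.asarray(x, dtype=float) - self.mean
        return -0.5 * float(np.sum(diff ** 2 / self.var + np.log(self.var)))

def annotate_batch(batch, style_models):
    """Pick the style by maximum likelihood over the whole batch,
    then tag every image with that style's model."""
    # Summing per-image log-likelihoods assumes the images are
    # conditionally independent given the shared latent style.
    scores = [sum(m.log_likelihood(x) for x in batch) for m in style_models]
    best = style_models[int(np.argmax(scores))]
    return [best.tagger(x) for x in batch]

# Toy usage: two made-up styles over 2-D features.
indoor = StyleModel([0.2, 0.8], [0.05, 0.05], lambda x: ["indoor", "people"])
beach = StyleModel([0.9, 0.1], [0.05, 0.05], lambda x: ["beach", "sea"])
batch = [[0.85, 0.15], [0.92, 0.08], [0.88, 0.12]]
print(annotate_batch(batch, [indoor, beach]))  # all three tagged as beach
```

Making one hard style decision per batch, rather than per image, is what lets the shared context correct ambiguous individual pictures: a single atypical image is still tagged with the style that best explains the batch as a whole.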