To go beyond the query-by-example paradigm in image retrieval, large image collections need semantic indexing that supports intuitive text-based image search. Different models have been proposed to learn the dependencies between the visual content of an image set and the associated text captions, which then allows semantic indices to be created automatically for unannotated images. The task, however, remains unsolved. In this paper, we present three alternative ways of learning a Probabilistic Latent Semantic Analysis (PLSA) model for annotated images, and evaluate their respective performance for automatic image indexing. Under the PLSA assumptions, an image is modeled as a mixture of latent aspects that generates both image features and text captions, and we investigate three ways to learn this mixture of aspects. We also propose an image representation that is more discriminative than the traditional Blob histogram, obtained by concatenating quantized local color information and quantized local texture descriptors. The first learning procedure for a PLSA model of annotated images is a standard EM algorithm, which implicitly assumes that the visual and the textual modalities can be treated equivalently. The other two procedures are based on asymmetric PLSA learning, which allows the definition of the latent space to be constrained by either the visual or the textual modality. We demonstrate that the textual modality is better suited to learning a semantically meaningful latent space, which translates into improved annotation performance. We compare our learning algorithms with recent methods on a standard dataset, and a detailed performance evaluation supports the validity of our framework.
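As an illustration of the asymmetric learning idea, the following minimal sketch shows one plausible reading of the text-constrained variant: the aspect mixture P(z | d) is first learned from captions alone, then held fixed while the visual distribution P(v | z) is fitted, and an unannotated image is indexed by folding in its visual counts. This is not the implementation evaluated in the paper. The function names (`plsa_em`, `annotate_from_text_space`), the count-matrix inputs, the iteration count, and the smoothing constant are all assumptions made for the example.

```python
import numpy as np

EPS = 1e-12  # assumed smoothing constant to avoid division by zero


def _normalize(m):
    """Scale each row of m so that it sums to (approximately) one."""
    return m / (m.sum(axis=1, keepdims=True) + EPS)


def plsa_em(counts, n_aspects, fixed_p_z_d=None, fixed_p_t_z=None,
            n_iter=50, seed=0):
    """Fit PLSA by EM on a (documents x terms) count matrix.

    Any distribution passed as fixed_* is held constant; the other is learned.
    Returns P(z | d) with shape (D, K) and P(term | z) with shape (K, T).
    """
    rng = np.random.default_rng(seed)
    n_docs, n_terms = counts.shape
    p_z_d = (_normalize(rng.random((n_docs, n_aspects)))
             if fixed_p_z_d is None else fixed_p_z_d)
    p_t_z = (_normalize(rng.random((n_aspects, n_terms)))
             if fixed_p_t_z is None else fixed_p_t_z)
    for _ in range(n_iter):
        # E-step: posterior P(z | d, t), shape (D, K, T).
        post = p_z_d[:, :, None] * p_t_z[None, :, :]
        post /= post.sum(axis=1, keepdims=True) + EPS
        # M-step: re-estimate from expected counts n(d, t) * P(z | d, t).
        weighted = counts[:, None, :] * post
        if fixed_p_t_z is None:
            p_t_z = _normalize(weighted.sum(axis=0))
        if fixed_p_z_d is None:
            p_z_d = _normalize(weighted.sum(axis=2))
    return p_z_d, p_t_z


def annotate_from_text_space(text_counts, visual_counts, new_visual_counts,
                             n_aspects):
    """Text-constrained asymmetric learning followed by annotation."""
    # 1. Define the latent space on the captions only.
    p_z_d, p_w_z = plsa_em(text_counts, n_aspects)
    # 2. Keep P(z | d) fixed and learn P(visual term | z).
    _, p_v_z = plsa_em(visual_counts, n_aspects, fixed_p_z_d=p_z_d)
    # 3. Fold in unannotated images: learn only their P(z | d).
    p_z_new, _ = plsa_em(new_visual_counts, n_aspects, fixed_p_t_z=p_v_z)
    # 4. Word posterior P(w | d_new) = sum_z P(w | z) P(z | d_new).
    return p_z_new @ p_w_z
```

Each row of the returned matrix is a distribution over caption words for one unannotated image, so ranking its entries yields candidate annotations. The dense (D, K, T) posterior keeps the sketch short but is only practical for small toy inputs.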