Unsupervised learning by probabilistic latent semantic analysis
Machine Learning
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
The Journal of Machine Learning Research
The Journal of Machine Learning Research
On Intelligence
Distinctive Image Features from Scale-Invariant Keypoints
International Journal of Computer Vision
PLSA-based image auto-annotation: constraining the latent space
Proceedings of the 12th annual ACM international conference on Multimedia
Shape Matching and Object Recognition Using Low Distortion Correspondences
CVPR '05 Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05) - Volume 1
Scalable Recognition with a Vocabulary Tree
CVPR '06 Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Volume 2
Image retrieval on large-scale image databases
Proceedings of the 6th ACM international conference on Image and video retrieval
Proceedings of the 15th international conference on Multimedia
Continuous visual vocabulary models for pLSA-based scene recognition
CIVR '08 Proceedings of the 2008 international conference on Content-based image and video retrieval
Deep networks for image retrieval on large-scale databases
MM '08 Proceedings of the 16th ACM international conference on Multimedia
SURF: speeded up robust features
ECCV'06 Proceedings of the 9th European conference on Computer Vision - Volume Part I
ECCV'06 Proceedings of the 9th European conference on Computer Vision - Volume Part IV
Multimodal ranking for image search on community databases
Proceedings of the international conference on Multimedia information retrieval
Multi modal semantic indexing for image retrieval
Proceedings of the ACM International Conference on Image and Video Retrieval
ECCV'10 Proceedings of the 11th European conference on Computer vision: Part II
Correlated PLSA for image clustering
MMM'11 Proceedings of the 17th international conference on Advances in multimedia modeling - Volume Part I
Toward a higher-level visual representation for content-based image retrieval
Proceedings of the 8th International Conference on Advances in Mobile Computing and Multimedia
Multi-feature pLSA for combining visual features in image annotation
MM '11 Proceedings of the 19th ACM international conference on Multimedia
Leveraging community metadata for multimodal image ranking
Multimedia Tools and Applications
Topic based query suggestions for video search
MMM'12 Proceedings of the 18th international conference on Advances in Multimedia Modeling
Toward a higher-level visual representation for content-based image retrieval
Multimedia Tools and Applications
An automated vision based on-line novel percept detection method for a mobile robot
Robotics and Autonomous Systems
Multimedia Tools and Applications
A semantic model for cross-modal and multi-modal retrieval
Proceedings of the 3rd ACM conference on International conference on multimedia retrieval
High order pLSA for indexing tagged images
Signal Processing
Web media semantic concept retrieval via tag removal and model fusion
ACM Transactions on Intelligent Systems and Technology (TIST) - Survey papers, special sections on the semantic adaptive social web, intelligent systems for health informatics, regular papers
A feature-word-topic model for image annotation and retrieval
ACM Transactions on the Web (TWEB)
It is the current state of knowledge that our neocortex consists of six layers [10]. We take this knowledge from neuroscience as inspiration to extend the standard single-layer probabilistic Latent Semantic Analysis (pLSA) [13] to multiple layers. As multiple layers should naturally handle multiple modalities and a hierarchy of abstractions, we call this new approach multilayer multimodal probabilistic Latent Semantic Analysis (mm-pLSA). We derive the training and inference rules for the smallest possible non-degenerate mm-pLSA model: a model with two leaf pLSAs (here from two different data modalities: image tags and visual image features) and a single top-level pLSA node merging the two leaf pLSAs. From this derivation it is obvious how to extend the learning and inference rules to more modalities and more layers. We also propose a fast, strictly stepwise forward procedure to initialize the mm-pLSA model bottom-up, which can then be post-optimized by the general mm-pLSA learning algorithm. We evaluate the proposed approach experimentally in a query-by-example retrieval task using 50-dimensional topic vectors as image models. We compare several variants of our mm-pLSA system to systems relying solely on visual features or tag features and analyze possible pitfalls of mm-pLSA training. The best variant of the proposed mm-pLSA system outperforms the unimodal systems by approximately 19% in our query-by-example task.
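As a rough illustration of the two-leaf, one-top-node structure described in the abstract, the following minimal Python sketch fits standard single-layer pLSA with EM and then stacks it bottom-up. It is not the joint mm-pLSA learning algorithm derived in the paper. It assumes one plausible reading of the stepwise forward initialization, namely that the leaf topic mixtures of each image are concatenated and used as observations for the top-level pLSA. The function names (`plsa`, `mm_plsa_forward_init`, `rank_by_example`), the iteration counts, and the use of cosine similarity for ranking are all hypothetical choices made for illustration; the abstract does not specify them.

```python
import numpy as np

def plsa(counts, n_topics, n_iter=50, seed=0):
    """Standard single-layer pLSA fitted with EM.

    counts: (n_docs, n_words) non-negative co-occurrence matrix.
    Returns P(z|d) with shape (n_docs, n_topics) and
    P(w|z) with shape (n_topics, n_words).
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    eps = 1e-12  # guards against division by zero
    for _ in range(n_iter):
        # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z)
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]      # (d, z, w)
        post = joint / (joint.sum(axis=1, keepdims=True) + eps)
        # M-step: re-estimate both factors from expected counts
        weighted = counts[:, None, :] * post               # (d, z, w)
        p_w_z = weighted.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + eps
        p_z_d = weighted.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + eps
    return p_z_d, p_w_z

def mm_plsa_forward_init(tag_counts, visual_counts, n_leaf_topics, n_top_topics):
    """Bottom-up initialization (assumed reading, not the paper's exact rules)."""
    # Step 1: fit one leaf pLSA per modality, independently.
    p_zt_d, _ = plsa(tag_counts, n_leaf_topics, seed=1)
    p_zv_d, _ = plsa(visual_counts, n_leaf_topics, seed=2)
    # Step 2 (assumption): treat the concatenated leaf topic mixtures
    # as the "word" observations of the top-level pLSA.
    leaf_mix = np.hstack([p_zt_d, p_zv_d])
    p_top_d, _ = plsa(leaf_mix, n_top_topics, seed=3)
    return p_top_d

def rank_by_example(topic_vectors, query_index):
    """Rank all images by cosine similarity to the query (assumed measure)."""
    norms = np.linalg.norm(topic_vectors, axis=1, keepdims=True) + 1e-12
    unit = topic_vectors / norms
    sims = unit @ unit[query_index]
    return np.argsort(-sims)  # most similar first

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    tags = rng.integers(0, 5, size=(12, 20))      # toy image-by-tag counts
    visual = rng.integers(0, 5, size=(12, 30))    # toy image-by-visual-word counts
    top_vectors = mm_plsa_forward_init(tags, visual, n_leaf_topics=4, n_top_topics=3)
    print(rank_by_example(top_vectors, query_index=0))
```

The dense `(docs, topics, words)` posterior array keeps the sketch short but is only practical for toy data; a realistic setting with 50-dimensional topic vectors over a large image collection would need a sparse or batched E-step.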