Recent years have witnessed extensive studies on distance metric learning (DML) for improving similarity search in multimedia information retrieval tasks. Despite their successes, most existing DML methods suffer from two critical limitations: (i) they typically learn a linear distance function on the input feature space, and this linearity assumption limits their capacity to measure similarity on the complex patterns found in real-world applications; (ii) they are often designed for uni-modal data, and so may not effectively handle similarity measures for multimedia objects with multimodal representations. To address these limitations, we propose a novel framework of online multimodal deep similarity learning (OMDSL), which aims to optimally integrate multiple deep neural networks pretrained with stacked denoising autoencoders. In particular, the proposed framework explores a unified two-stage online learning scheme that consists of (i) learning a flexible nonlinear transformation function for each individual modality, and (ii) learning the optimal combination of multiple diverse modalities simultaneously in a coherent process. We conduct an extensive set of experiments to evaluate the proposed algorithms on multimodal image retrieval tasks, and the encouraging results validate the effectiveness of the proposed technique.
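To make the two-stage idea concrete, the following is a minimal illustrative sketch, not the OMDSL algorithm itself. It assumes that each modality has its own nonlinear embedding (here a single random tanh layer standing in for a pretrained network, held fixed for brevity) and that modality weights are adjusted online from triplets with a Hedge-style multiplicative update. The update rule, the triplet loss, and every name below (`make_layer`, `embed`, `hedge_update`, `triplet_losses`, `combined_distance`, `beta`, `margin`) are assumptions introduced for illustration; the abstract does not specify them.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Random weights standing in for a pretrained encoder layer (hypothetical)."""
    scale = 1.0 / math.sqrt(n_in)
    return [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]


def embed(layer, x):
    """Nonlinear per-modality map: tanh of a linear projection."""
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in layer]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def triplet_losses(layers, query, pos, neg, margin=0.0):
    """Per-modality 0/1 loss: 1 if the negative is not farther than the positive."""
    losses = []
    for layer, q, p, n in zip(layers, query, pos, neg):
        eq, ep, en = embed(layer, q), embed(layer, p), embed(layer, n)
        wrong = sq_dist(eq, ep) + margin >= sq_dist(eq, en)
        losses.append(1.0 if wrong else 0.0)
    return losses


def hedge_update(weights, losses, beta=0.8):
    """Hedge step (assumed rule): shrink the weight of modalities that erred."""
    scaled = [w * (beta ** l) for w, l in zip(weights, losses)]
    total = sum(scaled)
    return [s / total for s in scaled]


def combined_distance(layers, weights, a, b):
    """Weighted sum of per-modality distances in the embedded spaces."""
    return sum(w * sq_dist(embed(l, x), embed(l, y))
               for w, l, x, y in zip(weights, layers, a, b))


if __name__ == "__main__":
    rng = random.Random(0)
    dim = 4
    layers = [make_layer(dim, 3, rng), make_layer(dim, 3, rng)]
    weights = [0.5, 0.5]

    def noise():
        return [rng.gauss(0.0, 1.0) for _ in range(dim)]

    # Synthetic stream: modality 0 carries signal, modality 1 is pure noise.
    for _ in range(50):
        q0 = noise()
        p0 = [v + rng.gauss(0.0, 0.05) for v in q0]
        query, pos, neg = [q0, noise()], [p0, noise()], [noise(), noise()]
        weights = hedge_update(weights, triplet_losses(layers, query, pos, neg))

    print("modality weights:", weights)
    print("distance to last positive:",
          combined_distance(layers, weights, query, pos))
```

In this toy stream the first modality is built to be informative and the second is noise, so the update is expected to shift weight toward the first. A fuller treatment would also update each modality's network online rather than keeping it fixed.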