Cross-modal correlation learning for clustering on image-audio dataset

  • Authors:
  • Hong Zhang; Yueting Zhuang; Fei Wu

  • Affiliations:
  • Zhejiang University, Hangzhou, China & Wuhan University of Science & Technology, Wuhan, China; Zhejiang University, Hangzhou, China; Zhejiang University, Hangzhou, China

  • Venue:
  • Proceedings of the 15th International Conference on Multimedia
  • Year:
  • 2007

Abstract

It is interesting and challenging to explore correlations between different datasets and to utilize such correlations for clustering on those datasets. Cross-modal correlation between images and audio clips can help identify images (or audio clips) of certain semantics. However, the heterogeneity of visual and auditory features makes it difficult to learn cross-modal correlation between them. In this paper, we first analyze the canonical correlation between the feature matrices of images and audio clips during subspace mapping; second, we design a correlation-based similarity reinforcement for images and audio clips; third, we perform image clustering and audio clustering with affinity propagation. Experimental results on an image-audio dataset are encouraging and show that our approach is effective. We also present an interesting application: querying images by audio examples.
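
The abstract outlines a three-step pipeline: canonical correlation analysis (CCA) for subspace mapping, correlation-based similarity reinforcement, and affinity propagation clustering. The Python sketch below is only a minimal illustration of that pipeline under assumptions of my own; the feature matrices are synthetic, the reinforcement step is a simplified stand-in rather than the paper's actual formulation, and all parameter choices (n_components, alpha) are invented.

import numpy as np
from sklearn.cross_decomposition import CCA
from sklearn.cluster import AffinityPropagation
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical paired feature matrices: one row per image-audio pair.
rng = np.random.default_rng(0)
X_img = rng.standard_normal((200, 64))   # image features (placeholder)
X_aud = rng.standard_normal((200, 32))   # audio features (placeholder)

# Step 1: map both modalities into a shared subspace with CCA
# (a stand-in for the paper's subspace mapping).
cca = CCA(n_components=10)
Z_img, Z_aud = cca.fit_transform(X_img, X_aud)

# Step 2: correlation-based similarity reinforcement (simplified guess):
# blend each modality's own similarity with cross-modal similarity.
S_img = cosine_similarity(Z_img)
S_aud = cosine_similarity(Z_aud)
S_cross = cosine_similarity(Z_img, Z_aud)
n = S_cross.shape[0]
alpha = 0.5
S_img_r = alpha * S_img + (1 - alpha) * (S_cross @ S_cross.T) / n
S_aud_r = alpha * S_aud + (1 - alpha) * (S_cross.T @ S_cross) / n

# Step 3: cluster each modality with affinity propagation on the
# precomputed (reinforced) similarity matrices.
img_labels = AffinityPropagation(affinity="precomputed", random_state=0).fit_predict(S_img_r)
aud_labels = AffinityPropagation(affinity="precomputed", random_state=0).fit_predict(S_aud_r)

print(len(set(img_labels)), "image clusters;", len(set(aud_labels)), "audio clusters")

With real descriptors in place of the random matrices, the reinforced image similarities could also be ranked against an audio query's subspace representation, which is roughly the flavor of the query-by-audio-example application mentioned in the abstract.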