Understanding multimedia document semantics for cross-media retrieval

Authors:
Fei Wu;Yi Yang;Yueting Zhuang;Yunhe Pan
Affiliations:
College of Computer Science and Engineering, Zhejiang University, Hangzhou, P.R. China;College of Computer Science and Engineering, Zhejiang University, Hangzhou, P.R. China;College of Computer Science and Engineering, Zhejiang University, Hangzhou, P.R. China;College of Computer Science and Engineering, Zhejiang University, Hangzhou, P.R. China
Venue:
PCM'05 Proceedings of the 6th Pacific-Rim conference on Advances in Multimedia Information Processing - Volume Part I
Year:
2005

Abstract

A multimedia document (MMD), such as a Web page or a multimedia encyclopedia, is composed of media objects of different modalities, and its integrated semantics is expressed by the combination of all the media objects in it. Since the content of MMDs is enormous and their number is increasing rapidly, effective management of MMDs is in great demand. It is also valuable to provide users with cross-media retrieval facilities so that they can query media objects by examples of different modalities; e.g., users may query an MMD (or an image) by submitting an audio clip, and vice versa. However, two challenges stand in the way of these goals. First, how can we represent an MMD and fuse its media objects together to achieve cross-indexing and facilitate cross-media retrieval? Second, how can we understand MMD semantics? Taking these two problems into account, we give a definition of MMD and propose a manifold learning method to discover MMD semantics. We first construct an MMD semi-semantic graph (SSG) and then adopt multidimensional scaling to create an MMD semantic space (MMDSS). We also propose two stages of feedback: the first is used to refine the SSG, and the second is adopted to introduce new MMDs that are not yet in the MMDSS into it. Since all of the MMDs and their component media objects of different modalities lie in the MMDSS, cross-media retrieval can be easily performed. Experimental results are encouraging and indicate that the proposed approach is effective.
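
The abstract does not say how the SSG is built or how multidimensional scaling is applied to it, so the following is only a rough sketch of the general graph-then-MDS idea, not the authors' method. It assumes, hypothetically, that a pairwise distance matrix between MMDs is already available, that the graph is a symmetric k-nearest-neighbour graph, and that graph distances are shortest-path lengths. The function names (knn_graph, shortest_paths, classical_mds, nearest) are illustrative only, and the eigenvectors are found by simple power iteration to keep the example free of external libraries.

```python
import math


def knn_graph(dist, k):
    """Symmetric k-nearest-neighbour graph (assumed stand-in for the SSG)."""
    n = len(dist)
    graph = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        graph[i][i] = 0.0
        others = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        for j in others[:k]:
            graph[i][j] = graph[j][i] = dist[i][j]
    return graph


def shortest_paths(graph):
    """All-pairs shortest paths (Floyd-Warshall) used as graph distances."""
    n = len(graph)
    d = [row[:] for row in graph]
    for m in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][m] + d[m][j] < d[i][j]:
                    d[i][j] = d[i][m] + d[m][j]
    if any(math.isinf(x) for row in d for x in row):
        raise ValueError("graph is disconnected; increase k")
    return d


def classical_mds(d, dims=2, iters=500):
    """Classical MDS: double-centre squared distances, keep top eigenpairs."""
    n = len(d)
    sq = [[d[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    total_mean = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]
    coords = [[0.0] * dims for _ in range(n)]
    for c in range(dims):
        # Power iteration from a fixed, normalised start vector.
        v = [math.sin(i + 1.0 + c) for i in range(n)]
        scale = math.sqrt(sum(x * x for x in v))
        v = [x / scale for x in v]
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(v[i] * bv[i] for i in range(n))  # Rayleigh quotient
        if lam <= 1e-9:
            break  # no further positive eigenvalue; remaining axes stay zero
        for i in range(n):
            coords[i][c] = math.sqrt(lam) * v[i]
        # Deflate so the next pass finds the next eigenpair.
        b = [[b[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return coords


def nearest(coords, query, top=3):
    """Indices of the `top` objects closest to `query` in the embedded space."""
    ranked = sorted((j for j in range(len(coords)) if j != query),
                    key=lambda j: math.dist(coords[query], coords[j]))
    return ranked[:top]


# Toy usage: four hypothetical documents spaced evenly along a line.
toy = [[abs(i - j) for j in range(4)] for i in range(4)]
space = classical_mds(shortest_paths(knn_graph(toy, k=1)), dims=1)
print(nearest(space, query=0, top=2))
```

Once every object has coordinates in one shared space, a query reduces to a nearest-neighbour lookup regardless of modality, which is the property the abstract relies on. The two feedback stages it mentions are not modelled here.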