Fusing heterogeneous modalities for video and image re-ranking

Authors:
Hung-Khoon Tan;Chong-Wah Ngo
Affiliations:
City University of Hong Kong, Kowloon, Hong Kong;University of Hong Kong, Kowloon, Hong Kong
Venue:
Proceedings of the 1st ACM International Conference on Multimedia Retrieval
Year:
2011

Abstract

Multimedia documents on popular image and video sharing websites such as Flickr and YouTube are heterogeneous, with diverse representations and rich user-supplied information. In this paper, we investigate how the agreement among heterogeneous modalities can be exploited to guide data fusion. The problem of fusion is cast as the simultaneous mining of agreement from different modalities and adaptation of fusion weights to construct a fused graph from these modalities. An iterative framework based on agreement-fusion optimization is thus proposed. We plug two well-known algorithms, random walk and semi-supervised learning, into this framework to illustrate how agreement is incorporated, and conflict compromised, under uniform and adaptive fusion. Experimental results on web video and image re-ranking demonstrate that a proper fusion strategy, rather than simple linear fusion, generally improves search performance.
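
The abstract gives no formulas, so the following is only a rough sketch of the general idea, not the authors' algorithm. It assumes each modality supplies a non-negative affinity matrix over the same set of documents, fuses the per-modality transition matrices by a weighted sum, re-ranks with a random walk with restart, and then re-estimates the fusion weights from how well each modality agrees with the fused ranking. The function names, the cosine-similarity agreement measure, and the parameters `alpha`, `iters`, and `rounds` are all hypothetical choices made for illustration.

```python
import numpy as np

def row_normalize(w):
    """Turn a non-negative affinity matrix into a row-stochastic transition matrix."""
    w = np.asarray(w, dtype=float)
    sums = w.sum(axis=1, keepdims=True)
    sums[sums == 0.0] = 1.0  # isolated nodes keep an all-zero row
    return w / sums

def random_walk(transition, initial, alpha=0.85, iters=100):
    """Random walk with restart: s <- alpha * P^T s + (1 - alpha) * s0."""
    s0 = np.asarray(initial, dtype=float)
    s0 = s0 / s0.sum()
    s = s0.copy()
    for _ in range(iters):
        s = alpha * (transition.T @ s) + (1.0 - alpha) * s0
    return s

def agreement(a, b):
    """Cosine similarity of two score vectors (an assumed agreement measure)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def fuse_and_rerank(graphs, initial, rounds=5, adaptive=True):
    """Alternate between re-ranking on the fused graph and updating fusion weights."""
    transitions = [row_normalize(g) for g in graphs]
    weights = np.full(len(graphs), 1.0 / len(graphs))  # start from uniform fusion
    # Each modality's own ranking does not change across rounds, so compute it once.
    per_modality = [random_walk(p, initial) for p in transitions]
    for _ in range(rounds):
        fused = sum(w * p for w, p in zip(weights, transitions))
        scores = random_walk(fused, initial)
        if not adaptive:
            break  # uniform fusion: a single pass with equal weights
        agree = np.array([agreement(scores, s) for s in per_modality])
        weights = agree / agree.sum()  # modalities closer to the consensus get more weight
    return scores, weights
```

With `adaptive=False` the sketch reduces to plain uniform linear fusion followed by a random walk, which is the kind of baseline the abstract contrasts against. The adaptive branch is one simple way to let agreement drive the weights; the paper's actual optimization may differ substantially.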