Content-Based Multimedia Retrieval Using Feature Correlation Clustering and Fusion

  • Authors:
  • Shu-Ching Chen;Hsin-Yu Ha;Fausto C. Fleites

  • Affiliations:
  • School of Computing and Information Sciences, Florida International University, Miami, FL, USA;School of Computing and Information Sciences, Florida International University, Miami, FL, USA;School of Computing and Information Sciences, Florida International University, Miami, FL, USA

  • Venue:
  • International Journal of Multimedia Data Engineering & Management
  • Year:
  • 2013

Quantified Score

Hi-index 0.00

Visualization

Abstract

Nowadays, only processing visual features is not enough for multimedia semantic retrieval due to the complexity of multimedia data, which usually involve a variety of modalities, e.g. graphics, text, speech, video, etc. It becomes crucial to fully utilize the correlation between each feature and the target concept, the feature correlation within modalities, and the feature correlation across modalities. In this paper, the authors propose a Feature Correlation Clustering-based Multi-Modality Fusion Framework FCC-MMF for multimedia semantic retrieval. Features from different modalities are combined into one feature set with the same representation via a normalization and discretization process. Within and across modalities, multiple correspondence analysis is utilized to obtain the correlation between feature-value pairs, which are then projected onto the two principal components. K-medoids algorithm, which is a widely used partitioned clustering algorithm, is selected to minimize the Euclidean distance within the resulted clusters and produce high intra-correlated feature-value pair clusters. Majority vote is applied to subsequently decide which cluster each feature belongs to. Once the feature clusters are formed, one classifier is built and trained for each cluster. The correlation and confidence of each classifier are considered while fusing the classification scores, and mean average precision is used to evaluate the final ranked classification scores. Finally, the proposed framework is applied on NUS-wide Lite data set to demonstrate the effectiveness in multimedia semantic retrieval.