Web media semantic concept retrieval via tag removal and model fusion

  • Authors:
  • Chao Chen;Qiusha Zhu;Lin Lin;Mei-Ling Shyu

  • Affiliations:
  • University of Miami, Coral Gables, FL;University of Miami, Coral Gables, FL;University of Miami, Coral Gables, FL;University of Miami, Coral Gables, FL

  • Venue:
  • ACM Transactions on Intelligent Systems and Technology (TIST) - Survey papers, special sections on the semantic adaptive social web, intelligent systems for health informatics, regular papers
  • Year:
  • 2013

Abstract

Multimedia data on social websites contain rich semantics and are often accompanied by user-defined tags. Fusing tag-based and content-based models can enhance Web media semantic concept retrieval, but doing so is very challenging. This article proposes a novel semantic concept retrieval framework that incorporates tag removal and model fusion to tackle this challenge. Tags that carry useful information can facilitate media search, but they are often imprecise, so noisy tag removal (deleting uncorrelated tags) is important for improving the performance of semantic concept retrieval. To this end, a multiple correspondence analysis (MCA)-based tag removal algorithm is proposed; it uses MCA's ability to capture the relationships among nominal features to identify representative and discriminative tags that correlate strongly with the target semantic concepts. To further improve retrieval performance, a novel model fusion method is also proposed to combine ranking scores from the tag-based and content-based models; it considers the adjustment of ranking scores, the reliability of the models, and the correlations between the semantic concepts and the intervals into which the ranking scores are divided. Extensive comparative experiments on the NUS-WIDE-LITE and NUS-WIDE-270K benchmark datasets with 81 semantic concepts, in which each component is evaluated separately, show that the proposed framework outperforms the baseline results and the other comparison methods.
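The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: a plain phi (Pearson) correlation filter stands in for the MCA-based tag removal, and a reliability-weighted average of min-max-normalized scores stands in for the proposed fusion method, which additionally models score intervals. All function names, the `threshold` parameter, and the reliability values are assumptions made for illustration.

```python
from math import sqrt


def tag_concept_correlation(tag_present, concept_labels):
    """Phi (Pearson) correlation between a binary tag and a binary concept label."""
    n = len(tag_present)
    mean_t = sum(tag_present) / n
    mean_c = sum(concept_labels) / n
    cov = sum((t - mean_t) * (c - mean_c)
              for t, c in zip(tag_present, concept_labels))
    var_t = sum((t - mean_t) ** 2 for t in tag_present)
    var_c = sum((c - mean_c) ** 2 for c in concept_labels)
    if var_t == 0 or var_c == 0:
        return 0.0  # a constant column carries no correlation information
    return cov / sqrt(var_t * var_c)


def remove_noisy_tags(tag_matrix, concept_labels, threshold=0.2):
    """Keep tags whose absolute correlation with the concept meets the threshold.

    tag_matrix maps a tag name to a list of 0/1 indicators, one per media item.
    This is a simple stand-in for MCA-based selection (assumption).
    """
    return {
        tag: column
        for tag, column in tag_matrix.items()
        if abs(tag_concept_correlation(column, concept_labels)) >= threshold
    }


def min_max(scores):
    """Rescale scores to [0, 1] so the two models become comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def fuse_scores(tag_scores, content_scores, tag_reliability, content_reliability):
    """Reliability-weighted late fusion of two ranking-score lists.

    The reliabilities are hypothetical per-model quality estimates, for
    example a validation-set average precision; they are normalized to
    weights that sum to 1.
    """
    total = tag_reliability + content_reliability
    w_tag = tag_reliability / total
    w_content = content_reliability / total
    return [
        w_tag * t + w_content * c
        for t, c in zip(min_max(tag_scores), min_max(content_scores))
    ]
```

In this sketch, a tag whose presence is statistically unrelated to the concept labels is dropped before the tag-based model is trained, and the more reliable model contributes more to the final ranking score of each media item.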