MSRA-MM 2.0: A Large-Scale Web Multimedia Dataset

  • Authors:
  • Hao Li; Meng Wang; Xian-Sheng Hua

  • Venue:
  • ICDMW '09: Proceedings of the 2009 IEEE International Conference on Data Mining Workshops
  • Year:
  • 2009

Abstract

In this paper, we introduce the second version of Microsoft Research Asia Multimedia (MSRA-MM), a dataset that aims to facilitate research in multimedia information retrieval and related areas. The images and videos in the dataset were collected from a commercial search engine using more than 1,000 queries. The dataset contains about 1 million images and 20,000 videos. We also provide the surrounding text obtained from more than 1 million web pages. The images and videos have been comprehensively annotated with their relevance levels to the corresponding queries, the semantic concepts of the images, and the category and quality information of the videos. We define six standard tasks on the dataset: (1) image search reranking; (2) image annotation; (3) query-by-example image search; (4) video search reranking; (5) video categorization; and (6) video quality assessment.