Riding the multimedia big data wave

  • Authors:
  • John R. Smith

  • Affiliations:
  • IBM T. J. Watson Research Center, Yorktown Heights, NY, USA

  • Venue:
  • Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
  • Year:
  • 2013

Abstract

In this talk we present a perspective across multiple industry problems, including safety and security, medical, Web, social, and mobile media, and motivate the need for large-scale analysis and retrieval of multimedia data. We describe a multi-layer architecture that incorporates capabilities for audio-visual feature extraction, machine learning, and semantic modeling, and provides a powerful framework for learning and classifying the contents of multimedia data. We discuss the role of semantic ontologies in representing audio-visual concepts and relationships, which are essential for training semantic classifiers. We also discuss the importance of faceted classification schemes, in particular for organizing multimedia semantic concepts to achieve effective learning and retrieval. We then show how training and scoring of multimedia semantics can be implemented on big data distributed computing platforms to address both massive-scale analysis and low-latency processing. We describe multiple efforts at IBM on image and video analysis and retrieval, including the IBM Multimedia Analysis and Retrieval System (IMARS), and show recent results for semantic-based classification and retrieval. We conclude with future directions for improving multimedia analysis through interactive and curriculum-based techniques for multimedia semantics-based learning and retrieval.
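
The abstract does not give implementation details of IMARS or the multi-layer pipeline. As an illustration only, the sketch below shows one common pattern consistent with the description: extract audio-visual features, train a binary classifier per semantic concept, then score new items in parallel across data partitions to mimic distributed big-data scoring. The feature extractor, the linear SVM, and all names and parameters here are assumptions for illustration, not the IBM system.

```python
# Illustrative sketch only: per-concept semantic classification and
# partition-parallel scoring, loosely following the multi-layer pipeline
# (feature extraction -> learning -> semantic scoring) described in the talk.
# It is NOT the IMARS implementation; all names and parameters are assumptions.
from multiprocessing import Pool

import numpy as np
from sklearn.svm import LinearSVC


def extract_features(item: np.ndarray) -> np.ndarray:
    """Hypothetical audio-visual feature extractor (here: a simple histogram)."""
    hist, _ = np.histogram(item, bins=64, range=(0.0, 1.0), density=True)
    return hist


def train_concept_classifiers(items, concept_labels):
    """Train one binary classifier per semantic concept (e.g. 'outdoor', 'crowd')."""
    X = np.stack([extract_features(it) for it in items])
    classifiers = {}
    for concept, y in concept_labels.items():
        clf = LinearSVC(C=1.0)
        clf.fit(X, y)
        classifiers[concept] = clf
    return classifiers


def score_partition(args):
    """Score one data partition; a stand-in for a map task on a big-data platform."""
    classifiers, partition = args
    X = np.stack([extract_features(it) for it in partition])
    return {c: clf.decision_function(X) for c, clf in classifiers.items()}


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    train_items = [rng.random(1024) for _ in range(40)]
    labels = {"outdoor": rng.integers(0, 2, 40), "crowd": rng.integers(0, 2, 40)}
    models = train_concept_classifiers(train_items, labels)

    # Partition new items and score them in parallel, mimicking distributed scoring.
    partitions = [[rng.random(1024) for _ in range(10)] for _ in range(4)]
    with Pool(processes=4) as pool:
        results = pool.map(score_partition, [(models, p) for p in partitions])
    print(results[0]["outdoor"][:3])
```

In a real deployment the parallel map would run on a distributed platform (e.g. Hadoop-style map tasks) rather than a local process pool, and the per-concept classifiers would be organized under a faceted concept scheme as the abstract describes.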