Parallel algorithms for mining large-scale rich-media data

  • Authors:
  • Edward Y. Chang;Hongjie Bai;Kaihua Zhu

  • Affiliations:
  • Google Research, Beijing, China;Google Research, Beijing, China;Google Research, Beijing, China

  • Venue:
  • MM '09 Proceedings of the 17th ACM international conference on Multimedia
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

The amount of online photos and videos is now at the scale of tens of billions. To organize, index, and retrieve these large-scale rich-media data, a system must employ scalable data management and mining algorithms. The research community needs to consider solving large scale problems rather than solving problems with small datasets that do not reflect real life scenarios. This tutorial introduces key challenges in large-scale rich-media data mining, and presents parallel algorithms for tackling such challenges. We present our parallel implementations of Spectral Clustering (PSC), FP-Growth (PFP), Latent Dirichlet Allocation (PLDA), and Support Vector Machines (PSVM).