Content redundancy in YouTube and its application to video tagging

  • Authors:
  • Jose San Pedro;Stefan Siersdorfer;Mark Sanderson

  • Affiliations:
  • Telefonica Research, and Penn State University;L3S Research Center;RMIT University

  • Venue:
  • ACM Transactions on Information Systems (TOIS)
  • Year:
  • 2011

Abstract

The emergence of large-scale social Web communities has enabled users to share vast amounts of multimedia content online. An analysis of YouTube reveals a high degree of redundancy in the form of videos with overlapping or duplicated content. We use robust content-based video analysis techniques to detect overlapping sequences between videos. Based on the output of these techniques, we present an in-depth study of duplication and content overlap in YouTube and analyze various dependencies between content overlap and metadata such as video titles, views, video ratings, and tags. As an application, we show that content-based links provide useful information for generating new tag assignments. We propose different tag propagation methods for automatically obtaining richer video annotations. Experiments on video clustering and classification, as well as a user evaluation, demonstrate the viability of our approach.
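
The abstract does not specify how the tag propagation methods work, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes that content overlap between videos is already available as weighted links, and it copies a tag to a linked video when the summed link weight of the neighbors carrying that tag reaches a threshold. The function name `propagate_tags`, the `(video_a, video_b, weight)` link format, and the `threshold` parameter are all hypothetical.

```python
from collections import defaultdict


def propagate_tags(tags, overlaps, threshold=0.5):
    """Return a copy of `tags` enriched with tags from overlapping videos.

    tags:      dict mapping video id -> set of tags (hypothetical input format)
    overlaps:  list of (video_a, video_b, weight) overlap links, treated as
               undirected (hypothetical format; weights are assumed given)
    threshold: minimum summed link weight for a tag to be propagated
               (an assumed parameter, not taken from the paper)
    """
    # scores[video][tag] = total weight of neighbors that carry `tag`
    scores = defaultdict(lambda: defaultdict(float))
    for a, b, weight in overlaps:
        # Links are undirected, so propagate in both directions.
        for src, dst in ((a, b), (b, a)):
            for tag in tags.get(src, set()):
                if tag not in tags.get(dst, set()):
                    scores[dst][tag] += weight

    # Start from a copy so the original annotations are left untouched.
    enriched = {video: set(video_tags) for video, video_tags in tags.items()}
    for video, candidates in scores.items():
        for tag, score in candidates.items():
            if score >= threshold:
                enriched.setdefault(video, set()).add(tag)
    return enriched


# Toy example with made-up video ids, tags, and overlap weights.
tags = {"v1": {"soccer", "goal"}, "v2": {"soccer"}, "v3": set()}
overlaps = [("v1", "v2", 0.8), ("v2", "v3", 0.3), ("v1", "v3", 0.3)]
enriched = propagate_tags(tags, overlaps)
```

In this toy data, `v2` receives `goal` through its strong link to `v1`, while `v3` receives `soccer` only because two weak links together reach the threshold. Scores are computed from the original tags only, so propagation here is a single step rather than an iterative process.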