Pornography detection in video benefits (a lot) from a multi-modal approach

  • Authors:
  • Adrian Ulges;Christian Schulze;Damian Borth;Armin Stahl

  • Affiliations:
  • German Research Center for Artificial Intelligence (DFKI) GmbH, Kaiserslautern, Germany (all authors)

  • Venue:
  • Proceedings of the 2012 ACM international workshop on Audio and multimedia methods for large-scale video analysis
  • Year:
  • 2012

Abstract

We address the challenge of detecting pornographic content in video streams. On offensive material crawled from different pornographic websites and non-offensive clips from YouTube (a total of 500 hours of video), we first study a compressed-domain activity descriptor based on MPEG motion compensation vectors. We show that the approach offers an interesting alternative but generalizes poorly between videos compressed with different codecs, a problem that can be overcome to some extent by adding noise to the image data prior to video compression. Our main contribution is an evaluation that benchmarks the above motion-based descriptor as well as three other widely used features (audio-based MFCC features, skin color detection, and visual words). Here, we show that a multi-modal approach is a key strategy for an accurate detection of adult content: a combination of the different features gives considerable improvements in accuracy, reducing the equal error rate by 36-56% compared to the best uni-modal system.
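
The abstract does not say how the features are combined or how the equal error rate is computed, so the following is only a minimal sketch under stated assumptions: a weighted average of per-modality classifier scores (late fusion) and a threshold sweep that approximates the equal error rate. The function names, weights, and toy scores are hypothetical illustrations and are not taken from the paper or its results.

```python
from typing import Dict, List, Sequence


def fuse_scores(scores_by_modality: Dict[str, Sequence[float]],
                weights: Dict[str, float]) -> List[float]:
    """Late fusion (assumed scheme): weighted average of per-modality scores."""
    total = sum(weights[m] for m in scores_by_modality)
    n = len(next(iter(scores_by_modality.values())))
    fused = [0.0] * n
    for modality, scores in scores_by_modality.items():
        w = weights[modality] / total  # normalise so the weights sum to 1
        for i, s in enumerate(scores):
            fused[i] += w * s
    return fused


def equal_error_rate(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Approximate EER: sweep thresholds over the observed scores and return
    the mean of false-positive and false-negative rate where they are closest."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative clip")
    best_gap, best_eer = float("inf"), 1.0
    for threshold in sorted(set(scores)):
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
        fpr, fnr = fp / negatives, fn / positives
        if abs(fpr - fnr) < best_gap:
            best_gap, best_eer = abs(fpr - fnr), (fpr + fnr) / 2
    return best_eer


# Toy data (made up for illustration): 1 = offensive clip, 0 = non-offensive.
labels = [1, 1, 1, 0, 0, 0]
audio = [0.9, 0.4, 0.8, 0.3, 0.6, 0.1]  # hypothetical audio-based scores
skin = [0.7, 0.9, 0.3, 0.4, 0.2, 0.5]   # hypothetical skin-color scores

fused = fuse_scores({"audio": audio, "skin": skin},
                    {"audio": 0.5, "skin": 0.5})

for name, s in [("audio", audio), ("skin", skin), ("fused", fused)]:
    print(f"{name}: EER = {equal_error_rate(s, labels):.3f}")
```

In this toy setting each modality misranks a different clip, so averaging the scores separates the two classes better than either modality alone. This mirrors the qualitative claim of the abstract, not its reported numbers.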