Classifying offensive sites based on image content

  • Authors:
  • Will Archer Arentz;Bjørn Olstad

  • Affiliations:
  • Department of Computer and Information Science, Norwegian University of Science and Technology, NO-7491 Trondheim, Norway;Department of Computer and Information Science, Norwegian University of Science and Technology, NO-7491 Trondheim, Norway

  • Venue:
  • Computer Vision and Image Understanding - Special issue on color for image indexing and retrieval
  • Year:
  • 2004

Abstract

This paper proposes a method for helping to identify adult web sites by using image content as a means of detecting erotic material. The image content is classified by locating probable skin regions and extracting their feature vectors. These feature vectors are based on color, texture, contour, placement, and relative size information for a given region. The importance of the different elements in the feature vector is determined by a genetic algorithm. For each picture, the algorithm gives the probability that the picture has erotic content. By mapping all the images in a web site and running the image-based classifier on the whole collection, we were able to build a histogram of images with regard to the log-likelihood of erotic content for each image. This histogram gives a good overview of the web site's content while leaving room for errors in the image-based classifier. The algorithm proved quite successful in our tests, where all 20 sites were classified correctly. The image-based classifier properly identifies 89% of the evaluation images at an average processing speed of 11 images per second. Although this experiment focused on classifying adult web sites, small alterations to the system would enable classification of other kinds of images and web sites.
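
The site-level step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the log-likelihood is computed, how the histogram is binned, or how the site decision is made. The sketch assumes the per-image classifier already returns a probability, uses the log-odds as a stand-in for the log-likelihood score, and the names `log_odds`, `score_histogram`, `classify_site`, `bin_width`, and `min_fraction` are all hypothetical.

```python
import math
from collections import Counter


def log_odds(p, eps=1e-6):
    """Log-odds of a probability; an assumed stand-in for the paper's score."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0) and division by zero
    return math.log(p / (1.0 - p))


def score_histogram(probabilities, bin_width=1.0):
    """Count images per log-odds bin (bin index = floor(score / bin_width))."""
    counts = Counter()
    for p in probabilities:
        counts[math.floor(log_odds(p) / bin_width)] += 1
    return dict(sorted(counts.items()))


def classify_site(probabilities, min_fraction=0.3):
    """Flag a site when enough of its images score above even odds.

    The threshold rule is an assumption for illustration; deciding on the
    whole collection rather than on one image is what leaves room for
    individual misclassifications.
    """
    if not probabilities:
        return False
    flagged = sum(1 for p in probabilities if log_odds(p) > 0.0)
    return flagged / len(probabilities) >= min_fraction
```

With this rule, a site whose images score `[0.9, 0.8, 0.1, 0.2]` is flagged because half of them exceed even odds, while a single high-scoring image among many low-scoring ones does not flag the site.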