A psychophysically plausible model for typicality ranking of natural scenes

  • Authors:
  • Adrian Schwaninger;Julia Vogel;Franziska Hofer;Bernt Schiele

  • Affiliations:
  • Max Planck Institute for Biological Cybernetics, Tübingen, Germany and University of Zurich, Zurich, Switzerland;Max Planck Institute for Biological Cybernetics, Tübingen, Germany and University of British Columbia, Vancouver, Canada;University of Zurich, Zurich, Switzerland;Darmstadt University of Technology, Darmstadt, Germany

  • Venue:
  • ACM Transactions on Applied Perception (TAP)
  • Year:
  • 2006

Abstract

Natural scenes constitute a very heterogeneous stimulus class, and each semantic category contains exemplars of varying typicality. It is therefore an interesting question whether humans can categorize natural scenes consistently into a relatively small number of categories, such as coasts, rivers/lakes, forests, plains, and mountains. This is particularly important for applications such as image-retrieval systems: a general image-retrieval system makes sense only if typicality is perceived consistently across individuals. In this study, we use psychophysics and computational modeling to gain a deeper understanding of scene typicality. In the first psychophysical experiment, we used a forced-choice categorization task in which each of 250 natural scenes had to be classified into one of these five categories. In the second experiment, the typicality of each scene had to be rated on a 50-point scale for each of the five categories. The psychophysical results show high consistency between participants, not only in the categorization of natural scenes but also in the typicality ratings. To model human perception, we then employ a computational approach that includes an intermediate semantic-modeling step, in which local semantic concepts such as rock, water, and sand are extracted. Based on the human typicality ratings, we learn a psychophysically plausible distance measure that leads to a high correlation between the computational and the human ranking of natural scenes. Interestingly, comparison models without a semantic-modeling step correlated much less with human performance, which supports the psychophysical plausibility of our model.
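
The comparison between a computational and a human ranking can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the distance measure or the correlation statistic, so the weighted Euclidean distance, the Spearman rank correlation, the prototype vector, the uniform weights, and all scene data below are assumptions chosen purely for illustration.

```python
from math import sqrt


def weighted_distance(vector, prototype, weights):
    """Weighted Euclidean distance (an assumed form of the learned measure)."""
    return sqrt(sum(w * (v - p) ** 2
                    for v, p, w in zip(vector, prototype, weights)))


def ranks(values):
    """Return 1-based ranks in ascending order; tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation, computed as the Pearson correlation of ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical data: fraction of each image labeled (rock, water, sand).
scenes = {
    "scene_a": [0.10, 0.60, 0.30],
    "scene_b": [0.50, 0.30, 0.20],
    "scene_c": [0.80, 0.10, 0.10],
}
coast_prototype = [0.10, 0.60, 0.30]   # hypothetical "coasts" prototype
weights = [1.0, 1.0, 1.0]              # hypothetical; would be learned from ratings
human_typicality = {"scene_a": 45, "scene_b": 25, "scene_c": 5}  # made-up ratings

names = sorted(scenes)
# A smaller distance to the prototype means a more typical scene, so negate it.
model_scores = [-weighted_distance(scenes[n], coast_prototype, weights)
                for n in names]
human_scores = [human_typicality[n] for n in names]
rho = spearman(model_scores, human_scores)
print(f"Rank correlation between model and human ranking: {rho:.3f}")
```

In this toy setup the model and the made-up human ratings order the three scenes identically, so the rank correlation is at its maximum. With real data, the weights would be fitted so that this correlation is as high as possible across all rated scenes.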