Gaze- and speech-enhanced content-based image retrieval in image tagging

  • Authors:
  • He Zhang;Teemu Ruokolainen;Jorma Laaksonen;Christina Hochleitner;Rudolf Traunmüller

  • Affiliations:
  • Department of Information and Computer Science, Aalto University School of Science, Espoo, Finland;Department of Information and Computer Science, Aalto University School of Science, Espoo, Finland;Department of Information and Computer Science, Aalto University School of Science, Espoo, Finland;Celum GmbH, Linz, Austria;Celum GmbH, Linz, Austria

  • Venue:
  • ICANN'11: Proceedings of the 21st International Conference on Artificial Neural Networks - Volume Part II
  • Year:
  • 2011

Abstract

We describe a setup and experiments in which users check and correct image tags assigned by an automatic tagging system. We study how much a content-based image retrieval (CBIR) method speeds up the process of finding and correcting erroneously tagged images. We also analyze implicit relevance feedback derived from the user's gaze-tracking patterns as a way to boost CBIR performance. Finally, we use automatic speech recognition to enter the correct tags for images that were wrongly tagged. The experiments show a large variance in tagging-task performance, which we believe is caused primarily by the users' subjective interpretation of image content and by their varying familiarity with the gaze-tracking and speech-recognition setups. The results suggest that gaze- and/or speech-enhanced CBIR has potential in image tagging, at least for some users.