Toward content-aware multimodal tagging of personal photo collections

  • Authors:
  • Paulo Barthelmess; Edward Kaiser; David R. McGee

  • Affiliations:
  • Adapx, Inc., Seattle, WA; Adapx, Inc., Seattle, WA; Adapx, Inc., Seattle, WA

  • Venue:
  • Proceedings of the 9th International Conference on Multimodal Interfaces (ICMI '07)
  • Year:
  • 2007

Abstract

A growing number of tools are becoming available that make use of existing tags to help organize and retrieve photos, facilitating the management and use of photo sets. The tagging on which these techniques rely remains a time-consuming, labor-intensive task that discourages many users. To address this problem, we aim to leverage the multimodal content of naturally occurring photo discussions among friends and families to automatically extract tags from a combination of conversational speech, handwriting, and photo content analysis. While naturally occurring discussions are rich sources of information about photos, methods need to be developed to reliably extract a set of discriminative tags from this noisy, unconstrained group discourse. To this end, this paper contributes an analysis of pilot data identifying robust multimodal features, examining the interplay between photo content and other modalities such as speech and handwriting. Our analysis is motivated by a search for design implications leading to the effective incorporation of automated location and person identification (e.g., based on GPS and facial recognition technologies) into a system able to extract tags from natural multimodal conversations.
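
The paper reports an analysis of pilot data rather than an implementation, but the core fusion idea can be illustrated with a minimal sketch. The sketch below assumes hypothetical per-modality recognizers that each emit (tag, confidence) pairs; the function name, modality weights, and threshold are illustrative choices, not taken from the paper.

```python
# Minimal sketch of multimodal tag fusion (hypothetical API; not the
# authors' system). Each modality recognizer is assumed to return
# (tag, confidence) pairs, which are combined by weighted voting.

from collections import defaultdict

def fuse_tags(speech_hyps, handwriting_hyps, photo_hyps,
              weights=None, threshold=0.5):
    """Combine (tag, confidence) pairs from each modality into one
    weighted score per tag, keeping tags above a confidence threshold."""
    # Illustrative weights: speech is assumed most informative here.
    weights = weights or {"speech": 0.4, "handwriting": 0.35, "photo": 0.25}
    scores = defaultdict(float)
    for modality, hyps in (("speech", speech_hyps),
                           ("handwriting", handwriting_hyps),
                           ("photo", photo_hyps)):
        for tag, conf in hyps:
            # Tags attested by several modalities accumulate evidence,
            # which helps filter noise from any single recognizer.
            scores[tag.lower()] += weights[modality] * conf
    return sorted((t for t, s in scores.items() if s >= threshold),
                  key=lambda t: -scores[t])

# "grandma" is confirmed by speech and handwriting, "beach" by speech
# and photo analysis; "sunset" is seen only weakly by photo analysis
# and is filtered out by the threshold.
print(fuse_tags(
    speech_hyps=[("beach", 0.9), ("grandma", 0.9)],
    handwriting_hyps=[("grandma", 0.8)],
    photo_hyps=[("beach", 0.8), ("sunset", 0.4)],
))  # ['grandma', 'beach']
```

The design choice illustrated here is that cross-modal agreement, rather than any single recognizer's confidence, determines which candidate tags survive, which is one plausible way to cope with the noisy, unconstrained discourse the abstract describes.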