The picture says it all!: multimodal interactions and interaction metadata

  • Authors:
  • Ramadevi Vennelakanti, Prasenjit Dey, Ankit Shekhawat, Phanindra Pisupati

  • Affiliation:
  • Hewlett-Packard Labs, Bangalore, India (all authors)

  • Venue:
  • ICMI '11: Proceedings of the 13th International Conference on Multimodal Interfaces
  • Year:
  • 2011

Abstract

People share photographs with family and friends! This inclination to share photographs lends itself to many occasions of co-present sharing, resulting in interesting interactions, discussions, and experiences among those present. These interactions are rich in information about the context and content of the photograph and, if extracted, can be used to associate metadata with the photograph. However, they are rarely captured and are therefore lost at the end of the co-present photo sharing session. Most current work on extracting implicit metadata focuses on content metadata, derived by analyzing the content of a photograph, and object metadata, which is automatically generated and consists of data such as GPS location, date, and time. We address the capture of another interesting type of implicit metadata, called "interaction metadata", from the users' multimodal interactions with the media (here, photographs) during co-present sharing. These interactions in the context of photographs contain rich information: who saw the photograph, who said what, what was pointed at when they said it, who they saw it with, for how long, how many times, and so on. If captured and analyzed, this information can create interesting memories about the photograph. Over time, it can help build stories around photographs and aid storytelling, serendipitous discovery, and efficient retrieval, among other experiences. Interaction metadata can also help organize photographs better by providing mechanisms for filtering based on who viewed them, which were most viewed, and so on. Interaction metadata is thus a hitherto underexplored type of implicit metadata created from interactions with media. We designed and built a system prototype to capture and create interaction metadata. In this paper we describe the prototype and present the findings of a study we carried out to evaluate it. The contributions of our work to the domain of multimodal interactions are: a method for identifying relevant speech portions in a free-flowing conversation, and the use of natural human interactions in the context of media to create interaction metadata, a novel type of implicit metadata.
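The abstract enumerates the kinds of information interaction metadata might carry: who viewed a photograph, who said what, what was pointed at while speaking, viewing duration, and view count. The following is a minimal sketch of what such a record could look like; the field names and structure are illustrative assumptions, not the schema used in the paper's prototype.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional, Set

# Hypothetical schema for interaction metadata gathered during co-present
# photo sharing. The prototype described in the paper may organize this
# information differently.

@dataclass
class Utterance:
    speaker: str                           # who said it
    text: str                              # relevant speech portion from the conversation
    pointed_region: Optional[str] = None   # photo region pointed at while speaking, if any

@dataclass
class ViewingSession:
    photo_id: str
    viewers: List[str]                     # who saw it, and who they saw it with
    started_at: datetime
    duration_seconds: float                # for how long
    utterances: List[Utterance] = field(default_factory=list)

@dataclass
class InteractionMetadata:
    photo_id: str
    sessions: List[ViewingSession] = field(default_factory=list)

    def view_count(self) -> int:
        # "how many times" the photograph was viewed across sessions
        return len(self.sessions)

    def all_viewers(self) -> Set[str]:
        # supports filtering photographs by "who viewed"
        return {viewer for s in self.sessions for viewer in s.viewers}
```

A structure along these lines would support the filtering experiences the abstract mentions, such as ranking photographs by view count or retrieving those viewed by a particular person.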