MAMI: multimodal annotations on a camera phone

  • Authors:
  • Xavier Anguera, Nuria Oliver

  • Affiliations:
  • Telefónica Research, Barcelona, Spain (both authors)

  • Venue:
  • Proceedings of the 10th International Conference on Human-Computer Interaction with Mobile Devices and Services (MobileHCI)
  • Year:
  • 2008

Abstract

We present MAMI (Multimodal Automatic Mobile Indexing), a mobile-phone prototype that allows users to annotate and search digital photos on their camera phone via speech input. MAMI is implemented as a mobile application that runs in real time on the phone. Users can add speech annotations at the time of capturing a photo or at a later time. Additional metadata is stored with each photo, including location, user identification, date and time of capture, and image-based features. Users can search their personal photo repository by means of speech. Because MAMI does not require connectivity to a server, we propose a Dynamic Time Warping (DTW)-based metric, rather than full-fledged speech recognition, to measure the distance between the spoken query and all stored speech annotations. We present preliminary results with the MAMI prototype and outline future directions of research, including the integration of the additional metadata into the search.
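
As background for the search mechanism the abstract describes, the sketch below shows one way a DTW-based distance between a spoken query and stored speech annotations could be computed and used to rank photos. It is a minimal illustration, not the paper's implementation: the MFCC-style frame features, the step pattern, the path-length normalization, and names such as `dtw_distance`, `search`, and `speech_features` are assumptions introduced here for clarity.

```python
import numpy as np

def dtw_distance(query, annotation):
    """Dynamic Time Warping distance between two feature sequences.

    query, annotation: 2-D arrays of shape (num_frames, num_features),
    e.g. MFCC-style frames extracted from the spoken query and from a
    stored speech annotation (feature choice is an assumption here).
    """
    n, m = len(query), len(annotation)
    # Pairwise Euclidean distances between every query frame and
    # every annotation frame.
    cost = np.linalg.norm(query[:, None, :] - annotation[None, :, :], axis=2)

    # Accumulated-cost matrix with a standard step pattern
    # (diagonal match, insertion, deletion).
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j - 1],  # match
                acc[i - 1, j],      # insertion
                acc[i, j - 1],      # deletion
            )
    # Normalize by path length so short and long annotations
    # compare fairly.
    return acc[n, m] / (n + m)

def search(query_features, annotated_photos):
    """Rank stored photos by DTW distance between the spoken query
    and each photo's speech annotation (lower distance = better match)."""
    return sorted(
        annotated_photos,
        key=lambda photo: dtw_distance(query_features, photo["speech_features"]),
    )

# Example: rank two stored photos against a spoken query. Random
# placeholder arrays stand in for real acoustic feature frames.
rng = np.random.default_rng(0)
photos = [
    {"id": 1, "speech_features": rng.standard_normal((40, 13))},
    {"id": 2, "speech_features": rng.standard_normal((55, 13))},
]
query = rng.standard_normal((45, 13))
best_match = search(query, photos)[0]
```

A template-matching scheme like this avoids the memory and compute cost of a full speech recognizer on the handset: the query is compared directly against stored annotations, so no language model or vocabulary is needed, at the cost of search time growing with the number of annotations.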