QuickSet: multimodal interaction for distributed applications
MULTIMEDIA '97 Proceedings of the fifth ACM international conference on Multimedia
Mutual disambiguation of recognition errors in a multimodal architecture
Proceedings of the SIGCHI conference on Human Factors in Computing Systems
Creating tangible interfaces by augmenting physical objects with multimodal language
Proceedings of the 6th international conference on Intelligent user interfaces
CSCW '02 Proceedings of the 2002 ACM conference on Computer supported cooperative work
Show&Tell: A Semi-Automated Image Annotation System
IEEE MultiMedia
Personal digital historian: story sharing around the table
interactions - Winds of change
How do people manage their digital photographs?
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
Digital annotation of printed documents
CIKM '03 Proceedings of the twelfth international conference on Information and knowledge management
Tangible multimodal interfaces for safety-critical applications
Communications of the ACM - Multimodal interfaces that flex, adapt, and persist
Unification-based multimodal integration
ACL '98 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics
A study of digital ink in lecture presentation
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
Exploring the potentials of combining photo annotating tasks with instant messaging fun
Proceedings of the 3rd international conference on Mobile and ubiquitous multimedia
Leveraging context to resolve identity in photo albums
Proceedings of the 5th ACM/IEEE-CS joint conference on Digital libraries
Distributed pointing for multimodal collaboration over sketched diagrams
ICMI '05 Proceedings of the 7th international conference on Multimodal interfaces
ButterflyNet: a mobile capture and access system for field biology research
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
Using redundant speech and handwriting for learning new vocabulary and understanding abbreviations
Proceedings of the 8th international conference on Multimodal interfaces
Human-centered collaborative interaction
Proceedings of the 1st ACM international workshop on Human-centered multimedia
Collaborative multimodal photo annotation over digital paper
Proceedings of the 8th international conference on Multimodal interfaces
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
Toward content-aware multimodal tagging of personal photo collections
Proceedings of the 9th international conference on Multimodal interfaces
Cross-domain matching for automatic tag extraction across redundant handwriting and speech events
Proceedings of the 2007 workshop on Tagging, mining and retrieval of human related activity information
HCI Beyond the GUI: Design for Haptic, Speech, Olfactory, and Other Nontraditional Interfaces
Espace de caractérisation du stylo numérique
Proceedings of the 20th International Conference of the Association Francophone d'Interaction Homme-Machine
Social tagging revamped: supporting the users' need of self-promotion through persuasive techniques
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
The availability of metadata annotations over media content such as photos is known to enhance retrieval and organization, particularly for large data sets. The greatest challenge for obtaining annotations remains getting users to perform the large amount of tedious manual work that is required. In this paper we introduce an approach to semi-automated labeling based on extraction of metadata from naturally occurring conversations of groups of people discussing pictures among themselves. As the burden of structuring and extracting metadata shifts from users to the system, new recognition challenges arise. We explore how multimodal language can help in 1) detecting a concise set of meaningful labels to be associated with each photo, 2) achieving robust recognition of these key semantic terms, and 3) facilitating label propagation via multimodal shortcuts. Analysis of data from a preliminary pilot collection suggests that handwritten labels may be highly indicative of the semantics of each photo, as indicated by the correlation of handwritten terms with high-frequency spoken ones. We point to initial directions exploring a multimodal fusion technique to recover robust spelling and pronunciation of these high-value terms from redundant speech and handwriting.
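The correlation the abstract reports between handwritten labels and high-frequency spoken terms can be illustrated with a small sketch. The function below is a hypothetical illustration, not the authors' method: it simply measures what fraction of handwritten labels also rank among the most frequent terms in the spoken transcript, which is one rough way to quantify the redundancy the paper exploits.

```python
from collections import Counter

def handwriting_speech_overlap(spoken_tokens, handwritten_labels, top_k=10):
    """Fraction of handwritten labels that also appear among the
    top_k most frequent spoken terms (a rough redundancy signal).

    Hypothetical sketch: assumes tokenized input and exact string
    matching; a real system would normalize morphology and handle
    recognition errors in both modalities.
    """
    counts = Counter(t.lower() for t in spoken_tokens)
    top_terms = {term for term, _ in counts.most_common(top_k)}
    labels = [l.lower() for l in handwritten_labels]
    hits = sum(1 for l in labels if l in top_terms)
    return hits / len(labels) if labels else 0.0

# Toy conversation about a photo: "beach" and "sunset" are both
# written as labels and repeated in speech; "vacation" is only written.
spoken = ["look", "at", "the", "beach", "beach", "sunset", "sunset",
          "sunset", "nice", "beach"]
written = ["beach", "sunset", "vacation"]
print(handwriting_speech_overlap(spoken, written))  # 2 of 3 labels overlap
```

In a fusion setting, terms that score high on such a redundancy measure are the "high-value" candidates whose spelling (from handwriting) and pronunciation (from speech) can mutually reinforce one another.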