QuickSet: multimodal interaction for distributed applications
MULTIMEDIA '97 Proceedings of the fifth ACM international conference on Multimedia
Mutual disambiguation of recognition errors in a multimodal architecture
Proceedings of the SIGCHI conference on Human Factors in Computing Systems
Creating tangible interfaces by augmenting physical objects with multimodal language
Proceedings of the 6th international conference on Intelligent user interfaces
CSCW '02 Proceedings of the 2002 ACM conference on Computer supported cooperative work
Show&Tell: A Semi-Automated Image Annotation System
IEEE MultiMedia
Personal digital historian: story sharing around the table
interactions - Winds of change
How do people manage their digital photographs?
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
Digital annotation of printed documents
CIKM '03 Proceedings of the twelfth international conference on Information and knowledge management
Tangible multimodal interfaces for safety-critical applications
Communications of the ACM - Multimodal interfaces that flex, adapt, and persist
Unification-based multimodal integration
ACL '98 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics
A study of digital ink in lecture presentation
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
Exploring the potentials of combining photo annotating tasks with instant messaging fun
Proceedings of the 3rd international conference on Mobile and ubiquitous multimedia
Leveraging context to resolve identity in photo albums
Proceedings of the 5th ACM/IEEE-CS joint conference on Digital libraries
Distributed pointing for multimodal collaboration over sketched diagrams
ICMI '05 Proceedings of the 7th international conference on Multimodal interfaces
ButterflyNet: a mobile capture and access system for field biology research
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
Using redundant speech and handwriting for learning new vocabulary and understanding abbreviations
Proceedings of the 8th international conference on Multimodal interfaces
Human-centered collaborative interaction
Proceedings of the 1st ACM international workshop on Human-centered multimedia
Collaborative multimodal photo annotation over digital paper
Proceedings of the 8th international conference on Multimodal interfaces
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
Toward content-aware multimodal tagging of personal photo collections
Proceedings of the 9th international conference on Multimodal interfaces
Cross-domain matching for automatic tag extraction across redundant handwriting and speech events
Proceedings of the 2007 workshop on Tagging, mining and retrieval of human related activity information
HCI Beyond the GUI: Design for Haptic, Speech, Olfactory, and Other Nontraditional Interfaces
Espace de caractérisation du stylo numérique
Proceedings of the 20th International Conference of the Association Francophone d'Interaction Homme-Machine
Social tagging revamped: supporting the users' need of self-promotion through persuasive techniques
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
The availability of metadata annotations over media content such as photos is known to enhance retrieval and organization, particularly for large data sets. The greatest challenge for obtaining annotations remains getting users to perform the large amount of tedious manual work that is required. In this paper we introduce an approach to semi-automated labeling based on extraction of metadata from naturally occurring conversations of groups of people discussing pictures among themselves. As the burden of structuring and extracting metadata shifts from users to the system, new recognition challenges arise. We explore how multimodal language can help in 1) detecting a concise set of meaningful labels to be associated with each photo, 2) achieving robust recognition of these key semantic terms, and 3) facilitating label propagation via multimodal shortcuts. Analysis of data from a preliminary pilot collection suggests that handwritten labels may be highly indicative of the semantics of each photo, as indicated by the correlation of handwritten terms with high-frequency spoken ones. We point to initial directions exploring a multimodal fusion technique to recover robust spelling and pronunciation of these high-value terms from redundant speech and handwriting.
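The correlation the abstract reports between handwritten labels and high-frequency spoken terms can be illustrated with a small sketch. The function below is a hypothetical illustration, not the authors' method: it simply measures what fraction of handwritten labels also rank among the most frequent terms in the spoken transcript, which is one rough way to quantify the redundancy the paper exploits.

```python
from collections import Counter

def handwriting_speech_overlap(spoken_tokens, handwritten_labels, top_k=10):
    """Fraction of handwritten labels that also appear among the
    top_k most frequent spoken terms (a rough redundancy signal).

    Hypothetical sketch: assumes tokenized input and exact string
    matching; a real system would normalize morphology and handle
    recognition errors in both modalities.
    """
    counts = Counter(t.lower() for t in spoken_tokens)
    top_terms = {term for term, _ in counts.most_common(top_k)}
    labels = [l.lower() for l in handwritten_labels]
    hits = sum(1 for l in labels if l in top_terms)
    return hits / len(labels) if labels else 0.0

# Toy conversation about a photo: "beach" and "sunset" are both
# written as labels and repeated in speech; "vacation" is only written.
spoken = ["look", "at", "the", "beach", "beach", "sunset", "sunset",
          "sunset", "nice", "beach"]
written = ["beach", "sunset", "vacation"]
print(handwriting_speech_overlap(spoken, written))  # 2 of 3 labels overlap
```

In a fusion setting, terms that score high on such a redundancy measure are the "high-value" candidates whose spelling (from handwriting) and pronunciation (from speech) can mutually reinforce one another.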