Speech and typed text are two common input modalities for mobile phones. However, little research has compared their ability to support annotation and retrieval of digital pictures on mobile devices. In this paper, we report the results of a month-long field study in which participants took pictures with their camera phones and could annotate them using speech, typed text, or both. The same participants then took part in a controlled experiment in which they retrieved images based on annotations and annotations based on images, allowing us to study how effectively each modality supports users' recall of previously captured pictures. Results demonstrate that each modality has advantages and shortcomings for the production of tags and the retrieval of pictures. We suggest several guidelines for designing tagging applications for portable devices.