We explore the feasibility of using speech input to index a large volume of digital photographs. As a natural medium for communicating about images, speech can complement existing content-based techniques, thereby improving the reliability and usability of image retrieval systems. We introduce a methodology for image indexing based on speech annotation. Speech recognition tools, such as Dragon NaturallySpeaking, can be adapted to perform the main task of speech-to-text transcription. In a limited system, using structured speech rather than free-form speech can further boost transcription accuracy. We also introduce the idea of using N-best lists from the speech recognizer's output to improve recognition performance. The transcribed text is used to populate the metadata of the corresponding photograph. A photo query strategy is implemented to assess the performance of the proposed technique for photo indexing and retrieval.
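
As a rough illustration of how N-best lists might feed photo metadata and retrieval, the following minimal Python sketch indexes every term from every recognition hypothesis, weighted by the recognizer's confidence, so that a correct word that only appears in a lower-ranked hypothesis can still match a query. This is not the method of the paper: the function names (`build_index`, `search`), the file names, the confidence values, and the scoring rule (sum of the best per-term confidences) are all assumptions made for the example.

```python
from collections import defaultdict


def build_index(annotations):
    """Build a term -> {photo_id: confidence} index from N-best lists.

    `annotations` maps a photo id to its N-best list, a list of
    (hypothesis_text, confidence) pairs. Both the format and the
    confidence values are hypothetical.
    """
    index = defaultdict(dict)
    for photo_id, nbest in annotations.items():
        for text, conf in nbest:
            for term in text.lower().split():
                # Keep the highest confidence with which any hypothesis
                # proposed this term for this photo.
                if conf > index[term].get(photo_id, 0.0):
                    index[term][photo_id] = conf
    return index


def search(index, query):
    """Return photo ids ranked by the summed confidence of matching terms."""
    scores = defaultdict(float)
    for term in query.lower().split():
        for photo_id, conf in index.get(term, {}).items():
            scores[photo_id] += conf
    # Highest score first; ties broken by photo id for a stable order.
    return sorted(scores, key=lambda p: (-scores[p], p))


# Hypothetical recognizer output: two photos, two hypotheses each.
annotations = {
    "img_001.jpg": [("beach at sunset", 0.7), ("peach at sunset", 0.2)],
    "img_002.jpg": [("peach tree garden", 0.6), ("beach tree garden", 0.3)],
}
index = build_index(annotations)
ranked_for_beach = search(index, "beach")
```

With only the top hypothesis indexed, a query for "beach" would never reach the second photo; keeping the lower-ranked hypotheses trades some precision for recall, which is one plausible reading of how N-best lists could help.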