A Method for Photograph Indexing Using Speech Annotation

  • Authors:
  • Jiayi Chen;Tele Tan;Philippe Mulhem

  • Affiliations:
  • -;-;-

  • Venue:
  • PCM '01 Proceedings of the Second IEEE Pacific Rim Conference on Multimedia: Advances in Multimedia Information Processing
  • Year:
  • 2001

Quantified Score

Hi-index 0.00

Visualization

Abstract

We explore the feasibility of using speech input to perform the task of indexing a large volume of digital photographs. As a natural medium for image communication, speech can be used to complement existing contentbased techniques thereby promoting the reliability and use-ability of image retrieval systems. We introduce a methodology for image indexing using speech annotation technique. Speech recognition tools, like Dragon NaturallySpeaking can be adapted to perform the main role of speech-to-text transcription. The use of structured speech as opposed to free form speech in a limited system can further boost the transcription accuracy. We also introduce the idea of using N-best lists from the speech recognition output to improve the recognition performance. The transcribed text is used to populate the metadata of the corresponding photograph. A photo query strategy is implemented to affirm the performance of proposed technique for photo indexing and retrieval.