Multimedia data, including audio, video, and image archives, has become an increasingly prevalent resource in digital library systems. However, each of these data types may need specific tools to support effective and efficient retrieval. In this paper, we focus on retrieval from speech audio collections, which include audio books, speech recordings, interviews, and lectures. Currently, most audio retrieval systems rely on keyword, title, or author searches typed in by users. The system searches for the given keywords and returns a list of entire audio files that are potentially relevant to the query. Nonetheless, browsing for a particular section of an audio file without knowing its actual content remains a very difficult task. Moreover, since audio transcription and keyword annotation are very labor intensive and become infeasible for large collections, we introduce a preliminary framework that locates the subsections of the audio that correspond to a voice query made by a user. We demonstrate the utility of our approach on query retrieval tasks over various types of audio recordings. We also show that this simple framework can potentially retrieve and locate the voice query within the audio accurately and efficiently.