Transcription from audio to musical representation is a challenging problem for Query by Humming (QBH) systems. In this paper, we propose a two-step note transcription process: an algorithm that uses a speech recognizer for note segmentation, followed by signal processing for robust location and capture of pitch and duration in the hummed audio input. In contrast to most Hidden Markov Model based approaches to QBH, which model and classify humming with a single universal model, we design a flexible speech recognizer that accepts different types of humming sounds in the input, providing efficient and accurate note segmentation and transcription. We use novel statistical energy and pitch analyses to correct potential insertion and deletion errors and thereby improve the system's performance, and we evaluate our algorithm with precision and recall tests. Using a large, previously collected database, we test various system configurations and report note segmentation results with and without the proposed augmentations. The augmented system robustly recognizes the location of hummed notes, with an F measure (combining precision and recall) of 0.84. As a second validation, we use the output of the transcription system for melody retrieval and find, for a database of 1000 melodies, 76% retrieval accuracy with automatically extracted queries and 83% with manually transcribed queries. Sensitivity analysis shows that, while it is possible to locate the position of the hummed notes accurately, incorrect segmentation results can have a negative effect on the retrieval performance of the QBH system.
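For readers unfamiliar with the evaluation metric, the following is a minimal sketch of how precision, recall and the F measure could be computed for note segmentation. It is not the authors' evaluation code. The function name `segmentation_scores`, the 50 ms matching tolerance, and the one-to-one in-order matching rule are all assumptions made for illustration; the abstract does not specify them.

```python
def segmentation_scores(reference_onsets, detected_onsets, tolerance=0.05):
    """Return (precision, recall, f_measure) for detected note onsets.

    Assumption (not from the paper): a detected onset is correct when it
    lies within `tolerance` seconds of a reference onset that has not
    already been matched. Onset times are in seconds.
    """
    reference = sorted(reference_onsets)
    detected = sorted(detected_onsets)

    matched = 0
    i = j = 0
    while i < len(reference) and j < len(detected):
        diff = detected[j] - reference[i]
        if abs(diff) <= tolerance:
            matched += 1      # correct detection
            i += 1
            j += 1
        elif diff < 0:
            j += 1            # detected onset with no reference: insertion
        else:
            i += 1            # reference onset with no detection: deletion

    precision = matched / len(detected) if detected else 0.0
    recall = matched / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical onset times: one inserted note (0.80 s) and one missed note (1.5 s).
reference = [0.0, 0.5, 1.0, 1.5]
detected = [0.02, 0.48, 0.80, 1.01]
print(segmentation_scores(reference, detected))  # (0.75, 0.75, 0.75)
```

In this toy example three of the four detected onsets match a reference onset, so precision and recall are both 0.75. Insertion errors lower precision and deletion errors lower recall, which is why correcting both kinds of error raises the F measure.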