Challenging Uncertainty in Query by Humming Systems: A Fingerprinting Approach

  • Authors:
  • E. Unal; E. Chew; P. G. Georgiou; S. S. Narayanan

  • Affiliations:
  • Dept. of Electr. Eng., Univ. of Southern California, Los Angeles, CA

  • Venue:
  • IEEE Transactions on Audio, Speech, and Language Processing
  • Year:
  • 2008


Abstract

Robust data retrieval in the presence of uncertainty is a challenging problem in multimedia information retrieval. In query-by-humming (QBH) systems, uncertainty can arise in query formulation due to user-dependent variability, such as incorrectly hummed notes, and in query transcription due to machine-based errors, such as insertions and deletions. We propose a fingerprinting (FP) algorithm for representing salient melodic information so as to better compare potentially noisy voice queries with target melodies in a database. The FP technique is employed in the QBH system back end; a hidden Markov model (HMM) front end segments and transcribes the hummed audio input into a symbolic representation. The performance of the FP search algorithm is compared to the conventional edit distance (ED) technique. Our retrieval database is built on 1500 MIDI files and evaluated using 400 hummed samples from 80 people with different musical backgrounds. A melody retrieval accuracy of 88% is demonstrated for humming samples from musically trained subjects, and 70% for samples from untrained subjects, for the FP algorithm. In contrast, the widely used ED method achieves 86% and 62% accuracy rates, respectively, for the same samples, thus suggesting that the proposed FP technique is more robust under uncertainty, particularly for queries by musically untrained users.
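For illustration only, the sketch below contrasts the conventional edit distance (ED) baseline mentioned in the abstract with a simple n-gram overlap score standing in for a melodic fingerprint. It is not the authors' FP algorithm or HMM front end; the function names, parameters, and example melodies are hypothetical, and melodies are assumed to be represented as pitch-interval sequences so that comparison is key-invariant.

```python
# Minimal sketch (not the paper's implementation): compare a transcribed
# hummed query against candidate melodies using (a) the edit-distance
# baseline and (b) a crude n-gram "fingerprint" overlap score.

from collections import Counter
from typing import List, Sequence


def pitch_intervals(midi_pitches: Sequence[int]) -> List[int]:
    """Convert absolute MIDI pitches to successive semitone intervals."""
    return [b - a for a, b in zip(midi_pitches, midi_pitches[1:])]


def edit_distance(query: Sequence[int], target: Sequence[int]) -> int:
    """Levenshtein distance: counts insertions, deletions, substitutions."""
    n = len(target)
    prev = list(range(n + 1))
    for i in range(1, len(query) + 1):
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if query[i - 1] == target[j - 1] else 1
            curr[j] = min(prev[j] + 1,         # deletion
                          curr[j - 1] + 1,     # insertion
                          prev[j - 1] + cost)  # substitution
        prev = curr
    return prev[n]


def ngram_fingerprint(intervals: Sequence[int], n: int = 3) -> Counter:
    """Bag of overlapping interval n-grams serving as a toy melodic fingerprint."""
    return Counter(tuple(intervals[i:i + n]) for i in range(len(intervals) - n + 1))


def fingerprint_overlap(query: Sequence[int], target: Sequence[int], n: int = 3) -> int:
    """Number of shared n-grams; higher suggests a closer melodic match."""
    return sum((ngram_fingerprint(query, n) & ngram_fingerprint(target, n)).values())


if __name__ == "__main__":
    # Hypothetical example: a hummed query with one wrong note vs. two targets.
    target_a = pitch_intervals([60, 62, 64, 65, 67, 65, 64, 62])
    target_b = pitch_intervals([60, 60, 67, 67, 69, 69, 67])
    query = pitch_intervals([60, 62, 64, 66, 67, 65, 64, 62])  # one note off

    for name, tgt in [("A", target_a), ("B", target_b)]:
        print(name, edit_distance(query, tgt), fingerprint_overlap(query, tgt))
```

In this toy setup the wrongly hummed note perturbs only the n-grams that contain it, while edit distance accumulates a penalty at that position; the paper's FP method pursues a related idea of matching salient local melodic patterns rather than aligning the full symbol sequence.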