K-BOX: a query-by-singing based music retrieval system

Authors:
Dacheng Tao; Hao Liu; Xiaoou Tang
Affiliations:
Chinese University of Hong Kong, Shatin, Hong Kong; Chinese University of Hong Kong, Shatin, Hong Kong; Chinese University of Hong Kong, Shatin, Hong Kong
Venue:
Proceedings of the 12th annual ACM international conference on Multimedia
Year:
2004

Abstract

In this paper, we present an efficient query-by-singing based music retrieval system. We first combine multiple Support Vector Machines by classifier committee learning to segment the sentences from a song automatically. Several new methods for manipulating the Mel-Frequency Cepstral Coefficient (MFCC) matrix are studied and compared for optimal feature selection. Experiments show that, of the 13 coefficients, the 3rd is the most relevant to music comparison, and that the proposed simplified MFCC feature achieves a reasonable trade-off between accuracy and efficiency. To improve system efficiency, we reorganize the database with a new two-stage clustering scheme in both time space and feature space. For feature space clustering, we combine the K-means algorithm with a dynamic time warping similarity measure. We also propose a new method for model selection in the K-means algorithm. Experiments show that the proposed approach achieves an increase in accuracy of more than 30 percent while reducing average query time by a factor of more than 16.
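
The abstract names dynamic time warping (DTW) as the similarity measure used during feature space clustering but does not give its formulation. As a rough illustration only, the sketch below shows the textbook DTW distance between two one-dimensional feature sequences (for example, the trajectory of a single MFCC coefficient over time), plus a way a query might be routed to its closest cluster representative. The function names `dtw_distance` and `nearest_cluster`, the absolute-difference local cost, and the use of a plain list of centroids are all assumptions made for this example, not details taken from the paper.

```python
def dtw_distance(a, b):
    """Textbook DTW distance between two 1-D sequences (assumed local cost: |x - y|)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def nearest_cluster(query, centroids):
    """Hypothetical helper: index of the centroid closest to the query under DTW."""
    return min(range(len(centroids)), key=lambda k: dtw_distance(query, centroids[k]))
```

Because DTW tolerates local stretching in time, a sung query that holds a note longer than the stored version can still align with zero cost, which is one reason such a measure suits singing input. Comparing a query against a few cluster representatives first, rather than against every stored sentence, is the general kind of pruning that a clustered database makes possible.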