Music identification via vocabulary tree with MFCC peaks

Authors:
Tianjing Xu;Adams Wei Yu;Xianglong Liu;Bo Lang
Affiliations:
State Key Laboratory of Software Development Environment, Beihang University, Beijing, China;State Key Laboratory of Software Development Environment, Beihang University, Beijing, China;State Key Laboratory of Software Development Environment, Beihang University, Beijing, China;State Key Laboratory of Software Development Environment, Beihang University, Beijing, China
Venue:
MIRUM '11 Proceedings of the 1st international ACM workshop on Music information retrieval with user-centered and multimodal strategies
Year:
2011

Abstract

In this paper, we propose a Vocabulary Tree based framework for music identification, where the goal is to recognize a query fragment against a song database. The key to high recognition precision within this framework is a novel feature, MFCC Peaks, which combines MFCC and Spectral Peaks features. Our approach consists of three stages. We first build the Vocabulary Tree from 2 million MFCC Peaks features extracted from hundreds of songs. Each song in the database is then quantized into words by traversing the tree from the root down to a leaf. Given a query, we apply the same quantization procedure to the fragment, score the songs in the archive according to the TF-IDF scheme, and return the best matches. Experimental results demonstrate that the proposed feature has strong identification and generalization ability. Further experiments show that our approach scales well with the size of the database. A comparison also shows that, while our algorithm achieves approximately the same retrieval precision as other state-of-the-art methods, it costs less time and memory.
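
The three stages described in the abstract (build the tree, quantize each song into words, score a query by TF-IDF) can be illustrated with a minimal sketch. This is not the authors' implementation: the 2-D synthetic points stand in for MFCC Peaks features, the branching factor, tree depth, and cosine-similarity scoring are assumptions chosen for brevity, and all function names (`build_tree`, `quantize`, `build_index`, `rank`) are hypothetical.

```python
import math
import random
from collections import Counter

def dist2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(v, centers):
    """Index of the center closest to v."""
    return min(range(len(centers)), key=lambda i: dist2(v, centers[i]))

def assign(vectors, centers):
    """Group vectors by their nearest center."""
    groups = [[] for _ in centers]
    for v in vectors:
        groups[nearest(v, centers)].append(v)
    return groups

def kmeans(vectors, k, rng, iters=10):
    """Plain Lloyd's k-means; an empty cluster keeps its previous center."""
    centers = rng.sample(vectors, k)
    for _ in range(iters):
        groups = assign(vectors, centers)
        centers = [tuple(sum(c) / len(g) for c in zip(*g)) if g else old
                   for g, old in zip(groups, centers)]
    return centers, assign(vectors, centers)

def build_tree(vectors, branch, depth, rng):
    """Stage 1: hierarchical k-means over training features. None marks a leaf."""
    if depth == 0 or len(vectors) < branch:
        return None
    centers, groups = kmeans(vectors, branch, rng)
    return {"centers": centers,
            "children": [build_tree(g, branch, depth - 1, rng) for g in groups]}

def quantize(tree, v):
    """Stage 2: descend from the root to a leaf; the branch path is the word."""
    path, node = [], tree
    while node is not None:
        i = nearest(v, node["centers"])
        path.append(i)
        node = node["children"][i]
    return tuple(path)

def weigh(counts, idf):
    """L2-normalized TF-IDF vector; words unseen in the database get weight 0."""
    total = sum(counts.values())
    vec = {w: c / total * idf.get(w, 0.0) for w, c in counts.items()}
    norm = math.sqrt(sum(x * x for x in vec.values())) or 1.0
    return {w: x / norm for w, x in vec.items()}

def build_index(tree, songs):
    """Quantize every song and compute its TF-IDF vector plus the IDF table."""
    counts = {name: Counter(quantize(tree, v) for v in feats)
              for name, feats in songs.items()}
    df = Counter(w for c in counts.values() for w in c)
    idf = {w: math.log(len(songs) / d) for w, d in df.items()}
    return {name: weigh(c, idf) for name, c in counts.items()}, idf

def rank(tree, index, idf, fragment):
    """Stage 3: score each song by cosine similarity to the query (assumed metric)."""
    q = weigh(Counter(quantize(tree, v) for v in fragment), idf)
    scores = {name: sum(x * vec.get(w, 0.0) for w, x in q.items())
              for name, vec in index.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Synthetic demo: three "songs" whose features cluster around different points.
rng = random.Random(0)
songs = {name: [(rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)) for _ in range(60)]
         for name, (cx, cy) in {"a": (0, 0), "b": (10, 10), "c": (20, 0)}.items()}
all_feats = [v for feats in songs.values() for v in feats]
tree = build_tree(all_feats, branch=3, depth=2, rng=rng)
index, idf = build_index(tree, songs)
print(rank(tree, index, idf, songs["b"][:20]))  # query: a fragment of song "b"
```

In this sketch each word is the tuple of branch indices along the root-to-leaf path, so quantizing a feature costs `branch * depth` distance computations rather than a comparison against every leaf, which is the property that lets a vocabulary tree scale to large vocabularies.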