Effective music tagging through advanced statistical modeling

  • Authors:
  • Jialie Shen; Wang Meng; Shuicheng Yan; HweeHwa Pang; Xiansheng Hua

  • Affiliations:
  • SMU, Singapore, Singapore; Microsoft Research, Beijing, China; NUS, Singapore, Singapore; SMU, Singapore, Singapore; Microsoft Research, Beijing, China

  • Venue:
  • Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval
  • Year:
  • 2010

Abstract

Music information retrieval (MIR) holds great promise as a technology for managing large music archives. One of its key components, and an active area of research, is music tagging. While significant progress has been achieved, most existing systems still adopt a simple classification approach, applying machine learning classifiers directly to low-level acoustic features. Consequently, they suffer from (1) poor accuracy, (2) a lack of comprehensive evaluation results and associated analysis on large-scale datasets, and (3) incomplete content representation, arising from the absence of multimodal and temporal information integration. In this paper, we introduce a novel system called MMTagger that effectively integrates both multimodal and temporal information in representing the music signal. The carefully designed multilayer architecture of the proposed classification framework seamlessly combines multiple Gaussian mixture models (GMMs) and a support vector machine (SVM) in a single framework. This structure preserves more discriminative information, leading to more accurate and robust tagging. Experimental results on two large music collections highlight the advantages of our multilayer framework over state-of-the-art techniques.
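
As a rough illustration of the two-layer GMM-to-SVM pattern the abstract describes, the Python sketch below fits one GMM per tag value on acoustic features and feeds the stacked per-class log-likelihoods to an SVM for the final tag decision. This is a minimal sketch under assumed details: the toy data, feature dimensions, tag set, and layering choices are illustrative, not the authors' actual MMTagger implementation.

    # Illustrative two-layer GMM -> SVM tagging sketch (assumed
    # structure; the paper's exact features and fusion scheme
    # are not specified in the abstract).
    import numpy as np
    from sklearn.mixture import GaussianMixture
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)

    # Toy data: 200 clips, each a 13-dim acoustic feature vector
    # (e.g. frame-averaged MFCCs), with a binary tag label.
    X = rng.normal(size=(200, 13))
    y = rng.integers(0, 2, size=200)

    # Layer 1: fit one GMM per tag value on that class's examples.
    gmms = {}
    for label in (0, 1):
        gmm = GaussianMixture(n_components=4, covariance_type="diag",
                              random_state=0)
        gmm.fit(X[y == label])
        gmms[label] = gmm

    # Intermediate representation: stacked per-class log-likelihoods,
    # which keep more discriminative information than a hard GMM
    # decision would.
    def gmm_features(samples):
        return np.column_stack(
            [gmms[label].score_samples(samples) for label in (0, 1)]
        )

    # Layer 2: an SVM over the GMM likelihood features makes the
    # final tag decision.
    svm = SVC(kernel="rbf").fit(gmm_features(X), y)
    print(svm.predict(gmm_features(X[:5])))

In a multimodal, temporal setting one would presumably compute such GMM likelihood features per modality and per time segment and concatenate them before the SVM stage, but the specific combination used by MMTagger is described in the paper itself.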