Semantic Annotation and Retrieval of Music and Sound Effects

  • Authors:
  • D. Turnbull, L. Barrington, D. Torres, G. Lanckriet

  • Affiliations:
  • Dept. of Comput. Sci. & Eng., Univ. of California at San Diego, La Jolla, CA

  • Venue:
  • IEEE Transactions on Audio, Speech, and Language Processing
  • Year:
  • 2008

Abstract

We present a computer audition system that can both annotate novel audio tracks with semantically meaningful words and retrieve relevant tracks from a database of unlabeled audio content given a text-based query. We consider the related tasks of content-based audio annotation and retrieval as one supervised multiclass, multilabel problem in which we model the joint probability of acoustic features and words. We collect a data set of 1700 human-generated annotations that describe 500 Western popular music tracks. For each word in a vocabulary, we use this data to train a Gaussian mixture model (GMM) over an audio feature space. We estimate the parameters of the model using the weighted mixture hierarchies expectation maximization algorithm. This algorithm is more scalable to large data sets and produces better density estimates than standard parameter estimation techniques. The quality of the music annotations produced by our system is comparable with the performance of humans on the same task. Our "query-by-text" system can retrieve appropriate songs for a large number of musically relevant words. We also show that our audition system is general by learning a model that can annotate and retrieve sound effects.
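For context, the per-word GMM approach the abstract describes can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it fits each word's GMM with standard EM via scikit-learn, whereas the paper uses the weighted mixture hierarchies EM algorithm, and the data structures and helper names (`tracks`, `annotations`, `train_word_models`, etc.) are hypothetical.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def train_word_models(tracks, annotations, vocabulary, n_components=8):
    """Fit one GMM per vocabulary word over the audio feature space.

    tracks: dict of track_id -> (n_frames, n_dims) feature matrix
            (e.g., frame-level MFCC-based features).
    annotations: dict of track_id -> set of words describing that track.
    """
    models = {}
    for word in vocabulary:
        # Pool feature frames from every track annotated with this word.
        frames = [tracks[tid] for tid, words in annotations.items()
                  if word in words]
        if frames:
            X = np.vstack(frames)
            models[word] = GaussianMixture(
                n_components=n_components, covariance_type="diag").fit(X)
    return models

def annotate(models, features, top_k=10):
    """Annotate a novel track: rank words by average log-likelihood
    of the track's feature frames under each word's GMM."""
    scores = {w: gmm.score(features) for w, gmm in models.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

def retrieve(models, tracks, query_word, top_k=10):
    """Query-by-text retrieval: rank unlabeled tracks by likelihood
    under the query word's GMM."""
    gmm = models[query_word]
    scores = {tid: gmm.score(X) for tid, X in tracks.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

Annotation and retrieval here are two directions of the same likelihood computation, which is why the paper can treat both tasks with a single joint model of acoustic features and words.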