We present a computer audition system that can both annotate novel audio tracks with semantically meaningful words and retrieve relevant tracks from a database of unlabeled audio content given a text-based query. We consider the related tasks of content-based audio annotation and retrieval as one supervised multiclass, multilabel problem in which we model the joint probability of acoustic features and words. We collect a data set of 1700 human-generated annotations that describe 500 Western popular music tracks. For each word in a vocabulary, we use this data to train a Gaussian mixture model (GMM) over an audio feature space. We estimate the parameters of the model using the weighted mixture hierarchies expectation maximization algorithm. This algorithm is more scalable to large data sets and produces better density estimates than standard parameter estimation techniques. The quality of the music annotations produced by our system is comparable with the performance of humans on the same task. Our "query-by-text" system can retrieve appropriate songs for a large number of musically relevant words. We also show that our audition system is general by learning a model that can annotate and retrieve sound effects.
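The per-word modeling described above can be illustrated with a minimal sketch. It is not the system from the paper: it fits each word's GMM with plain EM rather than the weighted mixture hierarchies algorithm, uses synthetic two-dimensional "audio features", and scores a track by the frame-averaged log-likelihood under each word model, normalized across words with a uniform word prior. The vocabulary (`calm`, `loud`), the track names, and all function names are hypothetical.

```python
import numpy as np

def logsumexp(a, axis):
    # Numerically stable log(sum(exp(a))) along one axis.
    m = a.max(axis=axis, keepdims=True)
    return (m + np.log(np.exp(a - m).sum(axis=axis, keepdims=True))).squeeze(axis)

def component_log_probs(X, weights, means, variances):
    # Log of weight_k * N(x | mean_k, diag(var_k)) for every frame and component.
    diff = X[:, None, :] - means[None, :, :]
    log_gauss = -0.5 * (np.sum(diff ** 2 / variances[None], axis=2)
                        + np.sum(np.log(2 * np.pi * variances), axis=1)[None, :])
    return np.log(weights)[None, :] + log_gauss

def fit_diag_gmm(X, k, n_iter=50, seed=0):
    # Plain EM for a diagonal-covariance GMM (assumption: stands in for the
    # weighted mixture hierarchies EM named in the abstract).
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    means = X[rng.choice(n, size=k, replace=False)].copy()
    variances = np.tile(X.var(axis=0) + 1e-3, (k, 1))
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each frame.
        log_p = component_log_probs(X, weights, means, variances)
        resp = np.exp(log_p - logsumexp(log_p, axis=1)[:, None])
        # M-step: re-estimate weights, means, and variances.
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        variances = np.maximum((resp.T @ (X ** 2)) / nk[:, None] - means ** 2, 1e-3)
    return weights, means, variances

def avg_log_likelihood(frames, model):
    # Frame-averaged log-likelihood of a track under one word's GMM.
    return logsumexp(component_log_probs(frames, *model), axis=1).mean()

# Synthetic stand-in for audio features: each word has its own feature region.
rng = np.random.default_rng(0)
centers = {"calm": (-2.0, -2.0), "loud": (2.0, 2.0)}

def toy_track(center, n_frames=200):
    return rng.normal(loc=center, scale=1.0, size=(n_frames, 2))

# One GMM per vocabulary word, trained on frames from tracks labeled with it.
train = {w: np.vstack([toy_track(c) for _ in range(5)]) for w, c in centers.items()}
models = {w: fit_diag_gmm(X, k=2) for w, X in train.items()}

# Unlabeled database to annotate and search.
database = {"track_a": toy_track(centers["calm"]),
            "track_b": toy_track(centers["loud"])}

def word_posteriors(frames):
    # Normalize per-word scores across the vocabulary (uniform word prior).
    words = sorted(models)
    scores = np.array([avg_log_likelihood(frames, models[w]) for w in words])
    return dict(zip(words, np.exp(scores - logsumexp(scores, axis=0))))

def annotate(frames, top_n=1):
    # Annotation: the most probable words for one track.
    post = word_posteriors(frames)
    return sorted(post, key=post.get, reverse=True)[:top_n]

def retrieve(word):
    # Retrieval: rank database tracks by the query word's probability.
    return sorted(database, key=lambda t: word_posteriors(database[t])[word],
                  reverse=True)

print(annotate(database["track_a"]))
print(retrieve("loud"))
```

Both tasks share the same set of word models: annotation ranks words for a fixed track, while retrieval ranks tracks for a fixed word, which is why the abstract can treat them as one supervised problem.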