Large-scale crossmedia retrieval for playlist generation and song discovery

Authors:
Filipe Coelho;José Devezas;Cristina Ribeiro
Affiliations:
Universidade do Porto;Universidade do Porto;Universidade do Porto
Venue:
Proceedings of the 10th Conference on Open Research Areas in Information Retrieval
Year:
2013


Abstract

To explore vast collections of audio content, users require automated tools capable of providing music search and recommendation even when faced with large-scale collections. Collaborative-filtering recommenders rely on user-generated information and may be hindered by a lack of users or by a bias toward certain popular genres, which encloses users in an information bubble. Audio content analysis, on the other hand, is a reliable source of audio similarity, used in tasks such as music classification. For highly interactive tasks, however, the performance of analysis algorithms becomes an issue. In this work, we address the playlist generation and song discovery tasks on large-scale datasets. We generate playlists and explore the collections with example-based queries using audio features, lyrics and tags. Approximate indexing and cross-media reranking are used for efficiency. Audio content is mapped to textual representations that can be handled by information retrieval libraries. We explored the feasibility of this content-based approach on the Million Song Dataset, a large-scale collection of audio features and associated text data comprising almost 300 GB of information. The proposed strategy can be used either independently, as a content-based music retrieval system, or as a component of hybrid recommender systems.
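The abstract states that audio content is mapped to textual representations that information retrieval libraries can handle, but does not say how. The sketch below shows one way such a mapping could work, a pivot-based encoding in which each feature vector is described by the reference points closest to it. This scheme is an assumption for illustration, not necessarily the authors' method. The function name `encode_as_tokens`, the pivot set, and the two-dimensional feature vectors are all hypothetical.

```python
import math


def encode_as_tokens(vector, pivots, k=3):
    """Return a text 'document' naming the k pivots closest to vector.

    Illustrative assumption: closer pivots are repeated more often, so a
    term-frequency-based text index would weight them more heavily.
    """
    # Rank pivot indices by Euclidean distance to the feature vector.
    ranked = sorted(range(len(pivots)),
                    key=lambda i: math.dist(vector, pivots[i]))
    tokens = []
    for rank, idx in enumerate(ranked[:k]):
        # The nearest pivot is emitted k times, the next k - 1 times, etc.
        tokens.extend([f"p{idx}"] * (k - rank))
    return " ".join(tokens)


# Hypothetical 2-D "audio features" and pivots, for illustration only.
pivots = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
song_a = (0.1, 0.2)
song_b = (0.9, 0.8)

doc_a = encode_as_tokens(song_a, pivots)
doc_b = encode_as_tokens(song_b, pivots)
print(doc_a)  # p0 p0 p0 p2 p2 p1
print(doc_b)  # p3 p3 p3 p1 p1 p2
```

Under this assumption, the resulting strings could be indexed like ordinary text alongside lyrics and tags, and songs with similar features would share many tokens. A query song would be encoded the same way, with the text engine returning approximate neighbours that a later step could rerank.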