SRbench--a benchmark for soundtrack recommendation systems
CIKM '13: Proceedings of the 22nd ACM International Conference on Information & Knowledge Management
In this work, we propose a benchmark for evaluating the retrieval performance of soundtrack recommendation systems, i.e., systems that find songs to be played as background music for a given set of images. The benchmark is based on preference judgments: relevance is treated as a continuous ordinal variable, and judgments are collected for pairs of songs with respect to a query (a set of images). To capture a wide variety of songs and images, the benchmark spans a large space of music genres, emotions expressed through music, and query-image themes. It comprises two types of relevance assessments: (i) judgments obtained from a user study, which serve as a "gold standard" for (ii) relevance judgments gathered through Amazon's Mechanical Turk. We report on the performance of two state-of-the-art soundtrack recommendation systems using the proposed benchmark.
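As a minimal sketch of how pairwise preference judgments can score a system's output, the Python snippet below computes the fraction of judged song pairs whose preferred song is ranked higher by the system. This is an illustrative assumption, not the paper's stated metric: the function name pairwise_agreement, the song ids, and the example judgments are all hypothetical.

    # A minimal sketch (not necessarily the benchmark's exact measure):
    # score a ranking by the fraction of judged pairs whose preferred song
    # appears higher in the system's ranking for the query.
    def pairwise_agreement(ranking, preferences):
        """ranking: song ids, best first.
        preferences: dict mapping (song_a, song_b) -> preferred song id."""
        position = {song: rank for rank, song in enumerate(ranking)}
        agreed = judged = 0
        for (a, b), preferred in preferences.items():
            if a in position and b in position:  # skip pairs with unranked songs
                judged += 1
                higher = a if position[a] < position[b] else b
                agreed += (higher == preferred)
        return agreed / judged if judged else 0.0

    # Hypothetical example: three candidate songs judged for one image-set query.
    ranking = ["song_1", "song_2", "song_3"]
    preferences = {("song_1", "song_2"): "song_1",
                   ("song_2", "song_3"): "song_3",
                   ("song_1", "song_3"): "song_1"}
    print(pairwise_agreement(ranking, preferences))  # 0.666..., 2 of 3 pairs agree

Under this reading, the user-study judgments would define the preference pairs, and a system's score rises with the number of collected preferences its ranking respects.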