Clustering as a way to better represent the diversity of text or image search results has been studied extensively. In this paper, we extend this methodology to the novel domain of music search. We conduct an empirical evaluation of different clustering algorithms, audio feature representations, and the incorporation of lyrics for music clustering. Our evaluation shows that fusing audio and text features yields the best clustering accuracy.
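As a rough illustration of what fusing audio and text features for clustering can look like, the sketch below standardizes each modality, concatenates the two blocks with a weight, and runs plain k-means on the result. The abstract does not say which fusion scheme or clustering algorithm performed best, so early fusion by weighted concatenation, k-means, the function names (`zscore`, `fuse`, `kmeans`), the `w_audio` parameter, and the toy feature values are all assumptions made for illustration only.

```python
import math

def zscore(rows):
    """Standardize each column to zero mean and unit variance."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    # A constant column has zero deviation; fall back to 1.0 to avoid dividing by zero.
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) or 1.0
            for j in range(d)]
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]

def fuse(audio, text, w_audio=0.5):
    """Early fusion (an assumed scheme): weighted concatenation of standardized blocks.

    Each block is scaled by sqrt(weight / block_dim) so that a modality with
    many dimensions does not dominate the squared Euclidean distance.
    """
    a, t = zscore(audio), zscore(text)
    scale_a = math.sqrt(w_audio / len(a[0]))
    scale_t = math.sqrt((1.0 - w_audio) / len(t[0]))
    return [[x * scale_a for x in ra] + [x * scale_t for x in rt]
            for ra, rt in zip(a, t)]

def kmeans(points, k, iters=50):
    """Lloyd's k-means with deterministic farthest-point initialization."""
    def dist2(p, q):
        return sum((x - y) ** 2 for x, y in zip(p, q))

    # Start from the first point, then repeatedly add the point farthest
    # from the centers chosen so far.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))

    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: dist2(p, centers[c]))
                      for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Made-up toy data: four songs, two audio features (tempo, energy)
# and two lyric term counts per song.
audio = [[120.0, 0.80], [118.0, 0.75], [70.0, 0.30], [72.0, 0.35]]
lyrics = [[3.0, 0.0], [2.0, 1.0], [0.0, 4.0], [1.0, 3.0]]

fused = fuse(audio, lyrics, w_audio=0.5)
labels = kmeans(fused, k=2)
print(labels)
```

Varying `w_audio` between 0 and 1 shifts the clustering from lyrics-only to audio-only, which is one simple way to compare single-modality and fused representations on the same data.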