Ranking resources in social tagging systems is difficult because the data are inherently sparse and a completely unrestricted lexicon introduces vocabulary problems. In this paper we propose hidden topic models as a principled way to reduce the dimensionality of this data and so provide more accurate resource rankings with higher recall. We first describe Latent Dirichlet Allocation (LDA) and then show how it can be used to rank resources in a social bookmarking system. We test the LDA tagging model against three non-topic-model baselines on a large data sample obtained from the Delicious social bookmarking site. Our evaluations show that the LDA-based method significantly outperforms all of the baselines.
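To make the ranking idea concrete, the following is a minimal sketch of how topic distributions could be used to score resources for a tag query. It assumes a query-likelihood form, P(tag | resource) = sum over topics of P(tag | topic) * P(topic | resource), which is a common way to use LDA as a document model; the abstract does not state the exact scoring function, so this is an assumption rather than the paper's method. The distributions, the resource names (`url_a`, `url_b`, `url_c`), the topic names, and the helper functions are all hypothetical and hand-set for illustration; in practice they would come from a fitted LDA model.

```python
from math import log

# Hypothetical, hand-set distributions (NOT fitted and NOT taken from the paper).
# P(tag | topic): each inner dict sums to 1 over a toy tag vocabulary.
TAG_GIVEN_TOPIC = {
    "programming": {"python": 0.5, "code": 0.4, "recipes": 0.1},
    "cooking": {"python": 0.0, "code": 0.1, "recipes": 0.9},
}

# P(topic | resource): each inner dict sums to 1 over the topics.
TOPIC_GIVEN_RESOURCE = {
    "url_a": {"programming": 0.9, "cooking": 0.1},
    "url_b": {"programming": 0.2, "cooking": 0.8},
    "url_c": {"programming": 0.5, "cooking": 0.5},
}


def tag_probability(tag, resource):
    """P(tag | resource), marginalizing over the latent topics."""
    return sum(
        TAG_GIVEN_TOPIC[topic].get(tag, 0.0) * weight
        for topic, weight in TOPIC_GIVEN_RESOURCE[resource].items()
    )


def rank_resources(query_tags, floor=1e-12):
    """Rank resources by the log-likelihood of the query tags (assumed scoring)."""
    scores = {
        resource: sum(
            # The floor avoids log(0) for tags with zero probability.
            log(max(tag_probability(tag, resource), floor))
            for tag in query_tags
        )
        for resource in TOPIC_GIVEN_RESOURCE
    }
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    print(rank_resources(["python"]))
    print(rank_resources(["recipes", "code"]))
```

Because every resource has a nonzero weight on each topic, a resource can receive a nonzero score for a tag it was never annotated with, as long as that tag is probable under one of the resource's topics. This is the kind of smoothing over a sparse, unrestricted tag vocabulary that dimensionality reduction is meant to provide.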