Improving social bookmark search using personalised latent variable language models

Authors:
Morgan Harvey; Ian Ruthven; Mark J. Carman
Affiliations:
University of Strathclyde, Glasgow, United Kingdom; University of Strathclyde, Glasgow, United Kingdom; University of Lugano, Lugano, Switzerland
Venue:
Proceedings of the fourth ACM international conference on Web search and data mining
Year:
2011

Citing 15

Unsupervised learning by probabilistic latent semantic analysis. Machine Learning
Latent dirichlet allocation. The Journal of Machine Learning Research
Usage patterns of collaborative tagging systems. Journal of Information Science
LDA-based document models for ad-hoc retrieval. SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Query performance prediction. Information Systems
A large-scale evaluation and analysis of personalized search strategies. Proceedings of the 16th international conference on World Wide Web
Flickr tag recommendation based on collective knowledge. Proceedings of the 17th international conference on World Wide Web
To personalize or not to personalize: modeling queries with variation in user intent. Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Factorization meets the neighborhood: a multifaceted collaborative filtering model. Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Clustering the tagged web. Proceedings of the Second ACM International Conference on Web Search and Data Mining
Personalization of tagging systems. Information Processing and Management: an International Journal
Latent dirichlet allocation for tag recommendation. Proceedings of the third ACM conference on Recommender systems
Information retrieval in folksonomies: search and ranking. ESWC'06 Proceedings of the 3rd European conference on The Semantic Web: research and applications
Personalizing web search with folksonomy-based user and document profiles. ECIR'2010 Proceedings of the 32nd European conference on Advances in Information Retrieval
Tripartite hidden topic models for personalised tag suggestion. ECIR'2010 Proceedings of the 32nd European conference on Advances in Information Retrieval

Cited 7

Comparing tweets and tags for URLs. ECIR'12 Proceedings of the 34th European conference on Advances in Information Retrieval
Improving search via personalized query expansion using social media. Information Retrieval
Social-network analysis using topic models. SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
Exploring generative models of tripartite graphs for recommendation in social media. Proceedings of the 4th International Workshop on Modeling Social Media
Incorporating popularity in topic models for social network analysis. Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
Building user profiles from topic models for personalised search. Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
Social Link Prediction in Online Social Tagging Systems. ACM Transactions on Information Systems (TOIS)

Abstract

Social tagging systems have recently become very popular as a method of categorising information online and have been used to annotate a wide range of different resources. In such systems users are free to choose whatever keywords or "tags" they wish to annotate each resource, resulting in a highly personalised, unrestricted vocabulary. While this freedom of choice has several notable advantages, it comes at the cost of making these systems more difficult to search, as the vocabulary problem it introduces is more pronounced than in a normal information retrieval setting. In this paper we propose to use hidden topic models as a principled way of reducing the dimensionality of this data to provide more accurate resource rankings with higher recall. We first describe Latent Dirichlet Allocation (LDA), a simple topic model, and then introduce two extended models which can be used to personalise the results by including information about the user who made each annotation. We test these three models and compare them with three non-topic-model baselines on a large data sample obtained from the Delicious social bookmarking site. Our evaluations show that our methods significantly outperform all of the baselines, with the personalised models also improving significantly upon unpersonalised LDA.