Transparent user models for personalization

Authors:
Khalid El-Arini; Ulrich Paquet; Ralf Herbrich; Jurgen Van Gael; Blaise Agüera y Arcas
Affiliations:
Carnegie Mellon University, Pittsburgh, PA, USA; Microsoft Research, Cambridge, United Kingdom; Facebook, Inc., Menlo Park, CA, USA; Rangespan Ltd., London, United Kingdom; Microsoft Corp., Bellevue, WA, USA
Venue:
Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Year:
2012

Abstract

Personalization is a ubiquitous phenomenon in our daily online experience. While such technology is critical for helping us combat the overload of information we face, in many cases, we may not even realize that our results are being tailored to our personal tastes and preferences. Worse yet, when such a system makes a mistake, we have little recourse to correct it. In this work, we propose a framework for addressing this problem by developing a new user-interpretable feature set upon which to base personalized recommendations. These features, which we call badges, represent fundamental traits of users (e.g., "vegetarian" or "Apple fanboy") inferred by modeling the interplay between a user's behavior and self-reported identity. Specifically, we consider the microblogging site Twitter, where users provide short descriptions of themselves in their profiles, as well as perform actions such as tweeting and retweeting. Our approach is based on the insight that we can define badges using high-precision, low-recall rules (e.g., "Twitter profile contains the phrase 'Apple fanboy'"), and with enough data, generalize to other users by observing shared behavior. We develop a fully Bayesian, generative model that describes this interaction, while allowing us to avoid the pitfalls associated with having positive-only data. Experiments on real Twitter data demonstrate the effectiveness of our model at capturing rich and interpretable user traits that can be used to provide transparency for personalization.
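
The rule-based badge definitions mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' Bayesian model; it only shows the labeling step, under the assumption that a badge is triggered by a case-insensitive phrase match on the profile text. The names `BADGE_PHRASES`, `label_badges`, and the sample profiles are hypothetical and chosen for illustration.

```python
# Hypothetical badge definitions: badge name -> trigger phrases (lowercase).
# Assumption: a badge fires when any of its phrases appears in the profile.
BADGE_PHRASES = {
    "Apple fanboy": ["apple fanboy"],
    "vegetarian": ["vegetarian"],
}


def label_badges(profile_text):
    """Return the set of badges whose trigger phrase occurs in the profile."""
    text = profile_text.lower()
    return {
        badge
        for badge, phrases in BADGE_PHRASES.items()
        if any(phrase in text for phrase in phrases)
    }


# Made-up profiles, for illustration only.
profiles = {
    "user_a": "Apple fanboy, coffee addict, dad.",
    "user_b": "I tweet about iPhones all day.",
    "user_c": "Vegetarian runner from Pittsburgh.",
}

labels = {user: label_badges(text) for user, text in profiles.items()}

for user, badges in labels.items():
    # An empty set means "no rule fired", not "the user lacks the trait".
    print(user, sorted(badges))
```

In this sketch `user_b` receives no badge even though the profile suggests an interest in Apple products. That is the low-recall, positive-only situation the abstract refers to: a missing label cannot be read as a negative example, which is why the paper turns to a generative model over shared behavior rather than treating rule output as complete ground truth.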