From Twitter to Facebook to Reddit, users have become accustomed to sharing the articles they read with friends or followers on their social networks. While previous work has modeled what these shared stories say about the user who shares them, the converse question remains unexplored: what can we learn about an article from the identities of its likely readers? To address this question, we model the content of news articles and blog posts by attributes of the people who are likely to share them. For example, many Twitter users describe themselves in a short profile, labeling themselves with phrases such as "vegetarian" or "liberal." By assuming that a user's labels correspond to topics in the articles they share, we can learn a labeled dictionary from a training corpus of articles shared on Twitter. Thereafter, we can code any new document as a sparse non-negative linear combination of user labels, where a structured sparsity penalty encourages correlated labels to appear together in the output. Finally, we show that our approach yields a novel document representation that can be used effectively in many problem settings, from recommendation to modeling news dynamics. For example, while the top politics stories will change drastically from one month to the next, the "politics" label will still be there to describe them. We evaluate our model on millions of tweeted news articles and blog posts collected between September 2010 and September 2012, demonstrating that our approach is effective.
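The coding step described above, representing a new document as a sparse non-negative combination of label atoms, can be illustrated with a minimal sketch. This is not the paper's method: a plain L1 penalty stands in for the structured sparsity penalty, the dictionary is assumed to be already learned, and the function name, parameters, and toy vocabulary are all hypothetical. The solver is ordinary projected proximal gradient descent on 0.5 * ||x - Dw||^2 + l1 * sum(w) subject to w >= 0.

```python
def nonneg_sparse_code(doc, dictionary, l1=0.1, iters=500):
    """Code `doc` (a list of word weights) over label atoms in `dictionary`.

    `dictionary` maps a label (hypothetical, e.g. "politics") to a word-weight
    vector of the same length as `doc`. Returns a dict label -> weight >= 0.
    Assumption: plain L1 penalty, not the structured penalty from the abstract.
    """
    labels = list(dictionary)
    atoms = [dictionary[label] for label in labels]

    # The squared Frobenius norm of the dictionary upper-bounds the Lipschitz
    # constant of the gradient, so its inverse is a safe step size.
    step = 1.0 / sum(v * v for atom in atoms for v in atom)

    weights = [0.0] * len(atoms)
    for _ in range(iters):
        # Residual of the current reconstruction: D w - x.
        residual = [
            sum(w * atom[j] for w, atom in zip(weights, atoms)) - doc[j]
            for j in range(len(doc))
        ]
        # Gradient step, L1 shrinkage, and projection onto w >= 0 in one move.
        weights = [
            max(0.0, w - step * (sum(r * a for r, a in zip(residual, atom)) + l1))
            for w, atom in zip(weights, atoms)
        ]
    return dict(zip(labels, weights))


# Toy usage over a hypothetical vocabulary [election, senate, tofu, recipe].
toy_dictionary = {
    "politics": [1.0, 1.0, 0.0, 0.0],
    "vegetarian": [0.0, 0.0, 1.0, 1.0],
}
toy_doc = [2.0, 2.0, 0.0, 0.0]
toy_code = nonneg_sparse_code(toy_doc, toy_dictionary)
```

In this toy case the document uses only political vocabulary, so the "vegetarian" weight stays at zero while "politics" absorbs the whole reconstruction. A structured penalty of the kind the abstract mentions would additionally couple the weights of correlated labels, which this sketch does not attempt.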