Using collaborative filtering to weave an information tapestry
Communications of the ACM - Special issue on information filtering
Item-based collaborative filtering recommendation algorithms
Proceedings of the 10th international conference on World Wide Web
Eigentaste: A Constant Time Collaborative Filtering Algorithm
Information Retrieval
Amazon.com Recommendations: Item-to-Item Collaborative Filtering
IEEE Internet Computing
Item-based top-N recommendation algorithms
ACM Transactions on Information Systems (TOIS)
Being accurate is not enough: how accuracy metrics have hurt recommender systems
CHI '06 Extended Abstracts on Human Factors in Computing Systems
The Long Tail: Why the Future of Business Is Selling Less of More
The Long Tail: Why the Future of Business Is Selling Less of More
Journal of Cognitive Neuroscience
Factorization meets the neighborhood: a multifaceted collaborative filtering model
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Performance of recommender algorithms on top-n recommendation tasks
Proceedings of the fourth ACM conference on Recommender systems
Proceedings of the 2013 international conference on Intelligent user interfaces
Proceedings of the 2013 International Symposium on Wearable Computers
On mining mobile apps usage behavior for predicting apps usage in smartphones
Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
Ranking fraud detection for mobile apps: a holistic view
Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
The Netflix competition of 2006 [2] spurred significant activity in the recommendations field, particularly in approaches using latent factor models [3,5,8,12]. However, the near ubiquity of the Netflix and the similar MovieLens datasets may be narrowing the generality of the lessons learned in this field. At GetJar, our goal is to make appealing recommendations of mobile applications (apps). For app usage, we observe a distribution with higher kurtosis (a heavier head and a longer tail) than that of the aforementioned movie datasets. This happens primarily because of the large disparity in resources available to app developers and the low cost of app publication relative to movies. In this paper we compare a latent factor model (PureSVD) and a memory-based model with our novel PCA-based model, which we call Eigenapp. We use both accuracy and variety as evaluation metrics. PureSVD did not perform well because it relies on explicit feedback such as ratings, which we do not have. Memory-based approaches that perform vector operations in the original high-dimensional space over-predict popular apps because they fail to capture the neighborhood of less popular apps. They achieve high accuracy because of the concentration of mass in the head, but do poorly in terms of the variety of apps exposed. Eigenapp, which exploits neighborhood information in low-dimensional spaces, did well on both precision and variety, underscoring the importance of dimensionality reduction for forming quality neighborhoods in high-kurtosis distributions.
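To make the idea of forming item neighborhoods in a low-dimensional space concrete, here is a minimal sketch. It is not the Eigenapp algorithm itself, whose details the abstract does not give. It assumes a binary user-by-app usage matrix, uses plain PCA computed through an SVD, and ranks neighbors by cosine similarity; the function name `low_dim_item_neighbors` and its parameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def low_dim_item_neighbors(usage, n_components, k):
    """Illustrative only: k nearest apps per app in a PCA-reduced space.

    usage: (n_users, n_items) array of 0/1 implicit feedback (assumed input).
    Returns an (n_items, k) array of neighbor indices, most similar first.
    """
    # Treat each app as a sample whose features are the users who used it.
    A = usage.T.astype(float)                  # (n_items, n_users)
    A = A - A.mean(axis=0, keepdims=True)      # center each user feature

    # PCA via SVD: U * s gives each app's scores on the principal components.
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    emb = U[:, :n_components] * s[:n_components]

    # Cosine similarity between apps in the reduced space.
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    emb = emb / np.where(norms == 0, 1.0, norms)
    sim = emb @ emb.T
    np.fill_diagonal(sim, -np.inf)             # an app is not its own neighbor

    # Sort each row by descending similarity and keep the top k.
    return np.argsort(-sim, axis=1)[:, :k]


# Toy usage: three users use apps 0 and 1, three other users use apps 2 and 3.
toy_usage = np.array([[1, 1, 0, 0]] * 3 + [[0, 0, 1, 1]] * 3)
toy_neighbors = low_dim_item_neighbors(toy_usage, n_components=1, k=1)
```

In this sketch the neighbor search operates on a few principal-component scores per app rather than on the full user dimension, which is the general property the abstract credits for better neighborhoods among less popular apps. How the paper actually builds and uses its projection may differ.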