We consider fingerprinting methods for collaborative filtering (CF) systems. CF systems generally show their real strength when supplied with very large data sets. Earlier work has already suggested sketching techniques for handling massive amounts of information, but most prior analysis has been limited to non-ranking application scenarios and has been mainly theoretical. We demonstrate how to use fingerprinting methods to compute a family of rank correlation coefficients. Our methods identify users who have similar rankings over a given set of items, a problem that lies at the heart of CF applications. We show that our method approximates rank correlations with high accuracy and confidence. We evaluate the suggested methods empirically through a recommender system for the Netflix dataset, showing that the required fingerprint sizes are even smaller than the theoretical analysis suggests. We also explore the use of standard hash functions rather than min-wise independent hashes, and the relation between the quality of the final recommendations and the fingerprint size.
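
To make the idea of a rank-correlation fingerprint concrete, the following is a minimal illustrative sketch and not the construction analysed in the paper. It makes simplifying assumptions: every user ranks the same item set without ties, the hash is an ordinary SHA-256-based function rather than a min-wise independent family, and all names (pair_hash, fingerprint, estimate_kendall_tau) are hypothetical. Each of k hash functions selects one item pair, the fingerprint stores one bit recording which item of that pair the user ranks higher, and the fraction of matching bits between two fingerprints gives an estimate of Kendall's tau.

```python
import hashlib
from itertools import combinations


def pair_hash(seed: int, a: int, b: int) -> int:
    """Standard (not min-wise independent) hash of an item pair; an assumption."""
    data = f"{seed}:{a}:{b}".encode()
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def fingerprint(ranking: dict, items: list, k: int) -> list:
    """Build a k-bit fingerprint of a ranking.

    ranking maps item -> rank position (lower means preferred).
    For each seed, the item pair with the smallest hash value is selected,
    and one bit records whether the first item of the pair is ranked higher.
    Enumerating all pairs is quadratic and is done here only for clarity.
    """
    pairs = list(combinations(sorted(items), 2))
    bits = []
    for seed in range(k):
        a, b = min(pairs, key=lambda p: pair_hash(seed, p[0], p[1]))
        bits.append(1 if ranking[a] < ranking[b] else 0)
    return bits


def estimate_kendall_tau(fp_u: list, fp_v: list) -> float:
    """Estimate Kendall's tau as 2 * (fraction of agreeing bits) - 1."""
    agree = sum(x == y for x, y in zip(fp_u, fp_v))
    return 2 * agree / len(fp_u) - 1


if __name__ == "__main__":
    items = list(range(6))
    user_a = {i: i for i in items}       # ranks items 0, 1, ..., 5
    user_b = {i: 5 - i for i in items}   # exactly the reverse order
    fp_a = fingerprint(user_a, items, 16)
    fp_b = fingerprint(user_b, items, 16)
    print(estimate_kendall_tau(fp_a, fp_a))  # identical rankings
    print(estimate_kendall_tau(fp_a, fp_b))  # reversed rankings
```

Because both users apply the same seeds, they sample the same item pairs, so two fingerprints can be compared bit by bit without access to the original rankings. A larger k would be expected to tighten the estimate at the cost of a longer fingerprint, which is the trade-off the abstract refers to.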