Sketching Algorithms for Approximating Rank Correlations in Collaborative Filtering Systems

Authors:
Yoram Bachrach;Ralf Herbrich;Ely Porat
Affiliations:
Microsoft Research Ltd., Cambridge, UK;Microsoft Research Ltd., Cambridge, UK;Department of Computer Science, Bar-Ilan University, Ramat-Gan, Israel
Venue:
SPIRE '09 Proceedings of the 16th International Symposium on String Processing and Information Retrieval
Year:
2009

Citing 8
Cited 5

GroupLens: an open architecture for collaborative filtering of netnews

CSCW '94 Proceedings of the 1994 ACM conference on Computer supported cooperative work
Social information filtering: algorithms for automating “word of mouth”

CHI '95 Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
A small approximately min-wise independent family of hash functions

Journal of Algorithms
An Approximate L1-Difference Algorithm for Massive Data Streams

SIAM Journal on Computing
Similarity Search in High Dimensions via Hashing

VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Database-friendly random projections: Johnson-Lindenstrauss with binary coins

Journal of Computer and System Sciences - Special issue on PODS 2001
Semantic hashing

International Journal of Approximate Reasoning
Sketching techniques for collaborative filtering

IJCAI'09 Proceedings of the 21st international joint conference on Artificial intelligence

Fingerprinting ratings for collaborative filtering: theoretical and empirical analysis

SPIRE'10 Proceedings of the 17th international conference on String processing and information retrieval
Exponential time improvement for min-wise based algorithms

Information and Computation
Exponential time improvement for min-wise based algorithms

Proceedings of the twenty-second annual ACM-SIAM symposium on Discrete Algorithms
Bottom-k and priority sampling, set similarity and subset sums with minimal independence

Proceedings of the forty-fifth annual ACM symposium on Theory of computing
Sketching for big data recommender systems using fast pseudo-random fingerprints

ICALP'13 Proceedings of the 40th international conference on Automata, Languages, and Programming - Volume Part II

Abstract

Collaborative filtering (CF) shares information between users to provide each with recommendations. Previous work suggests using sketching techniques to handle massive data sets in CF systems, but only allows testing whether users have a high proportion of items they have both ranked. We show how to determine the correlation between the rankings of two users, using concise "sketches" of the rankings. The sketches allow approximating Kendall's Tau, a known rank correlation, with high accuracy ε and high confidence 1 − δ. The required sketch size is logarithmic in the confidence and polynomial in the accuracy.
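
To make the quantities in the abstract concrete, the following is a minimal illustrative sketch, not the paper's sketch construction. It computes Kendall's Tau exactly over the items two users have both ranked, then estimates it by sampling random item pairs. The sample count comes from a standard Hoeffding bound, which is one simple way to obtain a size logarithmic in 1/δ and polynomial in 1/ε. All function names, the dictionary representation of rankings, and the no-ties assumption are hypothetical choices made for this example. Unlike a true sketch, this estimator reads both full rankings.

```python
import math
import random
from itertools import combinations


def _pair_sign(rank_a, rank_b, x, y):
    """+1 if both users order items x and y the same way, otherwise -1."""
    product = (rank_a[x] - rank_a[y]) * (rank_b[x] - rank_b[y])
    return 1 if product > 0 else -1


def kendall_tau(rank_a, rank_b):
    """Exact Kendall's Tau over the items both users ranked (assumes no ties)."""
    common = sorted(set(rank_a) & set(rank_b))
    pairs = list(combinations(common, 2))
    if not pairs:
        raise ValueError("need at least two commonly ranked items")
    return sum(_pair_sign(rank_a, rank_b, x, y) for x, y in pairs) / len(pairs)


def sample_size(epsilon, delta):
    """Pairs needed so the estimate is within epsilon with probability >= 1 - delta.

    Hoeffding bound for values in [-1, 1]:
    P(|mean - tau| >= epsilon) <= 2 * exp(-n * epsilon**2 / 2).
    """
    return math.ceil(2 * math.log(2 / delta) / epsilon ** 2)


def estimate_tau(rank_a, rank_b, epsilon, delta, seed=0):
    """Estimate Kendall's Tau from uniformly sampled pairs of common items."""
    rng = random.Random(seed)
    common = sorted(set(rank_a) & set(rank_b))
    if len(common) < 2:
        raise ValueError("need at least two commonly ranked items")
    n = sample_size(epsilon, delta)
    total = 0
    for _ in range(n):
        x, y = rng.sample(common, 2)  # two distinct items
        total += _pair_sign(rank_a, rank_b, x, y)
    return total / n


if __name__ == "__main__":
    # Hypothetical rankings: item -> position (1 = most preferred).
    alice = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5}
    bob = {"a": 2, "b": 1, "c": 3, "d": 5, "e": 4}
    print(kendall_tau(alice, bob))
    print(estimate_tau(alice, bob, epsilon=0.1, delta=0.05))
```

Halving ε quadruples the number of sampled pairs, while shrinking δ only adds a logarithmic term, which mirrors the scaling stated in the abstract.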