Recommender systems attempt to highlight items that a target user is likely to find interesting. A common technique is collaborative filtering (CF), in which multiple users share information so that each receives effective recommendations. A key aspect of CF systems is finding users whose tastes accurately reflect those of a given target user. Typically, the system looks for other users who have had experience with many of the items the target user has examined, and whose classification of these items correlates strongly with the target user's. Because the universe of items may be enormous and the data sets huge, locating such users quickly requires sophisticated methods. We present a method for quickly determining the proportional intersection between the sets of items that two users have examined, by sending and maintaining extremely concise "sketches" of each user's list of items. These sketches allow the proportional intersection to be approximated to within ε, with probability at least 1 - δ. Our sketching techniques are based on random min-wise independent hash functions and use very little space and time, so they are well suited to large-scale collaborative filtering systems.
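The general idea of a min-wise sketch can be illustrated with a minimal Python example. This is a sketch under stated assumptions, not the construction from the paper: it takes "proportional intersection" to mean |A ∩ B| / |A ∪ B|, it substitutes salted BLAKE2b hashes for a true min-wise independent family, and it sizes the sketch with a standard Hoeffding bound rather than the paper's own analysis. The names `item_hash`, `sketch_size`, `make_sketch`, and `estimate_overlap` are hypothetical.

```python
import hashlib
import math


def item_hash(item: str, seed: int) -> int:
    """Hash an item under a seed. Assumption: salted BLAKE2b stands in
    for one function drawn from a min-wise independent family."""
    digest = hashlib.blake2b(
        item.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def sketch_size(eps: float, delta: float) -> int:
    """Number of hash functions so that the estimate is within eps with
    probability at least 1 - delta, by Hoeffding's inequality
    (an assumption here, not a bound taken from the paper)."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * eps * eps))


def make_sketch(items, k: int) -> list:
    """Keep, for each of k hash functions, the minimum hash over the items."""
    items = list(items)
    if not items:
        raise ValueError("cannot sketch an empty item list")
    return [min(item_hash(x, seed) for x in items) for seed in range(k)]


def estimate_overlap(sketch_a, sketch_b) -> float:
    """Fraction of positions where the two minima agree; this estimates
    |A intersect B| / |A union B|."""
    if len(sketch_a) != len(sketch_b):
        raise ValueError("sketches must use the same number of hash functions")
    matches = sum(a == b for a, b in zip(sketch_a, sketch_b))
    return matches / len(sketch_a)


if __name__ == "__main__":
    k = sketch_size(eps=0.1, delta=0.05)
    alice = [f"item-{i}" for i in range(0, 150)]
    bob = [f"item-{i}" for i in range(50, 200)]
    # True overlap is 100 shared items out of 200 distinct items.
    print(k, estimate_overlap(make_sketch(alice, k), make_sketch(bob, k)))
```

Each user only needs to send the k minima, so the sketch size depends on ε and δ and not on how many items the user has examined.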