How can web services that depend on user-generated content discern fraudulent input by spammers from legitimate input? In this paper we focus on the social network Facebook and the problem of discerning ill-gotten Page Likes, made by spammers hoping to turn a profit, from legitimate Page Likes. Our method, which we refer to as CopyCatch, detects lockstep Page Like patterns on Facebook by analyzing only the social graph between users and Pages and the times at which the edges in the graph (the Likes) were created. We offer the following contributions: (1) We give a novel problem formulation, with a simple, concrete definition of suspicious behavior in terms of graph structure and edge constraints. (2) We offer two algorithms to find such suspicious lockstep behavior: a provably convergent iterative algorithm and an approximate, scalable MapReduce implementation. (3) We show that our method severely limits "greedy attacks" and analyze the bounds that follow from applying the Zarankiewicz problem to our setting. Finally, we demonstrate and discuss the effectiveness of CopyCatch at Facebook and on synthetic data, as well as potential extensions to anomaly detection problems in other domains. CopyCatch is actively in use at Facebook, searching for attacks on Facebook's social graph of over a billion users, many millions of Pages, and billions of Page Likes.
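To make the idea of "lockstep" behavior concrete, the following is a minimal sketch of a check for one candidate group. It is not the CopyCatch algorithm, and the abstract does not give the paper's formal definition. The sketch assumes a simplified reading: a set of users acts in lockstep on a set of Pages if every user Liked every Page and, for each Page, all of those Likes fall within a fixed time window. The function name `is_lockstep`, the `(user, page, timestamp)` tuple layout, and the `window` parameter are all hypothetical, chosen for illustration.

```python
from typing import Dict, Iterable, Sequence, Tuple

# Hypothetical record layout: (user id, page id, time of the Like).
Like = Tuple[str, str, float]


def is_lockstep(
    likes: Iterable[Like],
    users: Sequence[str],
    pages: Sequence[str],
    window: float,
) -> bool:
    """Check a simplified lockstep condition for one candidate group.

    Returns True only if every user in `users` Liked every page in
    `pages`, and for each page the earliest and latest of those Likes
    are at most `window` time units apart. This is an assumed,
    simplified definition, not the paper's exact formulation.
    """
    # Index Like times by (user, page); a later duplicate overwrites an earlier one.
    when: Dict[Tuple[str, str], float] = {(u, p): t for u, p, t in likes}

    for page in pages:
        times = []
        for user in users:
            if (user, page) not in when:
                return False  # a missing edge breaks the user-Page biclique
            times.append(when[(user, page)])
        if times and max(times) - min(times) > window:
            return False  # Likes on this page are too spread out in time
    return True


sample_likes = [
    ("u1", "A", 10.0),
    ("u2", "A", 12.0),
    ("u1", "B", 100.0),
    ("u2", "B", 103.0),
    ("u3", "A", 500.0),
]

# u1 and u2 Liked both Pages within 5 time units of each other.
print(is_lockstep(sample_likes, ["u1", "u2"], ["A", "B"], window=5.0))
# u3 Liked Page A much later than u1 and u2.
print(is_lockstep(sample_likes, ["u1", "u2", "u3"], ["A"], window=5.0))
```

A check like this only verifies a given group. Finding such groups in a graph with billions of edges is the hard part, which is what the iterative and MapReduce algorithms described in the abstract address.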