Click fraud is jeopardizing the Internet advertising industry. Internet advertising is crucial to the health of the entire Internet, since it allows producers to advertise their products and hence contributes to the well-being of e-commerce. Moreover, advertising supports the intellectual value of the Internet by covering the running expenses of publishing content. Some content publishers are dishonest and use automation to generate traffic that defrauds advertisers. Similarly, some advertisers automate clicks on their competitors' advertisements to deplete those competitors' advertising budgets. This paper describes the advertising network model and focuses on the most sophisticated type of fraud, which involves coalitions among fraudsters. We build on several published theoretical results to devise the Similarity-Seeker algorithm, which discovers coalitions formed by pairs of fraudsters. We then generalize the solution to coalitions of arbitrary size. Before deploying our system on a real network, we conducted comprehensive experiments on data samples as a proof of concept. The results were very accurate. We detected several coalitions, formed using various techniques and spanning numerous sites. This demonstrates the generality of our model and approach.