Updating mean and variance estimates: an improved method
Communications of the ACM
A technique for counting natted hosts
Proceedings of the 2nd ACM SIGCOMM Workshop on Internet measurment
Remote Physical Device Fingerprinting
SP '05 Proceedings of the 2005 IEEE Symposium on Security and Privacy
Hot or not: revealing hidden services by their clock skew
Proceedings of the 13th ACM conference on Computer and communications security
Interpreting the data: Parallel analysis with Sawzall
Scientific Programming - Dynamic Grids and Worldwide Computing
Detectives: detecting coalition hit inflation attacks in advertising networks streams
Proceedings of the 16th international conference on World Wide Web
Geographic locality of IP prefixes
IMC '05 Proceedings of the 5th ACM SIGCOMM conference on Internet Measurement
Proceedings of the 2007 conference on Applications, technologies, architectures, and protocols for computer communications
Filtering spam with behavioral blacklisting
Proceedings of the 14th ACM conference on Computer and communications security
Spamscatter: characterizing internet scam hosting infrastructure
SS'07 Proceedings of 16th USENIX Security Symposium on USENIX Security Symposium
Characterizing botnets from email spam records
LEET'08 Proceedings of the 1st Usenix Workshop on Large-Scale Exploits and Emergent Threats
SLEUTH: Single-pubLisher attack dEtection Using correlaTion Hunting
Proceedings of the VLDB Endowment
Peering through the shroud: the effect of edge opacity on ip-based client identification
NSDI'07 Proceedings of the 4th USENIX conference on Networked systems design & implementation
V-SMART-join: a scalable mapreduce framework for all-pair similarity joins of multisets and vectors
Proceedings of the VLDB Endowment
Populated IP addresses: classification and applications
Proceedings of the 2012 ACM conference on Computer and communications security
Trafficking fraudulent accounts: the role of the underground market in Twitter spam and abuse
SEC'13 Proceedings of the 22nd USENIX conference on Security
Hyperlocal: inferring location of IP addresses in real-time bid requests for mobile ads
Proceedings of the 6th ACM SIGSPATIAL International Workshop on Location-Based Social Networks
Hi-index | 0.00 |
This paper addresses estimating the number of the users of a specific application behind IP addresses (IPs). This problem is central to combating abusive traffic, such as DDoS attacks, ad click fraud and email spam. We share our experience building a general framework at Google for estimating the number of users behind IPs, called hereinafter the sizes of the IPs. The primary goal of this framework is combating abusive traffic without violating the user privacy. The estimation techniques produce statistically sound estimates of sizes relying solely on passively mining aggregated application log data, without probing machines or deploying active content like Java applets. This paper also explores using the estimated sizes to detect and filter abusive traffic. The proposed framework was used to build and deploy an ad click fraud filter at Google. The first 50M clicks tagged by the filter had a significant recall of all tagged clicks, and their false positive rate was below 1.4%. For the sake of comparison, we simulated a naive IP-based filter that does not consider the sizes of the IPs. To reach a comparable recall, the naive filter's false positive rate was 37% due to aggressive tagging.