2011 EstimatingtheNumberofUsersBehin
- (Metwally & Paduano, 2011) ⇒ Ahmed Metwally, and Matt Paduano. (2011). “Estimating the Number of Users Behind IP Addresses for Combating Abusive Traffic.” In: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2011). ISBN:978-1-4503-0813-7 doi:10.1145/2020408.2020452
Subject Headings:
Notes
Cited By
- http://scholar.google.com/scholar?q=%222011%22+Estimating+the+Number+of+Users+Behind+Ip+Addresses+for+Combating+Abusive+Traffic
- http://dl.acm.org/citation.cfm?id=2020408.2020452&preflayout=flat#citedby
Quotes
Author Keywords
- Abusive traffic filtering; advertisement click fraud; algorithms; data mining; experimentation; ip size estimation; management; real data experiments; security; security and protection
Abstract
This paper addresses estimating the number of the users of a specific application behind IP addresses (IPs). This problem is central to combating abusive traffic, such as DDoS attacks, ad click fraud and email spam. We share our experience building a general framework at Google for estimating the number of users behind IPs, called hereinafter the sizes of the IPs. The primary goal of this framework is combating abusive traffic without violating the user privacy. The estimation techniques produce statistically sound estimates of sizes relying solely on passively mining aggregated application log data, without probing machines or deploying active content like Java applets. This paper also explores using the estimated sizes to detect and filter abusive traffic. The proposed framework was used to build and deploy an ad click fraud filter at Google. The first 50M clicks tagged by the filter had a significant recall of all tagged clicks, and their false positive rate was below 1.4%. For the sake of comparison, we simulated a naive IP-based filter that does not consider the sizes of the IPs. To reach a comparable recall, the naive filter's false positive rate was 37% due to aggressive tagging.
References
| | Author | title | doi | year |
|---|---|---|---|---|
| 2011 EstimatingtheNumberofUsersBehin | Ahmed Metwally, Matt Paduano | Estimating the Number of Users Behind IP Addresses for Combating Abusive Traffic | 10.1145/2020408.2020452 | 2011 |