Securing web service by automatic robot detection

Authors:
KyoungSoo Park;Vivek S. Pai;Kang-Won Lee;Seraphin Calo
Affiliations:
Princeton University;Princeton University;IBM T.J. Watson Research Center;IBM T.J. Watson Research Center
Venue:
ATEC '06 Proceedings of the annual conference on USENIX '06 Annual Technical Conference
Year:
2006

Citing 0
Cited 19

DDoS defense by offense
Proceedings of the 2006 conference on Applications, technologies, architectures, and protocols for computer communications

Experience-driven experimental systems research
Communications of the ACM

Web robot detection in the scholarly information environment
Journal of Information Science

HoneySpam 2.0: Profiling Web Spambot Behaviour
PRIMA '09 Proceedings of the 12th International Conference on Principles of Practice in Multi-Agent Systems

DDoS defense by offense
ACM Transactions on Computer Systems (TOCS)

Modeling human behavior for defense against flash-crowd attacks
ICC'09 Proceedings of the 2009 IEEE international conference on Communications

Suppressing bot traffic with accurate human attestation
Proceedings of the first ACM asia-pacific workshop on Workshop on systems

HengHa: data harvesting detection on hidden databases
Proceedings of the 2010 ACM workshop on Cloud computing security workshop

Web robot detection techniques: overview and limitations
Data Mining and Knowledge Discovery

Adversarial Web Search
Foundations and Trends in Information Retrieval

Privad: practical privacy in online advertising
Proceedings of the 8th USENIX conference on Networked systems design and implementation

Towards understanding modern web traffic
Proceedings of the 2011 ACM SIGCOMM conference on Internet measurement conference

Behaviour-Based web spambot detection by utilising action time and action frequency
ICCSA'10 Proceedings of the 2010 international conference on Computational Science and Its Applications - Volume Part II

Feature evaluation for web crawler detection with data mining techniques
Expert Systems with Applications: An International Journal

Analysis of web logs: challenges and findings
PERFORM'10 Proceedings of the 2010 IFIP WG 6.3/7.3 international conference on Performance Evaluation of Computer and Communication Systems: milestones and future challenges

How much money do spammers make from your website?
Proceedings of the CUBE International Information Technology Conference

Specification and validation of enterprise information security policies
Proceedings of the CUBE International Information Technology Conference

Detection of malicious and non-malicious website visitors using unsupervised neural network learning
Applied Soft Computing

Blog or block: Detecting blog bots through behavioral biometrics
Computer Networks: The International Journal of Computer and Telecommunications Networking

Quantified Score

Hi-index	0.02

Abstract

Web sites are routinely visited by automated agents known as Web robots, which perform acts ranging from the beneficial, such as indexing for search engines, to the malicious, such as searching for vulnerabilities, attempting to crack passwords, or spamming bulletin boards. Previous work to identify malicious robots has relied on ad-hoc signature matching and has been performed on a per-site basis. As Web robots evolve and diversify, these techniques have not scaled. We approach the problem as a special form of the Turing test and defend the system by inferring whether the traffic source is human or robot. By extracting the implicit patterns of human Web browsing, we develop simple yet effective algorithms to detect human users. Our experiments with the CoDeeN content distribution network show that 95% of human users are detected within the first 57 requests, and 80% can be identified in only 20 requests, with a maximum false positive rate of 2.4%. In the time that this system has been deployed on CoDeeN, robot-related abuse complaints have dropped by a factor of 10.