The retrieval and analysis of malicious content is an essential task for security researchers. At the same time, distributors of malicious files deploy countermeasures to evade researchers' scrutiny. This paper investigates two techniques used by malware download centers: frequently updating the malicious payload, and blacklisting (i.e., refusing HTTP requests from suspected researchers based on their IP addresses). To this end, we sent HTTP requests to malware download centers over a period of four months. The requests were distributed across two pools of IPs, one exhibiting high-volume research behaviour and the other exhibiting semi-random, low-volume behaviour. We identify several distinct update patterns: sites that never update the binary; sites that serve a new binary to each new client but then repeatedly serve that same binary to the same client; sites that update the binary periodically, with periods ranging from one hour to 84 days; and server-side polymorphic sites that deliver a new binary for every HTTP request. From this classification we derive several guidelines for crawlers that re-query malware download centers looking for binary updates. We propose a scheduling algorithm that incorporates these guidelines and perform a limited evaluation of it using the data we collected. Finally, we analyze our data for evidence of blacklisting: we find strong evidence that a small minority of URLs blacklisted our high-volume IPs, but for the majority of malicious URLs studied there was no observable blacklisting response, despite our issuing over 1.5 million requests to 5,001 different malware download centers.
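The abstract's idea of a re-query scheduler that adapts to a site's observed update pattern can be sketched in a simple form. The following is a minimal illustration, not the paper's actual algorithm: it hashes each fetched payload and backs off exponentially while the binary is unchanged, resetting to aggressive polling when an update is observed. The class name, interval bounds (taken from the one-hour and 84-day periods mentioned above), and the backoff factor are all assumptions for illustration.

```python
import hashlib


class RequerySchedule:
    """Hypothetical adaptive re-query scheduler (illustrative only, not the
    paper's algorithm). Doubles the polling interval while the served binary
    is unchanged, and shrinks it when an update is seen, so requests to
    never-updating sites are bounded while frequently-updating sites are
    polled often.
    """

    MIN_INTERVAL = 3600          # 1 hour: shortest update period cited above
    MAX_INTERVAL = 84 * 86400    # 84 days: longest update period cited above

    def __init__(self) -> None:
        self.interval = self.MIN_INTERVAL
        self.last_digest = None

    def observe(self, payload: bytes) -> int:
        """Record a fetched payload; return the next polling interval (seconds)."""
        digest = hashlib.sha256(payload).hexdigest()
        if digest == self.last_digest:
            # No update observed: back off exponentially, capped at the maximum.
            self.interval = min(self.interval * 2, self.MAX_INTERVAL)
        else:
            # New binary served: resume aggressive polling.
            self.interval = self.MIN_INTERVAL
        self.last_digest = digest
        return self.interval
```

A server-side polymorphic site (a new binary per request) would keep such a scheduler pinned at the minimum interval, so a real crawler would additionally need to detect that pattern and cap its request rate separately.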