Longtime behavior of harvesting spam bots

  • Authors:
  • Oliver Hohlfeld; Thomas Graf; Florin Ciucu

  • Affiliations:
  • TU Berlin / Telekom Innovation Laboratories, Berlin, Germany; Modas GmbH, Berlin, Germany; TU Berlin / Telekom Innovation Laboratories, Berlin, Germany

  • Venue:
  • Proceedings of the 2012 ACM conference on Internet measurement conference
  • Year:
  • 2012

Abstract

This paper investigates the origins of the spamming process, specifically address harvesting on the web, by relying on an extensive measurement data set spanning over three years. Concretely, we embedded more than 23 million unique spamtrap addresses in web pages. Of the embedded trap addresses, 0.5% received a total of 620,000 spam messages. Besides the scale of the experiment, the critical aspect of our methodology is the uniqueness of the issued spamtrap addresses, which enables the mapping of crawling activities to the actual spamming process. Our observations suggest that simple obfuscation methods remain effective for protecting addresses from being harvested. A key finding is that search engines are used as proxies, either to hide the identity of the harvester or to optimize the harvesting process.
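The role of address uniqueness can be illustrated with a minimal sketch. It is not the authors' implementation: the class and function names, the secret key, the trap domain, and the token scheme are all assumptions made for illustration. The idea shown is only that each page request receives its own address, so a later spam message to that address can be traced back to the single crawl that could have harvested it.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical values, chosen for illustration only.
SECRET_KEY = b"example-secret"
TRAP_DOMAIN = "trap.example.org"


@dataclass(frozen=True)
class PageVisit:
    """Metadata logged when a page containing a trap address is served."""
    client_ip: str
    user_agent: str
    timestamp: int


class SpamtrapRegistry:
    """Issues one unique trap address per page request and remembers the visit."""

    def __init__(self) -> None:
        self._issued: Dict[str, PageVisit] = {}
        self._counter = 0

    def issue_address(self, visit: PageVisit) -> str:
        # The counter guarantees a distinct payload even for identical visits.
        self._counter += 1
        payload = f"{self._counter}|{visit.client_ip}|{visit.timestamp}".encode()
        token = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()[:16]
        address = f"{token}@{TRAP_DOMAIN}"
        self._issued[address] = visit
        return address

    def lookup_harvester(self, recipient: str) -> Optional[PageVisit]:
        # Map the recipient of a spam message back to the visit that saw it.
        return self._issued.get(recipient.lower())
```

In this sketch, a spam message arriving at an issued address identifies the request (IP address, user agent, time) during which that address was exposed. A visit logged with a search engine crawler's user agent would be one way to observe the proxy pattern the abstract mentions.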