Autonomous learning for detection of JavaScript attacks: vision or reality?

  • Authors:
  • Guido Schwenk; Alexander Bikadorov; Tammo Krueger; Konrad Rieck

  • Affiliations:
  • Technische Universität Berlin, Berlin, Germany; Technische Universität Berlin, Berlin, Germany; Technische Universität Berlin, Berlin, Germany; University of Göttingen, Göttingen, Germany

  • Venue:
  • Proceedings of the 5th ACM workshop on Security and artificial intelligence
  • Year:
  • 2012

Abstract

Malicious JavaScript code in webpages is a pressing problem on the Internet. Classic security tools, such as anti-virus scanners, are hardly able to keep abreast of these attacks, as their obfuscation and complexity obstruct the manual generation of signatures. Recently, several methods have been proposed that combine JavaScript analysis with machine learning to generate detection models automatically. However, it remains an open question whether these methods can really operate autonomously and update detection models without manual intervention. In this paper, we present an empirical study of a fully automated system for collecting, analyzing and detecting malicious JavaScript code. The system is evaluated on a dataset of 3.4 million benign and 8,282 malicious webpages, which was collected in a completely automated manner over a period of five months. The results of our study are mixed: for manually verified data, excellent detection rates of up to 93% are achievable, yet with fully automated learning only 67% of the malicious code is identified. We conclude that fully automated systems are still a vision and that several challenges need to be solved first.