From NLP (natural language processing) to MLP (machine language processing)

  • Authors:
  • Peter Teufl; Udo Payer; Guenter Lackner

  • Affiliations:
  • Institute for Applied Information Processing and Communications, Graz University of Technology; CAMPUS02, Graz University of Applied Science; Studio78, Graz

  • Venue:
  • MMM-ACNS'10 Proceedings of the 5th international conference on Mathematical methods, models and architectures for computer network security
  • Year:
  • 2010

Abstract

Natural Language Processing (NLP), in combination with machine learning techniques, plays an important role in automatic text analysis. Motivated by the successful use of NLP for text classification problems in the area of e-Participation, and inspired by our prior work on polymorphic shellcode detection, we applied classical NLP processes to the special case of malicious code analysis. Any malicious program is based on some kind of machine language, ranging from manually crafted assembler code that exploits a buffer overflow to high-level languages such as JavaScript used in web-based attacks. We argue that well-known NLP analysis processes can be modified and applied to the malware analysis domain. By analogy with NLP, we call this process Machine Language Processing (MLP). In this paper, we take our e-Participation analysis architecture, extract its NLP techniques, and adapt them to the malware analysis process. As a proof of concept, we apply the adapted framework to malicious code examples from Metasploit.