From NLP (natural language processing) to MLP (machine language processing)

Authors:
Peter Teufl; Udo Payer; Guenter Lackner
Affiliations:
Institute for Applied Information Processing and Communications, Graz University of Technology; CAMPUS02, Graz University of Applied Science; Studio78, Graz
Venue:
MMM-ACNS'10 Proceedings of the 5th international conference on Mathematical methods, models and architectures for computer network security
Year:
2010

Abstract

Natural Language Processing (NLP), in combination with machine learning techniques, plays an important role in automatic text analysis. Motivated by the successful use of NLP for text classification problems in the area of e-Participation, and inspired by our prior work on polymorphic shellcode detection, we applied classical NLP processes to the special case of malicious code analysis. Any malicious program is based on some kind of machine language, ranging from manually crafted assembly code that exploits a buffer overflow to high-level languages such as JavaScript used in web-based attacks. We argue that well-known NLP analysis processes can be modified and applied to the malware analysis domain. By analogy with NLP, we call this process Machine Language Processing (MLP). In this paper, we take our e-Participation analysis architecture, extract the various NLP techniques it uses, and adapt them to the malware analysis process. As a proof of concept, we apply the adapted framework to malicious code examples from Metasploit.