Filter-resistant code injection on ARM

  • Authors:
  • Yves Younan;Pieter Philippaerts;Frank Piessens;Wouter Joosen;Sven Lachmund;Thomas Walter

  • Affiliations:
  • Katholieke Universiteit Leuven, Leuven, Belgium;Katholieke Universiteit Leuven, Leuven, Belgium;Katholieke Universiteit Leuven, Leuven, Belgium;Katholieke Universiteit Leuven, Leuven, Belgium;DOCOMO Euro-Labs, Munich, Germany;DOCOMO Euro-Labs, Munich, Germany

  • Venue:
  • Proceedings of the 16th ACM conference on Computer and communications security
  • Year:
  • 2009

Quantified Score

Hi-index 0.01

Visualization

Abstract

Code injections attacks are one of the most powerful and important classes of attacks on software. In such attacks, the attacker sends malicious input to a software application, where it is stored in memory. The malicious input is chosen in such a way that its representation in memory is also a valid representation of a machine code program that performs actions chosen by the attacker. The attacker then triggers a bug in the application to divert the control flow to this injected machine code. A typical action of the injected code is to launch a command interpreter shell, and hence the malicious input is often called shellcode. Attacks are usually performed against network facing applications, and such applications often perform validations or encodings on input. Hence, a typical hurdle for attackers, is that the shellcode has to pass one or more filtering methods before it is stored in the vulnerable application's memory space. Clearly, for a code injection attack to succeed, the malicious input must survive such validations and transformations. Alphanumeric input (consisting only of letters and digits) is typically very robust for this purpose: it passes most filters and is untouched by most transformations. This paper studies the power of alphanumeric shellcode on the ARM 32 bit RISC processor. It shows that the subset of ARM machine code programs that (when interpreted as data) consist only of alphanumerical characters is a Turing complete subset. This is a non-trivial result, as the number of instructions that consist only of alphanumeric characters is very limited. To craft useful exploit code (and to achieve Turing completeness), several tricks are needed, including the use of self-modifying code.