Tupni: automatic reverse engineering of input formats

Authors:
Weidong Cui; Marcus Peinado; Karl Chen; Helen J. Wang; Luis Irun-Briz
Affiliations:
Microsoft Research, Redmond, WA, USA; Microsoft Corporation, Redmond, WA, USA; University of California, Berkeley, CA, USA; Microsoft Research, Redmond, WA, USA; Microsoft Corporation, Redmond, WA, USA
Venue:
Proceedings of the 15th ACM conference on Computer and communications security
Year:
2008

Abstract

Recent work has established the importance of automatic reverse engineering of protocol or file format specifications. However, the formats reverse engineered by previous tools have missed important information that is critical for security applications. In this paper, we present Tupni, a tool that can reverse engineer an input format with a rich set of information, including record sequences, record types, and input constraints. Tupni can generalize the format specification over multiple inputs. We have implemented a prototype of Tupni and evaluated it on ten different formats: five file formats (WMF, BMP, JPG, PNG and TIF) and five network protocols (DNS, RPC, TFTP, HTTP and FTP). Tupni identified all record sequences in the test inputs. We also show that, by aggregating over multiple WMF files, Tupni can derive a more complete format specification for WMF. Furthermore, we demonstrate the utility of Tupni by using the rich information it provides for zero-day vulnerability signature generation, which was not possible with previous reverse engineering tools.
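
To make the three kinds of information named in the abstract more concrete, the following is a minimal sketch on a made-up toy format. It is not one of the ten evaluated formats, and it is not Tupni's algorithm or its output representation; the layout, the `Record` class, and the `parse_toy_format` function are all assumptions introduced here for illustration. The toy input starts with a one-byte count, followed by that many records, each holding a one-byte type tag, a one-byte length, and a payload. The repeated records stand in for a record sequence, the type tag for a record type, and the count and length checks for input constraints.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Record:
    rtype: int       # hypothetical 1-byte type tag (stands in for a "record type")
    payload: bytes   # variable-length body whose size is given by a length field


def parse_toy_format(data: bytes) -> List[Record]:
    """Parse a made-up layout: [count] then count x [type][length][payload]."""
    if not data:
        raise ValueError("empty input")
    count = data[0]          # constraint: this field fixes the number of records
    offset = 1
    records: List[Record] = []
    for _ in range(count):   # the loop body corresponds to one record in the sequence
        if offset + 2 > len(data):
            raise ValueError("truncated record header")
        rtype, length = data[offset], data[offset + 1]
        offset += 2
        # constraint: the length field must not point past the end of the input
        if offset + length > len(data):
            raise ValueError("length field exceeds remaining input")
        records.append(Record(rtype, data[offset:offset + length]))
        offset += length
    if offset != len(data):
        raise ValueError("trailing bytes after last record")
    return records


# Two records: type 1 with payload b"abc", then type 2 with an empty payload.
sample = bytes([2, 1, 3]) + b"abc" + bytes([2, 0])
parsed = parse_toy_format(sample)
```

A reverse-engineering tool works in the opposite direction from this parser: it starts from an existing parser's behavior on inputs and tries to recover a description like the one in the docstring. The sketch only shows what such a description has to capture.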