Inferring protocol state machine from network traces: a probabilistic approach

Authors:
Yipeng Wang;Zhibin Zhang;Danfeng Daphne Yao;Buyun Qu;Li Guo
Affiliations:
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China and Graduate University, Chinese Academy of Sciences, Beijing, China;Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China;Department of Computer Science, Virginia Tech, Blacksburg, VA;Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China and Graduate University, Chinese Academy of Sciences, Beijing, China;Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Venue:
ACNS'11 Proceedings of the 9th international conference on Applied cryptography and network security
Year:
2011

Abstract

Application-level protocol specifications (i.e., how a protocol should behave) are helpful for network security management, including intrusion detection and intrusion prevention. Knowledge of protocol specifications also provides an effective way of detecting malicious code. However, current methods for obtaining unknown protocol specifications rely heavily on manual operations such as reverse engineering, which is a major instrument for extracting application-level specifications but is time-consuming and laborious. Several works have focused on automatically extracting protocol messages from real-world traces, but they leave the inference of the protocol state machine unsolved. In this paper, we propose Veritas, a system that can automatically infer the protocol state machine from real-world network traces. The main feature of Veritas is that it requires no prior knowledge of protocol specifications; our technique is based on statistical analysis of the protocol formats. We also formally define a new model, the probabilistic protocol state machine (P-PSM), which is a probabilistic generalization of the protocol state machine. In our experiments, we evaluate one text-based protocol and two binary-based protocols to test the performance of Veritas. Our results show that the protocol state machines that Veritas infers can accurately represent 92% of the protocol flows on average. Our system is general and suitable for both text-based and binary-based protocols. Veritas can also be employed as an auxiliary tool for analyzing unknown behaviors in real-world applications.
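
To make the idea of a probabilistic state machine concrete, the sketch below estimates transition probabilities from flows that have already been reduced to sequences of message-type labels, then measures the fraction of flows the resulting machine can represent. It is a minimal illustration under stated assumptions, not the Veritas algorithm: the function names (estimate_ppsm, flow_probability, coverage), the first-order (one state per message type) structure, and the SMTP-like labels are all hypothetical, and the abstract does not specify how Veritas derives message types or states.

```python
from collections import Counter, defaultdict

# Hypothetical sentinel states marking the beginning and end of a flow.
START, END = "<start>", "<end>"


def estimate_ppsm(flows):
    """Estimate transition probabilities by relative frequency.

    Assumption: each flow is a list of message-type labels, and each
    label is treated as one state (a first-order simplification).
    """
    counts = defaultdict(Counter)
    for flow in flows:
        path = [START, *flow, END]
        for src, dst in zip(path, path[1:]):
            counts[src][dst] += 1
    return {
        src: {dst: n / sum(dsts.values()) for dst, n in dsts.items()}
        for src, dsts in counts.items()
    }


def flow_probability(ppsm, flow):
    """Probability the machine assigns to a flow (0.0 if any step is unseen)."""
    path = [START, *flow, END]
    prob = 1.0
    for src, dst in zip(path, path[1:]):
        prob *= ppsm.get(src, {}).get(dst, 0.0)
    return prob


def coverage(ppsm, flows):
    """Fraction of flows that the machine can represent at all."""
    accepted = sum(1 for flow in flows if flow_probability(ppsm, flow) > 0.0)
    return accepted / len(flows)


# Made-up example flows with SMTP-like labels, for illustration only.
training = [
    ["HELO", "MAIL", "RCPT", "DATA", "QUIT"],
    ["HELO", "MAIL", "RCPT", "RCPT", "DATA", "QUIT"],
    ["EHLO", "MAIL", "RCPT", "DATA", "QUIT"],
    ["HELO", "QUIT"],
]
model = estimate_ppsm(training)
```

Calling coverage(model, held_out_flows) on flows not used for estimation gives a rough analogue of the "fraction of flows represented" measure mentioned above, although the paper's own evaluation procedure may differ.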