A revolution: belief propagation in graphs with cycles
NIPS '97 Proceedings of the 1997 conference on Advances in neural information processing systems 10
SETI@HOME—massively distributed computing for SETI
Computing in Science and Engineering
Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data
ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Extracting safe and precise control flow from binaries
RTCSA '00 Proceedings of the Seventh International Conference on Real-Time Systems and Applications
Obfuscation of executable code to improve resistance to static disassembly
Proceedings of the 10th ACM conference on Computer and communications security
Distributed computing in practice: the Condor experience: Research Articles
Concurrency and Computation: Practice & Experience - Grid Performance
Static disassembly of obfuscated binaries
SSYM'04 Proceedings of the 13th conference on USENIX Security Symposium - Volume 13
Trust region Newton methods for large-scale logistic regression
Proceedings of the 24th international conference on Machine learning
Binary analysis for measurement and attribution of program performance
Proceedings of the 2009 ACM SIGPLAN conference on Programming language design and implementation
Extracting compiler provenance from program binaries
Proceedings of the 9th ACM SIGPLAN-SIGSOFT workshop on Program analysis for software tools and engineering
Hybrid analysis and control of malware
RAID'10 Proceedings of the 13th international conference on Recent advances in intrusion detection
Recovering the toolchain provenance of binary code
Proceedings of the 2011 International Symposium on Software Testing and Analysis
Labeling library functions in stripped binaries
Proceedings of the 10th ACM SIGPLAN-SIGSOFT workshop on Program analysis for software tools
Compiler help for binary manipulation tools
Euro-Par'12 Proceedings of the 18th international conference on Parallel processing workshops
Binary-code obfuscations in prevalent packer tools
ACM Computing Surveys (CSUR)
Hi-index | 0.00 |
We present a novel application of structured classification: identifying function entry points (FEPs, the starting byte of each function) in program binaries. Such identification is the crucial first step in analyzing many malicious, commercial and legacy software, which lack full symbol information that specifies FEPs. Existing pattern-matching FEP detection techniques are insufficient due to variable instruction sequences introduced by compiler and link-time optimizations. We formulate the FEP identification problem as structured classification using Conditional Random Fields. Our Conditional Random Fields incorporate both idiom features to represent the sequence of instructions surrounding FEPs, and control flow structure features to represent the interaction among FEPs. These features allow us to jointly label all FEPs in the binary. We perform feature selection and present an approximate inference method for massive program binaries. We evaluate our models on a large set of real-world test binaries, showing that our models dramatically outperform two existing, standard disassemblers.