Space/time trade-offs in hash coding with allowable errors
Communications of the ACM
CCFinder: a multilinguistic token-based code clone detection system for large scale source code
IEEE Transactions on Software Engineering
Disassembly of Executable Code Revisited
WCRE '02 Proceedings of the Ninth Working Conference on Reverse Engineering (WCRE'02)
K-gram based software birthmarks
Proceedings of the 2005 ACM symposium on Applied computing
An API for Runtime Code Patching
International Journal of High Performance Computing Applications
GPLAG: detection of software plagiarism by program dependence graph analysis
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
MUDABlue: an automatic categorization system for open source repositories
Journal of Systems and Software - Special issue: Selected papers from the 11th Asia Pacific software engineering conference (APSEC 2004)
DECKARD: Scalable and Accurate Tree-Based Detection of Code Clones
ICSE '07 Proceedings of the 29th international conference on Software Engineering
CP-Miner: a tool for finding copy-paste and related bugs in operating system code
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
Large-Scale Code Reuse in Open Source Software
FLOSS '07 Proceedings of the First International Workshop on Emerging Trends in FLOSS Research and Development
Combining symbolic execution with model checking to verify parallel numerical programs
ACM Transactions on Software Engineering and Methodology (TOSEM)
Opcodes as predictor for malware
International Journal of Electronic Security and Digital Forensics
Detecting code clones in binary executables
Proceedings of the eighteenth international symposium on Software testing and analysis
Finding software license violations through binary code clone detection
Proceedings of the 8th Working Conference on Mining Software Repositories
Polymorphic worm detection using structural information of executables
RAID'05 Proceedings of the 8th international conference on Recent Advances in Intrusion Detection
Detecting similar software applications
Proceedings of the 34th International Conference on Software Engineering
XIAO: tuning code clones at hands of engineers in practice
Proceedings of the 28th Annual Computer Security Applications Conference
Towards automatic software lineage inference
SEC'13 Proceedings of the 22nd USENIX conference on Security
Hi-index | 0.00 |
The problem of matching between binaries is important for software copyright enforcement as well as for identifying disclosed vulnerabilities in software. We present a search engine prototype called Rendezvous which enables indexing and searching for code in binary form. Rendezvous identifies binary code using a statistical model comprising instruction mnemonics, control flow sub-graphs and data constants which are simple to extract from a disassembly, yet normalising with respect to different compilers and optimisations. Experiments show that Rendezvous achieves F2 measures of 86.7% and 83.0% on the GNU C library compiled with different compiler optimisations and the GNU coreutils suite compiled with gcc and clang respectively. These two code bases together comprise more than one million lines of code. Rendezvous will bring significant changes to the way patch management and copyright enforcement is currently performed.