Software released in binary form frequently uses third-party packages without respecting their licensing terms. For instance, many consumer devices have firmware containing the Linux kernel, without the suppliers following the requirements of the GNU General Public License. Such license violations are often accidental, e.g., when vendors receive binary code from their suppliers with no indication of its provenance. To help find such violations, we have developed the Binary Analysis Tool (BAT), a system for code clone detection in binaries. Given a binary, such as a firmware image, it attempts to detect cloning of code from repositories of packages in source and binary form. We evaluate and compare the effectiveness of three of BAT's clone detection techniques: scanning for string literals, detecting similarity through data compression, and detecting similarity by computing binary deltas.
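
Two of the three techniques named above, string-literal scanning and compression-based similarity, can be illustrated with a minimal sketch. This is not BAT's implementation. The function names (`extract_strings`, `string_overlap`, `ncd`), the minimum string length of 5, and the use of zlib with the normalized compression distance are all assumptions chosen for illustration.

```python
import re
import zlib


def extract_strings(data, min_len=5):
    """Return the set of printable-ASCII runs of at least min_len bytes.

    Hypothetical helper: approximates what a `strings`-like scan would
    find in a binary. The min_len default is an assumption.
    """
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return set(re.findall(pattern, data))


def string_overlap(binary, package_strings, min_len=5):
    """Fraction of a package's known string literals found in the binary."""
    if not package_strings:
        return 0.0
    found = extract_strings(binary, min_len)
    return len(found & package_strings) / len(package_strings)


def ncd(a, b):
    """Normalized compression distance using zlib (an assumed compressor).

    Values near 0 suggest the inputs share most of their content;
    values near 1 suggest they are unrelated.
    """
    ca = len(zlib.compress(a, 9))
    cb = len(zlib.compress(b, 9))
    cab = len(zlib.compress(a + b, 9))
    return (cab - min(ca, cb)) / max(ca, cb)
```

In a sketch like this, a high `string_overlap` score against the literals of a known package, or a low `ncd` against one of its binaries, would only flag a candidate match for closer inspection. zlib's 32 KiB window also limits how well `ncd` works on large inputs, so a real tool would need a different compressor or a chunking strategy.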