The program dependence graph and its use in optimization
ACM Transactions on Programming Languages and Systems (TOPLAS)
LCLint: a tool for using specifications to check code
SIGSOFT '94 Proceedings of the 2nd ACM SIGSOFT symposium on Foundations of software engineering
Syntactic clustering of the Web
Selected papers from the sixth international conference on World Wide Web
Bugs as deviant behavior: a general approach to inferring errors in systems code
SOSP '01 Proceedings of the eighteenth ACM symposium on Operating systems principles
An empirical study of operating systems errors
SOSP '01 Proceedings of the eighteenth ACM symposium on Operating systems principles
A system and language for building system-specific, static analyses
PLDI '02 Proceedings of the ACM SIGPLAN 2002 Conference on Programming language design and implementation
CCFinder: a multilinguistic token-based code clone detection system for large scale source code
IEEE Transactions on Software Engineering
Similarity Search in High Dimensions via Hashing
VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
On finding duplication and near-duplication in large software systems
WCRE '95 Proceedings of the Second Working Conference on Reverse Engineering
Clone Detection Using Abstract Syntax Trees
ICSM '98 Proceedings of the International Conference on Software Maintenance
Winnowing: local algorithms for document fingerprinting
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
OOPSLA '04 Companion to the 19th annual ACM SIGPLAN conference on Object-oriented programming systems, languages, and applications
Managing Duplicated Code with Linked Editing
VLHCC '04 Proceedings of the 2004 IEEE Symposium on Visual Languages - Human Centric Computing
Similarity evaluation on tree-structured data
Proceedings of the 2005 ACM SIGMOD international conference on Management of data
An empirical study of code clone genealogies
Proceedings of the 10th European software engineering conference held jointly with 13th ACM SIGSOFT international symposium on Foundations of software engineering
Near-Optimal Hashing Algorithms for Approximate Nearest Neighbor in High Dimensions
FOCS '06 Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science
"Cloning Considered Harmful" Considered Harmful
WCRE '06 Proceedings of the 13th Working Conference on Reverse Engineering
DECKARD: Scalable and Accurate Tree-Based Detection of Code Clones
ICSE '07 Proceedings of the 29th international conference on Software Engineering
ICSE '07 Proceedings of the 29th international conference on Software Engineering
Tracking Code Clones in Evolving Software
ICSE '07 Proceedings of the 29th international conference on Software Engineering
Checking system rules using system-specific, programmer-written compiler extensions
OSDI'00 Proceedings of the 4th conference on Symposium on Operating System Design & Implementation - Volume 4
CP-Miner: a tool for finding copy-paste and related bugs in operating system code
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
Context-based detection of clone-related bugs
Proceedings of the the 6th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering
Scalable detection of semantic clones
Proceedings of the 30th international conference on Software engineering
An Empirical Study of Function Clones in Open Source Software
WCRE '08 Proceedings of the 2008 15th Working Conference on Reverse Engineering
ICSE '09 Proceedings of the 31st International Conference on Software Engineering
Code clone detection experience at microsoft
Proceedings of the 5th International Workshop on Software Clones
MeCC: memory comparison-based clone detector
Proceedings of the 33rd International Conference on Software Engineering
Understanding modern device drivers
ASPLOS XVII Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems
Understanding and detecting real-world performance bugs
Proceedings of the 33rd ACM SIGPLAN conference on Programming Language Design and Implementation
Characterizing logging practices in open-source software
Proceedings of the 34th International Conference on Software Engineering
CBCD: cloned buggy code detector
Proceedings of the 34th International Conference on Software Engineering
Active refinement of clone anomaly reports
Proceedings of the 34th International Conference on Software Engineering
How do software engineers understand code changes?: an exploratory study in industry
Proceedings of the ACM SIGSOFT 20th International Symposium on the Foundations of Software Engineering
XIAO: tuning code clones at hands of engineers in practice
Proceedings of the 28th Annual Computer Security Applications Conference
A source-to-source transformation tool for error fixing
CASCON '13 Proceedings of the 2013 Conference of the Center for Advanced Studies on Collaborative Research
Hi-index | 0.00 |
Software developers often duplicate source code to replicate functionality. This practice can hinder the maintenance of a software project: bugs may arise when two identical code segments are edited inconsistently. This paper presents DejaVu, a highly scalable system for detecting these general syntactic inconsistency bugs. DejaVu operates in two phases. Given a target code base, a parallel /inconsistent clone analysis/ first enumerates all groups of source code fragments that are similar but not identical. Next, an extensible /buggy change analysis/ framework refines these results, separating each group of inconsistent fragments into a fine-grained set of inconsistent changes and classifying each as benign or buggy. On a 75+ million line pre-production commercial code base, DejaVu executed in under five hours and produced a report of over 8,000 potential bugs. Our analysis of a sizable random sample suggests with high likelihood that at this report contains at least 2,000 true bugs and 1,000 code smells. These bugs draw from a diverse class of software defects and are often simple to correct: syntactic inconsistencies both indicate problems and suggest solutions.