Computer programs belong to the authors who design, write, and test them. Authorship identification is concerned with determining the likelihood that a particular author wrote a given piece of code, usually based on other code samples from the same programmer. Java is a popular object-oriented programming language. Programming fingerprints attempt to characterize the features that are unique to each programmer. In this study, we investigated the extraction of a set of software metrics from Java source code that could be used as a fingerprint to identify the author of that code; the extraction was performed by a program written in Visual C++. The contribution of the selected metrics to authorship identification was measured by a statistical process, namely canonical discriminant analysis, using the statistical software package SAS. Of the 56 extracted metrics, 48 were identified as contributing to authorship identification. The authorship of 62.6-67.2% of the Java programs considered could be correctly identified with the extracted metrics. With derived canonical variates, the identification rate could be as high as 85.8%. Moreover, layout metrics played a more important role in authorship identification than the other metrics.
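To make the idea of a layout-based fingerprint concrete, the following is a minimal sketch in Python of how a few layout metrics might be computed from a Java source string. It is not the study's extractor (which was written in Visual C++), and the four metrics and the function name `layout_metrics` are hypothetical illustrations rather than members of the 56 metrics the study extracted.

```python
def layout_metrics(java_source: str) -> dict:
    """Compute a few illustrative (hypothetical) layout metrics for Java code."""
    lines = java_source.splitlines()
    total = len(lines)
    if total == 0:
        # An empty file has no layout to measure.
        return {
            "blank_line_ratio": 0.0,
            "comment_line_ratio": 0.0,
            "mean_line_length": 0.0,
            "brace_on_own_line_ratio": 0.0,
        }

    non_blank = [ln for ln in lines if ln.strip()]
    blank_count = total - len(non_blank)

    # Rough heuristic: a line counts as a comment line if, after stripping
    # indentation, it starts with a Java comment marker.
    comment_count = sum(
        1 for ln in lines if ln.strip().startswith(("//", "/*", "*"))
    )

    # Brace placement: how often an opening brace sits alone on its line.
    open_brace_lines = [ln for ln in lines if "{" in ln]
    own_line_count = sum(1 for ln in open_brace_lines if ln.strip() == "{")

    return {
        "blank_line_ratio": blank_count / total,
        "comment_line_ratio": comment_count / total,
        "mean_line_length": (
            sum(len(ln) for ln in non_blank) / len(non_blank) if non_blank else 0.0
        ),
        "brace_on_own_line_ratio": (
            own_line_count / len(open_brace_lines) if open_brace_lines else 0.0
        ),
    }


sample = "public class A\n{\n    // greet\n\n    void f() {\n    }\n}\n"
metrics = layout_metrics(sample)
```

Each program would yield one such metric vector, and a set of vectors labeled by author could then be passed to a discriminant analysis to estimate how well the metrics separate authors.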