Writing computer programs outside of class is the most realistic experience students have in a programming course and hence can be the most accurate evaluation of their ability. However, some students hire outside parties to produce these programs. We present a data mining and machine learning approach that can provide objective evidence for detecting such instances. From programs submitted by students across two lower-level Computer Science (CS) courses, we extract basic programming style metrics. A decision tree model built on the collected measurements yields relatively good detection accuracy. In addition, an investigation into the relative importance of the basic style metrics indicated that Lines of Code, Number of Variables, and Number of Comments are important attributes. The methods are being implemented in a software analysis tool that instructors could use to detect outsourced program submissions.
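To make the three attributes named above concrete, the following is a minimal sketch of how such style metrics might be computed for one submission. It is not the authors' tool: the abstract does not say how the metrics are measured, so the function name `style_metrics`, the regular expressions, and the assumption of C-like or Java-like source text are all illustrative choices.

```python
import re

# Assumption: submissions use C/Java-style comments (// ... and /* ... */).
# Comment markers inside string literals are not handled in this sketch.
COMMENT_RE = re.compile(r"//[^\n]*|/\*.*?\*/", re.DOTALL)

# Assumption: a "variable" is a name declared with one of these common types
# and followed by '=', ';', ',' or ')'. Real parsers are far more thorough.
DECL_RE = re.compile(
    r"\b(?:int|long|float|double|char|boolean|String)\s+"
    r"([A-Za-z_]\w*)\s*(?=[=;,)])"
)


def style_metrics(source: str) -> dict:
    """Return three basic style metrics for one program (hypothetical helper)."""
    comments = COMMENT_RE.findall(source)
    code_only = COMMENT_RE.sub("", source)
    # Lines of code: non-blank lines that remain once comments are removed.
    lines_of_code = sum(1 for line in code_only.splitlines() if line.strip())
    # Number of variables: distinct declared names found by the pattern above.
    variables = set(DECL_RE.findall(code_only))
    return {
        "lines_of_code": lines_of_code,
        "num_variables": len(variables),
        "num_comments": len(comments),
    }
```

One feature vector per submission, built this way, is the kind of input a decision tree learner could be trained on; the choice of learner settings and any additional metrics would have to follow the full paper.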