Automatic recognition of students' sorting algorithm implementations in a data structures and algorithms course

Authors:
Ahmad Taherkhani;Ari Korhonen;Lauri Malmi
Affiliations:
Aalto University, FI, Aalto, Finland;Aalto University, Aalto, Finland;Aalto University, Aalto, Finland
Venue:
Proceedings of the 12th Koli Calling International Conference on Computing Education Research
Year:
2012

Abstract

Computing educators often rely on black-box analysis to assess students' work automatically and give feedback. This approach does not allow analyzing the quality of programs or checking whether they implement the required algorithm. We introduce an instrument for recognizing and classifying algorithms (Aari), based on white-box testing, to identify students' authentic sorting algorithm implementations in a data structures and algorithms course. Aari uses machine learning techniques to classify new instances. Students were asked to submit a program that sorts an array of integers in two rounds: at the beginning of the course, before sorting algorithms were introduced, and after a lecture on sorting algorithms. We evaluated the performance of Aari on the implementations from each round separately. The results show that the sorting algorithms Aari has been trained to recognize are identified with an average accuracy of about 90%. When considering all the submitted sorting algorithm implementations (including variations of the standard algorithms), Aari achieved an overall accuracy of 71% and 81% for the first and second round, respectively. In addition, we analyzed the students' implementations manually to gain a better understanding of the reasons for failure in the recognition process. This analysis revealed that students have many misconceptions about sorting algorithms, which result in problematic implementations that are less efficient than the standard algorithms. We discuss these variations along with the application of the tool in an educational context, its limitations, and some directions for future work.
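
The contrast between black-box and white-box analysis can be illustrated with a minimal Python sketch. It is not Aari's implementation: the two sample submissions, the helper names `load` and `structural_features`, and the two features (loop count and direct recursion) are assumptions chosen only for illustration. Both submissions return identical output, so an output-based test cannot tell them apart, while even crude structural features read from the source differ.

```python
import ast

# Two hypothetical student submissions, kept as source text so they can be
# both executed (black-box view) and inspected (white-box view).
BUBBLE_SRC = '''
def bubble_sort(a):
    a = list(a)
    for i in range(len(a)):
        for j in range(len(a) - 1 - i):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
    return a
'''

MERGE_SRC = '''
def merge_sort(a):
    if len(a) <= 1:
        return list(a)
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    return out + left[i:] + right[j:]
'''


def load(src):
    """Execute the source and return its first top-level function."""
    namespace = {}
    exec(src, namespace)
    return namespace[ast.parse(src).body[0].name]


def structural_features(src):
    """Illustrative white-box features: loop count and direct recursion."""
    func = ast.parse(src).body[0]
    loops = sum(isinstance(n, (ast.For, ast.While)) for n in ast.walk(func))
    recursive = any(
        isinstance(n, ast.Call)
        and isinstance(n.func, ast.Name)
        and n.func.id == func.name
        for n in ast.walk(func)
    )
    return {"loops": loops, "recursive": recursive}


if __name__ == "__main__":
    data = [5, 2, 9, 1, 5]
    # Black-box view: the outputs are indistinguishable.
    print(load(BUBBLE_SRC)(data) == load(MERGE_SRC)(data))
    # White-box view: the structure of the source differs.
    print(structural_features(BUBBLE_SRC))
    print(structural_features(MERGE_SRC))
```

A classifier trained on features of this kind is one plausible way to assign a submission to an algorithm class; which features and learning method Aari actually uses is not stated in this abstract.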