cluster is a tool that partitions a large pool of C programs into groups according to structural similarity. It computes an alignment score for each program against a mosaic built from randomly selected, fixed-size code fragments drawn from the pool. The scores are then grouped so that the distance between any two adjacent members of a group is at most a given threshold. cluster is effective at identifying tight clusters of similar programs, and it can distribute its workload over a network of workstations to achieve very fast running times. The tool is highly configurable: the user can adjust the alignment scoring scheme and the clustering threshold, and can obtain visual alignments of programs suspected to be similar.
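The grouping step described above can be illustrated with a minimal Python sketch. It is not the tool's actual implementation: the function name `cluster_scores`, the file names, and the score values are all hypothetical, and the alignment scores are assumed to have been computed already. The sketch sorts the scores and starts a new group whenever the gap to the previous score exceeds the threshold.

```python
def cluster_scores(scores, threshold):
    """Group programs so that adjacent scores within a group differ by at most threshold.

    scores: dict mapping a program name to its (precomputed) alignment score.
    Returns a list of groups, each a list of (name, score) pairs in ascending score order.
    """
    ordered = sorted(scores.items(), key=lambda item: item[1])
    groups = []
    for name, score in ordered:
        # Compare against the last score placed in the current group.
        if groups and score - groups[-1][-1][1] <= threshold:
            groups[-1].append((name, score))
        else:
            groups.append([(name, score)])
    return groups


# Hypothetical scores for five submissions (illustrative values only).
example_scores = {"a.c": 100, "b.c": 104, "c.c": 109, "d.c": 250, "e.c": 253}
example_groups = cluster_scores(example_scores, threshold=5)
for group in example_groups:
    print([name for name, _ in group])
```

Note that under this rule a group can span more than the threshold overall, since only adjacent members are compared; a smaller threshold yields tighter clusters.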