Comparing Approaches to Mining Source Code for Call-Usage Patterns

Authors:
Huzefa Kagdi;Michael L. Collard;Jonathan I. Maletic
Affiliations:
Kent State University, USA;Ashland University, USA;Kent State University, USA
Venue:
MSR '07 Proceedings of the Fourth International Workshop on Mining Software Repositories
Year:
2007

Abstract

Two approaches for mining function-call usage patterns from source code are compared. The first approach, itemset mining, has recently been applied to this problem. The second approach, sequential-pattern mining, has not previously been applied to it. Here, a call-usage pattern is a composition of function calls that occur in a function definition. Both approaches look for frequently occurring patterns that represent standard usage of functions and identify possible errors. Itemset mining produces unordered patterns, i.e., sets of function calls, whereas sequential-pattern mining produces partially ordered patterns, i.e., sequences of function calls. The trade-off between the additional ordering context given by sequential-pattern mining and the efficiency of itemset mining is investigated. The two approaches are applied to the Linux kernel v2.6.14, and the results show that mining ordered patterns is worth the additional cost.
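
To make the distinction between unordered and ordered patterns concrete, the following is a minimal Python sketch, not the mining tools used in the paper. It assumes each function definition has already been reduced to the ordered list of calls it makes. The input data and the helper names `frequent_itemsets` and `frequent_sequences` are hypothetical, and candidate patterns are enumerated by brute force rather than by a dedicated frequent-pattern mining algorithm.

```python
from collections import Counter
from itertools import combinations

# Hypothetical input: function definition name -> ordered list of calls it makes.
definitions = {
    "f1": ["lock", "read", "unlock"],
    "f2": ["lock", "write", "unlock"],
    "f3": ["unlock", "log", "lock"],
}


def frequent_itemsets(defs, min_support, size=2):
    """Count unordered call sets of the given size; order within a definition is ignored."""
    counts = Counter()
    for calls in defs.values():
        # sorted(set(...)) removes duplicates so each definition counts a set once.
        for itemset in combinations(sorted(set(calls)), size):
            counts[frozenset(itemset)] += 1
    return {s: c for s, c in counts.items() if c >= min_support}


def frequent_sequences(defs, min_support, size=2):
    """Count ordered call subsequences (not necessarily contiguous) of the given size."""
    counts = Counter()
    for calls in defs.values():
        # combinations() preserves the original call order; set() ensures a
        # definition supports each distinct subsequence at most once.
        for seq in set(combinations(calls, size)):
            counts[seq] += 1
    return {s: c for s, c in counts.items() if c >= min_support}


if __name__ == "__main__":
    print(frequent_itemsets(definitions, min_support=3))
    print(frequent_sequences(definitions, min_support=2))
```

On this toy input, the set {lock, unlock} is supported by all three definitions, while the sequence (lock, unlock) is supported only by `f1` and `f2`. The ordered view therefore singles out `f3`, which calls `unlock` before `lock`, as a candidate deviation that the unordered view cannot distinguish.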