Mining sequences of changed-files from version histories

Authors:
Huzefa Kagdi; Shehnaaz Yusuf; Jonathan I. Maletic
Affiliations:
Kent State University, Kent, Ohio; Kent State University, Kent, Ohio; Kent State University, Kent, Ohio
Venue:
Proceedings of the 2006 international workshop on Mining software repositories
Year:
2006

Abstract

Modern source-control systems, such as Subversion, preserve change-sets of files as atomic commits. However, the order in which the files were changed is typically not recorded in these source-code repositories. This paper presents a set of heuristics for grouping the change-sets (i.e., log entries) found in source-code repositories. From such groups of change-sets, sequences of files that frequently change together are uncovered. The approach yields not only the (unordered) sets of files but also (partial temporal) ordering information about them. The technique is demonstrated on a subset of the KDE source-code repository. The results show that the approach is able to find sequences of changed-files.
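
The idea of grouping change-sets and then recovering partially ordered file sequences can be illustrated with a minimal sketch. The abstract does not say which heuristics the paper uses, so the grouping rule here (same author, same day) is an assumption chosen only for illustration, and the sample log entries, function names, and support threshold are all hypothetical. The counting step is a simple ordered-pair tally, not the sequence-mining algorithm used by the authors.

```python
from collections import Counter
from itertools import combinations

# Hypothetical log entries, already sorted by time:
# (author, day, set of files changed in that atomic commit).
commits = [
    ("alice", 1, {"a.h"}),
    ("alice", 1, {"a.cpp"}),
    ("bob",   2, {"b.cpp"}),
    ("alice", 3, {"a.h"}),
    ("alice", 3, {"a.cpp", "Makefile"}),
]

def group_by_author_and_day(commits):
    """Assumed heuristic: commits by the same author on the same day form one group.

    Each group is a time-ordered list of change-sets.
    """
    groups = {}
    for author, day, files in commits:
        groups.setdefault((author, day), []).append(frozenset(files))
    return list(groups.values())

def frequent_ordered_pairs(groups, min_support):
    """Return ordered pairs (x, y) where x changed in an earlier commit than y
    within the same group, for pairs seen in at least min_support groups.

    Files inside one commit stay unordered, so the ordering is only partial.
    """
    support = Counter()
    for group in groups:
        seen = set()
        for i, j in combinations(range(len(group)), 2):  # i < j, so i is earlier
            for x in group[i]:
                for y in group[j]:
                    if x != y:
                        seen.add((x, y))
        support.update(seen)  # count each pair at most once per group
    return {pair: n for pair, n in support.items() if n >= min_support}

groups = group_by_author_and_day(commits)
print(frequent_ordered_pairs(groups, min_support=2))
```

With this sample data, only the pair in which a.h changes before a.cpp reaches the threshold of two groups. The pair a.cpp and Makefile is never counted as ordered because both files appear in the same atomic commit.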