We are now witnessing the rapid growth of decentralized source code management (DSCM) systems, in which every developer has her own repository. DSCMs facilitate a style of collaboration in which work output can flow sideways (and privately) between collaborators, rather than always up and down (and publicly) via a central repository. Decentralization comes with both the promise of new data and the peril of its misinterpretation. We focus on git, a very popular DSCM used in high-profile projects. Decentralization, and other features of git, such as automatically recorded contributor attribution, lead to richer content histories, giving rise to new questions such as “How do contributions flow between developers to the official project repository?” However, there are pitfalls. Commits may be reordered, deleted, or edited as they move between repositories. The semantics of terms common to SCMs and DSCMs sometimes differ markedly, potentially creating confusion. For example, a commit is immediately visible to all developers in centralized SCMs, but not in DSCMs. Our goal is to help researchers interested in DSCMs avoid these and other perils when mining and analyzing git data.