Open source software development is becoming increasingly relevant. Understanding the behavior of developers in open source software projects and identifying the kinds of contributions they make is an essential step toward improving the efficiency of the development process and organizing development teams more effectively. Moreover, understanding the level of participation of the different developers helps to determine which members of the development team are more important than others and who the actual key developers are. This paper investigates the behavior of open source developers and the structure of the development of open source projects through the analysis of a very large dataset: 10 well-known and widely used open source software projects, totaling more than 4 MLOC (millions of lines of code) modified, distributed across more than 200K versions. This study builds on other studies in this area, applying a set of rigorous statistical techniques to analyze how developers contribute to the projects. Its novelty lies in the fine-grained analysis of the developers who have commit rights on the repository of the project they work on, in the automated identification of the key contributors of each project, in the size of the analyzed datasets, and in the statistical techniques used to classify the behavior of the developers automatically. To collect such a large volume of data and to ensure its integrity, a tool that automatically mines open source version control systems was used. The main result of this study is the identification of a recurrent pattern of four kinds of contributors with the same characteristics in all the projects analyzed, even though the projects differ widely in domain, size, language, etc.