Categorizing bugs with social networks: a case study on four open source software communities

Authors:
Marcelo Serrano Zanetti;Ingo Scholtes;Claudio Juan Tessone;Frank Schweitzer
Affiliations:
ETH Zurich, Switzerland;ETH Zurich, Switzerland;ETH Zurich, Switzerland;ETH Zurich, Switzerland
Venue:
Proceedings of the 2013 International Conference on Software Engineering
Year:
2013

Abstract

Efficient bug triaging procedures are an important precondition for successful collaborative software engineering projects. Triaging bugs can become a laborious task, particularly in open source software (OSS) projects with a large base of comparatively inexperienced part-time contributors. In this paper, we propose an efficient and practical method to identify valid bug reports, that is, reports that a) refer to an actual software bug, b) are not duplicates and c) contain enough information to be processed right away. Our classification is based on nine measures that quantify the social embeddedness of bug reporters in the collaboration network. We demonstrate its applicability in a case study using a comprehensive data set of more than 700,000 bug reports obtained from the Bugzilla installations of four major OSS communities, covering a period of more than ten years. For those projects that exhibit the lowest fraction of valid bug reports, we find that the bug reporters' position in the collaboration network is a strong indicator of the quality of bug reports. Based on this finding, we develop an automated classification scheme that can easily be integrated into bug tracking platforms, and we analyze its performance in the considered OSS communities. A support vector machine (SVM) that identifies valid bug reports based on the nine measures yields a precision of up to 90.3% with an associated recall of 38.9%. With this, we significantly improve on the results obtained in previous case studies for the automated early identification of bugs that are eventually fixed. Furthermore, our study highlights the potential of using quantitative measures of social organization in collaborative software engineering. It also opens a broad perspective for the integration of social awareness in the design of support infrastructures.
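The idea of scoring a report by its reporter's position in the collaboration network can be illustrated with a minimal sketch. The abstract does not list the nine measures, so the two used here (degree and local clustering coefficient) are assumptions chosen only for illustration. A simple degree threshold stands in for the SVM, and the user names, interactions, and validity labels are invented toy data. The last helper shows how precision and recall, the two figures reported above, are computed.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(interactions):
    """Build an undirected collaboration graph from (user_a, user_b) pairs."""
    adj = defaultdict(set)
    for a, b in interactions:
        if a != b:  # ignore self-interactions
            adj[a].add(b)
            adj[b].add(a)
    return adj


def degree(adj, node):
    """Number of distinct collaborators of a user (0 if unknown)."""
    return len(adj.get(node, ()))


def local_clustering(adj, node):
    """Fraction of a user's collaborator pairs that also collaborate."""
    neighbours = adj.get(node, set())
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


def precision_recall(predicted, actual):
    """Precision and recall for boolean predictions against boolean labels."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical toy data: who interacted with whom on the bug tracker.
interactions = [
    ("alice", "bob"),
    ("alice", "carol"),
    ("bob", "carol"),
    ("carol", "dave"),
]
graph = build_graph(interactions)

# Hypothetical reports as (reporter, was_the_report_valid).
reports = [
    ("alice", True),
    ("carol", True),
    ("dave", False),
    ("bob", False),
    ("dave", True),
]

# Stand-in classifier (NOT the paper's SVM): predict "valid" when the
# reporter has at least two collaborators.
predicted = [degree(graph, reporter) >= 2 for reporter, _ in reports]
actual = [valid for _, valid in reports]
precision, recall = precision_recall(predicted, actual)
```

In a fuller pipeline, each report would be mapped to a feature vector of such measures for its reporter, and a trained classifier would replace the threshold. A high-precision, lower-recall operating point like the one reported in the abstract corresponds to flagging only those reports the classifier is most confident about.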