Simulating families of studies to build confidence in defect hypotheses

Authors:
Forrest Shull;Daniela Cruzes;Victor Basili;Manoel Mendonça
Affiliations:
Fraunhofer Center-Maryland, 4321 Hartwick Road, Suite 500, College Park, MD 20740, USA;Department of Computer Science, University of Maryland, College Park, MD 20742, USA and Computer Networks Research Group (NUPERC), Salvador University (UNIFACS), Rua Ponciano de Oliveira, 126 Salv ...;Department of Computer Science, University of Maryland, College Park, MD 20742, USA;Computer Networks Research Group (NUPERC), Salvador University (UNIFACS), Rua Ponciano de Oliveira, 126 Salvador, BA 41950-275, Brazil
Venue:
Information and Software Technology
Year:
2005

Abstract

While it is clear that there are many sources of variation from one development context to another, it is not clear a priori which specific variables will influence the effectiveness of a process in a given context. For this reason, we argue that knowledge about software processes must be built from families of studies, in which related studies are run within similar contexts as well as very different ones. Previous papers have discussed how to design related studies so as to document as precisely as possible the values of likely context variables and to allow comparison with those observed in new studies. While such a planned approach is important, we argue that an opportunistic approach is also practical. This approach combines results from multiple individual studies after the fact, enabling recommendations to be made about process effectiveness in context. In this paper, we describe two processes with which we have been working to build empirical knowledge about software development processes. The first is a manual and informal approach, which relies on common beliefs or 'folklore' to identify useful hypotheses and on a manual analysis of the information in papers to investigate whether there is support for those hypotheses. The second is a formal approach based on encoding the information in papers into a structured hypothesis base that can then be searched to organize hypotheses and their associated support. We test these processes by applying them to build knowledge in the area of defect folklore (i.e., commonly accepted heuristics about software defects and their behavior). We show that the formal methodology can produce useful and feasible results, especially when compared with the results of the more manual, expert-based approach.
The formalized approach, by relying on a reusable hypothesis base, is repeatable and also capable of producing a more thorough basis of support for hypotheses, including results from papers or articles that may have been overlooked or not considered by the experts.
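
To make the idea of a searchable hypothesis base more concrete, the following is a minimal sketch in Python. It is not the authors' implementation: the names `Evidence` and `HypothesisBase`, the record fields, and the keyword search are all assumptions made for illustration, and the entries are placeholders rather than results from any real study.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """One encoded finding from a paper (hypothetical record layout)."""
    source: str      # label for the paper the finding came from
    hypothesis: str  # normalized statement of the hypothesis
    supports: bool   # True if the finding supports the hypothesis
    context: str     # free-text description of the study context


class HypothesisBase:
    """Groups encoded findings by hypothesis so they can be searched."""

    def __init__(self):
        self._by_hypothesis = defaultdict(list)

    def add(self, evidence):
        self._by_hypothesis[evidence.hypothesis].append(evidence)

    def search(self, keyword):
        """Return hypotheses whose statement contains the keyword."""
        needle = keyword.lower()
        return {
            hyp: list(items)
            for hyp, items in self._by_hypothesis.items()
            if needle in hyp.lower()
        }

    def summary(self, hypothesis):
        """Return (supporting, contradicting) counts for one hypothesis."""
        items = self._by_hypothesis.get(hypothesis, [])
        supporting = sum(1 for item in items if item.supports)
        return supporting, len(items) - supporting


# Placeholder entries only; they do not describe real studies or results.
H1 = "Placeholder hypothesis relating defect density to module size"
base = HypothesisBase()
base.add(Evidence("Paper A (placeholder)", H1, True, "context 1"))
base.add(Evidence("Paper B (placeholder)", H1, True, "context 2"))
base.add(Evidence("Paper C (placeholder)", H1, False, "context 3"))

for hyp in base.search("defect density"):
    print(hyp, base.summary(hyp))
```

Keeping every encoded finding, rather than only a running tally, is a design choice in this sketch: it lets later queries revisit the context in which each supporting or contradicting result was observed.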