Statistical significance testing: a panacea for software technology experiments?

  • Authors: James Miller

  • Affiliations: Software Technology, Engineering and Measurement research center (STEAM), Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Alberta, Canada T6H 5M3

  • Venue: Journal of Systems and Software - Special issue: Applications of statistics in software engineering
  • Year: 2004

Abstract

Empirical software engineering has a long history of utilizing statistical significance testing, and in many ways, it has become the backbone of the topic. What is less obvious is how much consideration has been given to its adoption. Statistical significance testing was initially designed for testing hypotheses in a very different area, and hence the question must be asked: does it transfer into empirical software engineering research? This paper attempts to address this question. The paper finds that this transference is far from straightforward, resulting in several problems in its deployment within the area. Principally, problems exist in formulating hypotheses, in calculating the probability values and their associated cut-off value, and in constructing the sample and its distribution. Hence, the paper concludes that the topic should explore other avenues of analysis, in an attempt to establish which analysis approaches are preferable under which conditions, when conducting empirical software engineering studies.