How to build repeatable experiments

  • Authors:
  • Gregory Gay;Tim Menzies;Bojan Cukic;Burak Turhan

  • Affiliations:
  • WVU, Morgantown, WVU;WVU, Morgantown, WVU;WVU, Morgantown, WVU;NRC Institute for Information Technology, Ottawa, Canada

  • Venue:
  • PROMISE '09 Proceedings of the 5th International Conference on Predictor Models in Software Engineering
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

The mantra of the PROMISE series is "repeatable, improvable, maybe refutable" software engineering experiments. This community has successfully created a library of reusable software engineering data sets. The next challenge in the PROMISE community will be to not only share data, but to share experiments. Our experience with existing data mining environments is that these tools are not suitable for publishing or sharing repeatable experiments. OURMINE is an environment for the development of data mining experiments. OURMINE offers a succinct notation for describing experiments. Adding new tools to OURMINE, in a variety of languages, is a rapid and simple process. This makes it a useful research tool. Complicated graphical interfaces have been eschewed for simple command-line prompts. This simplifies the learning curve for data mining novices. The simplicity also encourages large scale modification and experimentation with the code. In this paper, we show the OURMINE code required to reproduce a recent experiment checking how defect predictors learned from one site apply to another. This is an important result for the PROMISE community since it shows that our shared repository is not just a useful academic resource. Rather, it is a valuable resource industry: companies that lack the local data required to build those predictors can use PROMISE data to build defect predictors.