Context-bounded translations for concurrent software: an empirical evaluation

  • Authors:
  • Naghmeh Ghafari;Alan J. Hu;Zvonimir Rakamarić

  • Affiliations:
  • Critical Systems Labs, Vancouver, BC, Canada;Department of Computer Science, University of British Columbia, Canada;Department of Computer Science, University of British Columbia, Canada

  • Venue:
  • SPIN'10 Proceedings of the 17th international SPIN conference on Model checking software
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

Context-Bounded Analysis has emerged as a practical automatic formal analysis technique for fine-grained, shared-memory concurrent software. Two recent papers (in CAV 2008 and 2009) have proposed ingenious translation approaches that promise much better scalability, backed by compelling, but differing, theoretical and conceptual advantages. Empirical evidence comparing the translations, however, has been lacking. Furthermore, these papers focused exclusively on Boolean model checking, ignoring the also widely used paradigm of verification-condition checking. In this paper, we undertake a methodical, empirical evaluation of the three main source-to-source translations for context-bounded analysis of concurrent software, in a verification-condition-checking paradigm. We evaluate their scalability under a wide range of experimental conditions. Our results show: (1) The newest, CAV 2009 translation is the clear loser, with the CAV 2008 translation the best in most instances, but the oldest, brute-force translation doing surprisingly well. Clearly, previous results for Boolean model checking do not apply to verification-condition checking. (2) Disturbingly, confounding factors in the experimental design can change the relative performance of the translations, highlighting the importance of extensive and thorough experiments. For example, using a different (slower) SMT solver changes the relative ranking of the translations, potentially misleading researchers and practitioners to use an inferior translation. (3) SMT runtimes grow exponentially with verification-condition length, but different translations and parameters give different exponential curves. This suggests that the practical scalability of a translation scheme might be estimated by combining the size of the queries with an empirical or theoretical measure of the complexity of solving that class of query.