Healing data races on-the-fly

  • Authors:
  • Bohuslav Krena;Zdenek Letko;Rachel Tzoref;Shmuel Ur;Tomáš Vojnar

  • Affiliations:
  • Brno University of Technology, Czech Republic;Brno University of Technology, Czech Republic;Haifa University Campus, Haifa, Israel;Haifa University Campus, Haifa, Israel;Brno University of Technology, Czech Republic

  • Venue:
  • Proceedings of the 2007 ACM workshop on Parallel and distributed systems: testing and debugging
  • Year:
  • 2007

Abstract

Testing concurrent software is extremely difficult. Despite all the progress in testing and verification technology, concurrency bugs, the most common of which are deadlocks and races, still make it to the field. This paper describes a set of techniques, implemented in a tool called ConTest, that allow concurrent programs to self-heal at run-time. Concurrency bugs have a property that is very desirable for healing: some interleavings produce correct results, while bugs manifest only in others. Healing concurrency problems is therefore about limiting the possible interleavings, or changing their probability, so that bugs are seen less often. When healing a concurrent program, if limiting the interleavings does not cause a deadlock, we can be sure that any result of the healed program could also have been produced by the original program, and therefore no new functional bug has been introduced. In this initial work, which deals with different types of data races, we suggest three types of healing mechanisms: (1) changing the probability of interleavings by introducing sleep or yield statements or by changing thread priorities, (2) removing interleavings using synchronisation commands such as locking and unlocking certain mutexes, or waits and notifies, and (3) removing the result of a "bad interleaving" by replacing the value of variables with the one that "should" have been taken. We also classify races according to the healing strategies that are relevant to them.
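
To make mechanisms (1) and (2) concrete, the following is a minimal Java sketch of a racy counter update alongside two "healed" variants. It is an illustration only: the abstract does not show how ConTest instruments code, so the class `HealingSketch`, the lock `healingLock`, and all method names are hypothetical. Mechanism (3), value replacement, is not shown.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration; this is not ConTest's API or its instrumentation.
class HealingSketch {
    // Shared variable updated by an unsynchronised read-modify-write: a data race.
    static int counter = 0;

    // Hypothetical lock that a healer could introduce around the racy access.
    static final ReentrantLock healingLock = new ReentrantLock();

    // Original, racy access: two threads may read the same value and lose an update.
    static void racyIncrement() {
        counter = counter + 1;
    }

    // Mechanism (1), assumed form: a yield just before the access changes the
    // probability of a bad interleaving but does not remove it.
    static void yieldHealedIncrement() {
        Thread.yield();
        counter = counter + 1;
    }

    // Mechanism (2), assumed form: a lock removes the bad interleavings entirely.
    static void lockHealedIncrement() {
        healingLock.lock();
        try {
            counter = counter + 1;
        } finally {
            healingLock.unlock();
        }
    }

    // Runs `step` `iterations` times in each of `threads` threads and
    // returns the final counter value once all threads have finished.
    static int run(Runnable step, int threads, int iterations) throws InterruptedException {
        counter = 0;
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < iterations; j++) {
                    step.run();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        return counter;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("unhealed:     " + run(HealingSketch::racyIncrement, 4, 100_000));
        System.out.println("yield-healed: " + run(HealingSketch::yieldHealedIncrement, 4, 100_000));
        System.out.println("lock-healed:  " + run(HealingSketch::lockHealedIncrement, 4, 100_000));
    }
}
```

Only the lock-healed run is guaranteed to print 400000; the other two depend on scheduling and may or may not lose updates on a given run. Every final value the lock-healed variant can produce is also one the original could have produced under some interleaving, which matches the abstract's argument that restricting interleavings introduces no new functional behaviour as long as it does not cause a deadlock.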