Multicore environments are rapidly emerging and are widely used in SoCs, but the parallel programming and debugging that accompany them disrupt the familiar sequential world. Unfortunately, as with Heisenberg's uncertainty principle, an instrument that probes the target perturbs it, causing probe effects. Current intrusive debugging methodologies for sequential programs therefore cannot be applied directly to parallel programs in a multicore environment. This work develops a non-intrusive run-time assertion (RunAssert) for parallel program development, based on a novel non-uniform debugging architecture (NUDA). Our contributions are as follows: (a) a language extension for parallel program debugging; (b) the corresponding non-intrusive hardware configuration logic and checking methodologies; and (c) several real-world cases that use these extensions. In general, the target program executes at its original speed without any change to its parallel execution sequences, thereby eliminating the possibility of the probe effect. The net hardware cost is relatively low: the reconfigurable logic for RunAssert is 0.6%-2.5% in a NUDA cluster with 8 cores, so RunAssert can readily scale up for increasingly complex multicore systems.