Contention for synchronization locks and delays waiting for synchronization events can substantially increase the running time of a parallel program. It is therefore important to characterize the synchronization behavior of programs and to provide analysis tools that help both hardware and software designers evaluate design alternatives. This paper describes a tracing facility that is incorporated into a synchronization package. The facility provides a portable means of characterizing parallel programs accurately and efficiently. Monitoring several applications uncovered program characteristics that make linear speedup difficult to achieve. The facility lets a programmer determine the performance implications of the synchronization structure used in a program, and it lets an architect evaluate various hardware support mechanisms.
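
To make the idea of tracing built into a synchronization package concrete, the following is a minimal sketch and not the facility described in the paper. It assumes a Python threading environment, and the names `TracedLock`, `LockStats`, and `run_workers` are hypothetical, introduced here only for illustration. The wrapper records how often a lock is acquired, how often an acquisition had to wait, and how long threads spent waiting for and holding the lock.

```python
import threading
import time
from dataclasses import dataclass


@dataclass
class LockStats:
    """Per-lock counters kept by the hypothetical tracing layer."""
    acquisitions: int = 0
    contended: int = 0         # acquisitions that found the lock already held
    total_wait: float = 0.0    # seconds spent waiting to acquire
    total_hold: float = 0.0    # seconds spent holding the lock


class TracedLock:
    """A mutex that records contention and timing (illustrative only)."""

    def __init__(self, name):
        self.name = name
        self.stats = LockStats()
        self._lock = threading.Lock()
        self._acquired_at = 0.0

    def __enter__(self):
        start = time.perf_counter()
        # Try a non-blocking acquire first so contended cases can be counted.
        contended = not self._lock.acquire(blocking=False)
        if contended:
            self._lock.acquire()
        now = time.perf_counter()
        # Counters are updated while the lock is held, so they need no
        # additional protection.
        self.stats.acquisitions += 1
        self.stats.contended += int(contended)
        self.stats.total_wait += now - start
        self._acquired_at = now
        return self

    def __exit__(self, exc_type, exc, tb):
        self.stats.total_hold += time.perf_counter() - self._acquired_at
        self._lock.release()
        return False


def run_workers(lock, n_threads=4, iterations=1000):
    """Increment a shared counter under the traced lock from several threads."""
    counter = [0]

    def worker():
        for _ in range(iterations):
            with lock:
                counter[0] += 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter[0]


if __name__ == "__main__":
    lock = TracedLock("counter")
    total = run_workers(lock)
    s = lock.stats
    print(f"{lock.name}: total={total} acquisitions={s.acquisitions} "
          f"contended={s.contended} wait={s.total_wait:.6f}s "
          f"hold={s.total_hold:.6f}s")
```

Because the measurement lives inside the lock wrapper, application code needs no changes beyond using the traced lock, which is one plausible reading of why such a facility can be portable. The measured values depend on the scheduler and will vary from run to run.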