We present a new technique, early phase termination, for eliminating idle processors in parallel computations that use barrier synchronization. This technique simply terminates each parallel phase as soon as there are too few remaining tasks to keep all of the processors busy. Although this technique completely eliminates the idling that would otherwise occur at barrier synchronization points, it may also change the computation and therefore the result that the computation produces. We address this issue by providing probabilistic distortion models that characterize how the use of early phase termination distorts the result that the computation produces. Our experimental results show that, for our set of benchmark applications, 1) early phase termination can improve the performance of the parallel computation, 2) the distortion is small (or can be made small with the use of an appropriate compensation technique), and 3) the distortion models provide accurate and tight distortion bounds. These bounds enable users to evaluate the effect of early phase termination and to confidently accept results from parallel computations that use this technique if they find the distortion bounds acceptable. Finally, we identify a general computational pattern that works well with early phase termination and explain why computations that exhibit this pattern can tolerate the early termination of parallel tasks without producing unacceptable results.
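The idea can be illustrated with a small, deterministic simulation. This is a minimal sketch under stated assumptions, not the implementation evaluated in the paper: the names `run_phase` and `compensate`, the greedy task scheduler, the sample durations and values, and the proportional extrapolation used as a compensation step are all hypothetical choices made for illustration.

```python
import heapq

def run_phase(durations, values, num_procs, early_termination):
    """Simulate one barrier-delimited phase on num_procs processors.

    Returns (elapsed_time, values_of_completed_tasks). Tasks are handed
    out greedily in index order (an assumption of this sketch).
    """
    pending = list(range(len(durations)))[::-1]  # pop() yields task 0, 1, 2, ...
    running = []                                 # min-heap of (finish_time, task_id)
    now = 0.0
    completed = []

    # Give every processor an initial task.
    for _ in range(min(num_procs, len(pending))):
        task = pending.pop()
        heapq.heappush(running, (durations[task], task))

    while running:
        now, task = heapq.heappop(running)       # next task to finish
        completed.append(values[task])
        if pending:
            nxt = pending.pop()
            heapq.heappush(running, (now + durations[nxt], nxt))
        elif early_termination:
            # No task is left for the freed processor, so it would idle
            # until the barrier: end the phase and discard in-flight tasks.
            break
    return now, completed

def compensate(completed, total_tasks):
    """Hypothetical compensation: scale the partial sum up to the full task count."""
    return sum(completed) * total_tasks / len(completed)

# Illustrative inputs (not from the paper): one long task causes load imbalance.
durations = [1.0, 1.0, 1.0, 1.0, 1.0, 5.0]
values = [10, 12, 8, 11, 9, 13]

barrier_time, barrier_done = run_phase(durations, values, 2, early_termination=False)
early_time, early_done = run_phase(durations, values, 2, early_termination=True)
exact = sum(barrier_done)
estimate = compensate(early_done, len(values))
distortion = abs(estimate - exact) / exact

print(barrier_time, exact)
print(early_time, estimate, distortion)
```

In this toy run the standard barrier waits for the long task while one processor idles, whereas the early-terminated phase ends as soon as a processor has nothing left to do, at the cost of a small relative error in the extrapolated sum. The distortion models described above aim to bound this kind of error probabilistically; the sketch only measures it for one input.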