Vector models for data-parallel computing
Vector models for data-parallel computing
Algorithms in C++
Parallel programming in Split-C
Proceedings of the 1993 ACM/IEEE conference on Supercomputing
HPC++: experiments with the parallel standard template library
ICS '97 Proceedings of the 11th international conference on Supercomputing
Thread scheduling for multiprogrammed multiprocessors
Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures
The implementation of the Cilk-5 multithreaded language
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
STL tutorial and reference guide, second edition: C++ programming with the standard template library
STL tutorial and reference guide, second edition: C++ programming with the standard template library
Hoard: a scalable memory allocator for multithreaded applications
ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
The C++ Programming Language, Third Edition
The C++ Programming Language, Third Edition
Parallel Programming Using C++
Parallel Programming Using C++
Dynamic Storage Allocation: A Survey and Critical Review
IWMM '95 Proceedings of the International Workshop on Memory Management
Scalability Analysis of Multidimensional Wavefront Algorithms on Large-Scale SMP Clusters
FRONTIERS '99 Proceedings of the The 7th Symposium on the Frontiers of Massively Parallel Computation
Predicting Performance on SMPs. A Case Study: The SGI Power Challenge
IPDPS '00 Proceedings of the 14th International Symposium on Parallel and Distributed Processing
NESL: A Nested Data-Parallel Language (Version 2.6)
NESL: A Nested Data-Parallel Language (Version 2.6)
A Cost Model for Communication on a Symmetric MultiProcessor
A Cost Model for Communication on a Symmetric MultiProcessor
Support for parallel generic programming
Support for parallel generic programming
A framework for adaptive algorithm selection in STAPL
Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming
Associated types and constraint propagation for mainstream object-oriented generics
OOPSLA '05 Proceedings of the 20th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Glift: Generic, efficient, random-access GPU data structures
ACM Transactions on Graphics (TOG)
An Adaptive Algorithm Selection Framework for Reduction Parallelization
IEEE Transactions on Parallel and Distributed Systems
MCSTL: the multi-core standard template library
Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming
Evolving a language in and for the real world: C++ 1991-2006
Proceedings of the third ACM SIGPLAN conference on History of programming languages
Adaptive loops with kaapi on multicore and grid: applications in symmetric cryptography
Proceedings of the 2007 international workshop on Parallel symbolic computation
Library composition and adaptation using c++ concepts
GPCE '07 Proceedings of the 6th international conference on Generative programming and component engineering
Parallel iterator for parallelising object oriented applications
SEPADS'08 Proceedings of the 7th WSEAS International Conference on Software Engineering, Parallel and Distributed Systems
Parallelization of Generic Libraries Based on Type Properties
ICCS '07 Proceedings of the 7th international conference on Computational Science, Part I: ICCS 2007
LCSD '07 Proceedings of the 2007 Symposium on Library-Centric Software Design
Extending Automatic Parallelization to Optimize High-Level Abstractions for Multicore
IWOMP '09 Proceedings of the 5th International Workshop on OpenMP: Evolving OpenMP in an Age of Extreme Parallelism
Multi-target C++ implementation of parallel skeletons
Proceedings of the 8th workshop on Parallel/High-Performance Object-Oriented Scientific Computing
PFunc: modern task parallelism for modern high performance computing
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis
Source Code Rejuvenation Is Not Refactoring
SOFSEM '10 Proceedings of the 36th Conference on Current Trends in Theory and Practice of Computer Science
Design and use of htalib: a library for hierarchically tiled arrays
LCPC'06 Proceedings of the 19th international conference on Languages and compilers for parallel computing
Science of Computer Programming
Parallelization of bulk operations for STL dictionaries
Euro-Par'07 Proceedings of the 2007 conference on Parallel processing
STAPL: standard template adaptive parallel library
Proceedings of the 3rd Annual Haifa Experimental Systems Conference
Hybrid bulk synchronous parallelism library for clustered smp architectures
Proceedings of the fourth international workshop on High-level parallel programming and applications
A survey of algorithmic skeleton frameworks: high-level structured parallel programming enablers
Software—Practice & Experience - Focus on Selected PhD Literature Reviews in the Practical Aspects of Software Technology
Hierarchically tiled arrays for parallelism and locality
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Algorithm engineering: bridging the gap between algorithm theory and practice
Algorithm engineering: bridging the gap between algorithm theory and practice
Efficient run-time dispatching in generic programming with minimal code bloat
Science of Computer Programming
Support for the evolution of C++ generic functions
SLE'10 Proceedings of the Third international conference on Software language engineering
The tao of parallelism in algorithms
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
ALTER: exploiting breakable dependences for parallelization
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
Lock-free dynamically resizable arrays
OPODIS'06 Proceedings of the 10th international conference on Principles of Distributed Systems
Supporting SELL for high-performance computing
LCPC'05 Proceedings of the 18th international conference on Languages and Compilers for Parallel Computing
Introducing ScaleGraph: an X10 library for billion scale graph analytics
Proceedings of the 2012 ACM SIGPLAN X10 Workshop
Processor allocation for optimistic parallelization of irregular programs
ICCSA'12 Proceedings of the 12th international conference on Computational Science and Its Applications - Volume Part I
Avalanche: a fine-grained flow graph model for irregular applications on distributed-memory systems
Proceedings of the 1st ACM SIGPLAN workshop on Functional high-performance computing
MCSTL: the multi-core standard template library
Euro-Par'07 Proceedings of the 13th international Euro-Par conference on Parallel Processing
Parallelism granules aggregation with the T-system
PaCT'07 Proceedings of the 9th international conference on Parallel Computing Technologies
Deterministic scale-free pipeline parallelism with hyperqueues
SC '13 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Hi-index | 0.00 |
The Standard Template Adaptive Parallel Library (STAPL) is a parallel library designed as a superset of the ANSI C++ Standard Template Library (STL). It is sequentially consistent for functions with the same name, and executes on uni- or multi-processor systems that utilize shared or distributed memory. STAPL is implemented using simple parallel extensions of C++ that currently provide a SPMD model of parallelism, and supports nested parallelism. The library is intended to be general purpose, but emphasizes irregular programs to allow the exploitation of parallelism for applications which use dynamically linked data structures such as particle transport calculations, molecular dynamics, geometric modeling, and graph algorithms. STAPL provides several different algorithms for some library routines, and selects among them adaptively at runtime. STAPL can replace STL automatically by invoking a preprocessing translation phase. In the applications studied, the performance of translated code was within 5% of the results obtained using STAPL directly. STAPL also provides functionality to allow the user to further optimize the code and achieve additional performance gains. We present results obtained using STAPL for a molecular dynamics code and a particle transport code.