MULTILISP: a language for concurrent symbolic computation
ACM Transactions on Programming Languages and Systems (TOPLAS)
Communications of the ACM - Special issue on parallelism
DIB—a distributed implementation of backtracking
ACM Transactions on Programming Languages and Systems (TOPLAS)
Workcrews: an abstraction for controlling parallelism
International Journal of Parallel Programming
Mul-T: a high-performance parallel Lisp
PLDI '89 Proceedings of the ACM SIGPLAN 1989 Conference on Programming language design and implementation
The Amber system: parallel programming on a network of multiprocessors
SOSP '89 Proceedings of the twelfth ACM symposium on Operating systems principles
PVM: a framework for parallel distributed computing
Concurrency: Practice and Experience
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
A simple load balancing scheme for task allocation in parallel machines
SPAA '91 Proceedings of the third annual ACM symposium on Parallel algorithms and architectures
Parallel algorithms for shared-memory machines
Handbook of theoretical computer science (vol. A)
Scheduler activations: effective kernel support for the user-level management of parallelism
SOSP '91 Proceedings of the thirteenth ACM symposium on Operating systems principles
Communication complexity for parallel divide-and-conquer
SFCS '91 Proceedings of the 32nd annual symposium on Foundations of computer science
The network architecture of the Connection Machine CM-5 (extended abstract)
SPAA '92 Proceedings of the fourth annual ACM symposium on Parallel algorithms and architectures
A customizable substrate for concurrent languages
PLDI '92 Proceedings of the ACM SIGPLAN 1992 conference on Programming language design and implementation
Computation migration: enhancing locality for distributed-memory parallel systems
PPOPP '93 Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming
An atomic model for message-passing
SPAA '93 Proceedings of the fifth annual ACM symposium on Parallel algorithms and architectures
Proceedings of the 1993 ACM/IEEE conference on Supercomputing
Randomized parallel algorithms for backtrack search and branch-and-bound computation
Journal of the ACM (JACM)
Studying overheads in massively parallel MIN/MAX-tree evaluation
SPAA '94 Proceedings of the sixth annual ACM symposium on Parallel algorithms and architectures
Synchronized MIMD computing
The Parallel Evaluation of General Arithmetic Expressions
Journal of the ACM (JACM)
Lazy Task Creation: A Technique for Increasing the Granularity of Parallel Programs
IEEE Transactions on Parallel and Distributed Systems
Proceedings of the 6th International Workshop on Languages and Compilers for Parallel Computing
A Multithreaded Implementation of Id using P-RISC Graphs
Proceedings of the 6th International Workshop on Languages and Compilers for Parallel Computing
Cid: A Parallel, "Shared-Memory" C for Distributed-Memory Machines
LCPC '94 Proceedings of the 7th International Workshop on Languages and Compilers for Parallel Computing
Executing functional programs on a virtual tree of processors
FPCA '81 Proceedings of the 1981 conference on Functional programming languages and computer architecture
Optimistic active messages: a mechanism for scheduling communication with computation
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
Efficient support of location transparency in concurrent object-oriented programming languages
Supercomputing '95 Proceedings of the 1995 ACM/IEEE conference on Supercomputing
A hybrid execution model for fine-grained languages on distributed memory multicomputers
Supercomputing '95 Proceedings of the 1995 ACM/IEEE conference on Supercomputing
ICS '95 Proceedings of the 9th international conference on Supercomputing
pHluid: the design of a parallel functional language implementation on workstations
Proceedings of the first ACM SIGPLAN international conference on Functional programming
Thread scheduling for cache locality
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
An analysis of dag-consistent distributed shared-memory algorithms
Proceedings of the eighth annual ACM symposium on Parallel algorithms and architectures
Efficient detection of determinacy races in Cilk programs
Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures
Space-efficient scheduling of parallelism with synchronization variables
Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures
Space-efficient implementation of nested parallelism
PPOPP '97 Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming
Space and time efficient execution of parallel irregular computations
PPOPP '97 Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming
Experiences with non-numeric applications on multithreaded architectures
PPOPP '97 Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming
Fast set operations using treaps
Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures
Analyses of load stealing models based on differential equations
Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures
Computation-centric memory models
Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures
Detecting data races in Cilk programs that use locks
Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures
Performance measurements for multithreaded programs
SIGMETRICS '98/PERFORMANCE '98 Proceedings of the 1998 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
Retrospective: Monsoon: an explicit token-store architecture
25 years of the international symposia on Computer architecture (selected papers)
The design, implementation, and evaluation of Jade
ACM Transactions on Programming Languages and Systems (TOPLAS)
Space/time-efficient scheduling and execution of parallel irregular computations
ACM Transactions on Programming Languages and Systems (TOPLAS)
StackThreads/MP: integrating futures into calling standards
Proceedings of the seventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Automatic parallelization of divide and conquer algorithms
Proceedings of the seventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Cluster I/O with River: making the fast case common
Proceedings of the sixth workshop on I/O in parallel and distributed systems
Provably efficient scheduling for languages with fine-grained parallelism
Journal of the ACM (JACM)
Javelin++: scalability issues in global computing
JAVA '99 Proceedings of the ACM 1999 conference on Java Grande
SMARTS: exploiting temporal locality and parallelism through vertical execution
ICS '99 Proceedings of the 13th international conference on Supercomputing
Recursive array layouts and fast parallel matrix multiplication
Proceedings of the eleventh annual ACM symposium on Parallel algorithms and architectures
Space-efficient scheduling of nested parallelism
ACM Transactions on Programming Languages and Systems (TOPLAS)
A provably time-efficient parallel implementation of full speculation
ACM Transactions on Programming Languages and Systems (TOPLAS)
Automatic compiler techniques for thread coarsening for multithreaded architectures
Proceedings of the 14th international conference on Supercomputing
PADS '00 Proceedings of the fourteenth workshop on Parallel and distributed simulation
A Programming Methodology for Dual-Tier Multicomputers
IEEE Transactions on Software Engineering - Special issue on architecture-independent languages and software tools for parallel processing
System architecture directions for networked sensors
ACM SIGPLAN Notices
A scalable, robust network for parallel computing
Proceedings of the 2001 joint ACM-ISCOPE conference on Java Grande
System architecture directions for networked sensors
ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
Efficient load balancing for wide-area divide-and-conquer applications
PPoPP '01 Proceedings of the eighth ACM SIGPLAN symposium on Principles and practices of parallel programming
Supporting dynamic data structures with Olden
Compiler optimizations for scalable parallel systems
ATLAS: an infrastructure for global computing
EW 7 Proceedings of the 7th workshop on ACM SIGOPS European workshop: Systems support for worldwide applications
Scheduled Dataflow: Execution Paradigm, Architecture, and Performance Evaluation
IEEE Transactions on Computers - Special issue on the parallel architecture and compilation techniques conference
Fine-Grained Multithreading with Process Calculi
IEEE Transactions on Computers - Special issue on the parallel architecture and compilation techniques conference
A hierarchical load-balancing framework for dynamic multithreaded computations
SC '98 Proceedings of the 1998 ACM/IEEE conference on Supercomputing
Evaluating the performance limitations of MPMD communication
SC '97 Proceedings of the 1997 ACM/IEEE conference on Supercomputing
Advanced eager scheduling for Java-based adaptively parallel computing
JGI '02 Proceedings of the 2002 joint ACM-ISCOPE conference on Java Grande
ACM Transactions on Computer Systems (TOCS)
Automatic Parallelization of Recursive Procedures
International Journal of Parallel Programming
The Concurrent Graph: Basic Technology for Irregular Problems
IEEE Parallel & Distributed Technology: Systems & Technology
Programming Languages for CSE: The State of the Art
IEEE Computational Science & Engineering
On Optimal Strategies for Cycle-Stealing in Networks of Workstations
IEEE Transactions on Computers
Recursive Array Layouts and Fast Matrix Multiplication
IEEE Transactions on Parallel and Distributed Systems
High-Performance Scalable Java Virtual Machines
HiPC '01 Proceedings of the 8th International Conference on High Performance Computing
Design and Evaluation of a High-Level Interface for Data Mining
IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
Dag-Consistent Distributed Shared Memory
IPPS '96 Proceedings of the 10th International Parallel Processing Symposium
Analysis of Several Scheduling Algorithms under the Nano-Thread Programming Model
IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
Production Job Scheduling for Parallel Shared Memory Systems
IPDPS '01 Proceedings of the 15th International Parallel & Distributed Processing Symposium
Asynchronous Resource Management
IPDPS '01 Proceedings of the 15th International Parallel & Distributed Processing Symposium
Javelin 2.0: Java-Based Parallel Computing on the Internet
Euro-Par '00 Proceedings from the 6th International Euro-Par Conference on Parallel Processing
Scheduling User-Level Threads on Distributed Shared-Memory Multiprocessors
Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
Satin: Efficient Parallel Divide-and-Conquer in Java
Euro-Par '00 Proceedings from the 6th International Euro-Par Conference on Parallel Processing
CC '01 Proceedings of the 10th International Conference on Compiler Construction
Expressing Irregular Computations in Modern Fortran Dialects
LCR '98 Selected Papers from the 4th International Workshop on Languages, Compilers, and Run-Time Systems for Scalable Computers
Facilitating Parallel Programming in PVM Using Condensed Graphs
Proceedings of the 6th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Dynamic Partitioning of the Divide-and-Conquer Scheme with Migration in PVM Environment
Proceedings of the 8th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Development and Tuning of Irregular Divide-and-Conquer Applications in DAMPVM/DAC
Proceedings of the 9th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
The MOL project: an open, extensible metacomputer
HCW '97 Proceedings of the 6th Heterogeneous Computing Workshop (HCW '97)
Supporting multiple parallel programming paradigms on top of the Millipede virtual parallel machine
HIPS '97 Proceedings of the 1997 Workshop on High-Level Programming Models and Supportive Environments (HIPS '97)
Cache-and-query for wide area sensor databases
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Low Memory Cost Dynamic Scheduling of Large Coarse Grain Task Graphs
IPPS '98 Proceedings of the 12th International Parallel Processing Symposium
A Load Balancing Framework for Adaptive and Asynchronous Applications
IEEE Transactions on Parallel and Distributed Systems
A comparative analysis of fine-grain threads packages
Journal of Parallel and Distributed Computing
Parallel program performance prediction using deterministic task graph analysis
ACM Transactions on Computer Systems (TOCS)
Pipelined functional tree accesses and updates: scheduling, synchronization, caching and coherence
Journal of Functional Programming
On-the-fly maintenance of series-parallel relationships in fork-join multithreaded programs
Proceedings of the sixteenth annual ACM symposium on Parallelism in algorithms and architectures
ATOP-space and time adaptation for parallel and grid applications via flexible data partitioning
ARM '04 Proceedings of the 3rd workshop on Adaptive and reflective middleware
IEEE Transactions on Knowledge and Data Engineering
Adaptive scheduling with parallelism feedback
Proceedings of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Computer
A general approach for partitioning N-dimensional parallel nested loops with conditionals
Proceedings of the eighteenth annual ACM symposium on Parallelism in algorithms and architectures
The cache complexity of multithreaded cache oblivious algorithms
Proceedings of the eighteenth annual ACM symposium on Parallelism in algorithms and architectures
Toward real-time image guided neurosurgery using distributed and grid computing
Proceedings of the 2006 ACM/IEEE conference on Supercomputing
Sequoia: programming the memory hierarchy
Proceedings of the 2006 ACM/IEEE conference on Supercomputing
CAPSULE: Hardware-Assisted Parallel Execution of Component-Based Programs
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Proceedings of the 2007 ACM/SIGDA 15th international symposium on Field programmable gate arrays
Adaptive work stealing with parallelism feedback
Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming
Irregular computations in Fortran - expression and implementation strategies
Scientific Programming
CX: A scalable, robust network for parallel computing
Scientific Programming
Scheduling threads for constructive cache sharing on CMPs
Proceedings of the nineteenth annual ACM symposium on Parallel algorithms and architectures
Carbon: architectural support for fine-grained parallelism on chip multiprocessors
Proceedings of the 34th annual international symposium on Computer architecture
Adaptive and reliable parallel computing on networks of workstations
ATEC '97 Proceedings of the annual conference on USENIX Annual Technical Conference
Parallel XML processing by work stealing
Proceedings of the 2007 workshop on Service-oriented computing performance: aspects, issues, and approaches
Dryad: distributed data-parallel programs from sequential building blocks
Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems 2007
Enabling scalability and performance in a large scale CMP environment
Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems 2007
Programming asynchronous layers with CLARITY
Proceedings of the 6th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering
Supporting exception handling for futures in Java
Proceedings of the 5th international symposium on Principles and practice of programming in Java
A portable runtime interface for multi-level memory hierarchies
Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming
Time and space adaptation for computational grids with the ATOP-Grid middleware
Future Generation Computer Systems
Advanced collective communication in Aspen
Proceedings of the 22nd annual international conference on Supercomputing
Phasers: a unified deadlock-free construct for collective and point-to-point synchronization
Proceedings of the 22nd annual international conference on Supercomputing
WSPE: a peer-to-peer programming environment for grid-unaware applications
Proceedings of the 5th international workshop on Middleware for grid computing: held at the ACM/IFIP/USENIX 8th International Middleware Conference
Adaptive work-stealing with parallelism feedback
ACM Transactions on Computer Systems (TOCS)
Proceedings of the conference on Design, automation and test in Europe
Proceedings of the conference on Design, automation and test in Europe
Implicitly-threaded parallelism in Manticore
Proceedings of the 13th ACM SIGPLAN international conference on Functional programming
ICCS '07 Proceedings of the 7th international conference on Computational Science, Part I: ICCS 2007
A Real-Time Programming Model for Heterogeneous MPSoCs
SAMOS '08 Proceedings of the 8th international workshop on Embedded Computer Systems: Architectures, Modeling, and Simulation
Capsules: Expressing Composable Computations in a Parallel Programming Model
Languages and Compilers for Parallel Computing
OpenMP tasks in IBM XL compilers
CASCON '08 Proceedings of the 2008 conference of the center for advanced studies on collaborative research: meeting of minds
How much parallelism is there in irregular applications?
Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming
Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming
Efficient, portable implementation of asynchronous multi-place programs
Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming
Implementation of parallel programs interpreter in the development environment ParJava
Programming and Computer Software
Available task-level parallelism on the Cell BE
Scientific Programming - High Performance Computing with the Cell Broadband Engine
As-if-serial exception handling semantics for Java futures
Science of Computer Programming
Decomposition of Task-Level Concurrency on C Programs Applied to the Design of Multiprocessor SoC
IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
Celling SHIM: compiling deterministic concurrency to a heterogeneous multicore
Proceedings of the 2009 ACM symposium on Applied Computing
Exploiting Speculative TLP in Recursive Programs by Dynamic Thread Prediction
CC '09 Proceedings of the 18th International Conference on Compiler Construction: Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2009
Chunking parallel loops in the presence of synchronization
Proceedings of the 23rd international conference on Supercomputing
Load balancing using work-stealing for pipeline parallelism in emerging applications
Proceedings of the 23rd international conference on Supercomputing
Hiding Communication Latency with Non-SPMD, Graph-Based Execution
ICCS '09 Proceedings of the 9th International Conference on Computational Science: Part I
Evaluating OpenMP 3.0 Run Time Systems on Unbalanced Task Graphs
IWOMP '09 Proceedings of the 5th International Workshop on OpenMP: Evolving OpenMP in an Age of Extreme Parallelism
Multicore Scheduling for Lightweight Communicating Processes
COORDINATION '09 Proceedings of the 11th International Conference on Coordination Models and Languages
Dynamic load balancing efficiently in a large-scale cluster
International Journal of High Performance Computing and Networking
Beyond nested parallelism: tight bounds on work-stealing overheads for parallel futures
Proceedings of the twenty-first annual symposium on Parallelism in algorithms and architectures
Proceedings of the twenty-first annual symposium on Parallelism in algorithms and architectures
Asserting and checking determinism for multithreaded programs
Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering
Runtime support for multicore Haskell
Proceedings of the 14th ACM SIGPLAN international conference on Functional programming
Fragmentation of Numerical Algorithms for the Parallel Subroutines Library
PaCT '09 Proceedings of the 10th International Conference on Parallel Computing Technologies
RenderAnts: interactive Reyes rendering on GPUs
ACM SIGGRAPH Asia 2009 papers
Concurrency by default: using permissions to express dataflow in stateful programs
Proceedings of the 24th ACM SIGPLAN conference companion on Object oriented programming systems languages and applications
A type and effect system for deterministic parallel Java
Proceedings of the 24th ACM SIGPLAN conference on Object oriented programming systems languages and applications
The design of a task parallel library
Proceedings of the 24th ACM SIGPLAN conference on Object oriented programming systems languages and applications
Efficient shared-memory support for parallel graph reduction
Future Generation Computer Systems
Satin: A high-level and efficient grid programming model
ACM Transactions on Programming Languages and Systems (TOPLAS)
Thread migration in a parallel graph reducer
IFL'02 Proceedings of the 14th international conference on Implementation of functional languages
Scheduling dynamically spawned processes in MPI-2
JSSPP'06 Proceedings of the 12th international conference on Job scheduling strategies for parallel processing
Mapping unstructured applications into nested parallelism
VECPAR'02 Proceedings of the 5th international conference on High performance computing for computational science
Parallelising symbolic state-space generators
CAV'07 Proceedings of the 19th international conference on Computer aided verification
An adaptive task creation strategy for work-stealing scheduling
Proceedings of the 8th annual IEEE/ACM international symposium on Code generation and optimization
A container-iterator parallel programming model
PPAM'07 Proceedings of the 7th international conference on Parallel processing and applied mathematics
Composing parallel software efficiently with Lithe
PLDI '10 Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation
Data marshaling for multi-core architectures
Proceedings of the 37th annual international symposium on Computer architecture
Computer-aided construction of concurrent systems
Proceedings of the 11th International Conference on Computer Systems and Technologies and Workshop for PhD Students in Computing
Granularity-Aware Work-Stealing for Computationally-Uniform Grids
CCGRID '10 Proceedings of the 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing
AzureBlast: a case study of developing science applications on the cloud
Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
Scalable hardware support for conditional parallelization
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Reducing task creation and termination overhead in explicitly parallel programs
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Lithe: enabling efficient composition of parallel libraries
HotPar'09 Proceedings of the First USENIX conference on Hot topics in parallelism
New abstractions for data parallel programming
HotPar'09 Proceedings of the First USENIX conference on Hot topics in parallelism
OSDI'08 Proceedings of the 8th USENIX conference on Operating systems design and implementation
Implicit invocation meets safe, implicit concurrency
GPCE '10 Proceedings of the ninth international conference on Generative programming and component engineering
Proceedings of the ACM international conference on Object oriented programming systems languages and applications
Back to the futures: incremental parallelization of existing sequential runtime systems
Proceedings of the ACM international conference on Object oriented programming systems languages and applications
Concurrency by modularity: design patterns, a case in point
Proceedings of the ACM international conference on Object oriented programming systems languages and applications
Proceedings of the ACM international conference companion on Object oriented programming systems languages and applications companion
Automatic verification of determinism for structured parallel programs
SAS'10 Proceedings of the 17th international conference on Static analysis
Building scalable software systems in the multicore era
Proceedings of the FSE/SDP workshop on Future of software engineering research
Self-replicating objects for multicore platforms
ECOOP'10 Proceedings of the 24th European conference on Object-oriented programming
Area-maximizing schedules for series-parallel DAGs
Euro-Par'10 Proceedings of the 16th international Euro-Par conference on Parallel processing: Part II
Starsscheck: a tool to find errors in task-based parallel programs
Euro-Par'10 Proceedings of the 16th international Euro-Par conference on Parallel processing: Part I
On the definition of service abstractions for parallel computing
PPAM'09 Proceedings of the 8th international conference on Parallel processing and applied mathematics: Part II
Parallel SAH k-D tree construction
Proceedings of the Conference on High Performance Graphics
Piccolo: building fast, distributed programs with partitioned tables
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Safe nondeterminism in a deterministic-by-default parallel language
Proceedings of the 38th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Optimization of parallel execution of numerical programs in LuNA fragmented programming system
MTPP'10 Proceedings of the Second Russia-Taiwan conference on Methods and tools of parallel programming multicomputers
SD3: A Scalable Approach to Dynamic Data-Dependence Profiling
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Task Superscalar: An Out-of-Order Task Pipeline
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Comparing the usability of library vs. language approaches to task parallelism
Evaluation and Usability of Programming Languages and Tools
Efficient data race detection for async-finish parallelism
RV'10 Proceedings of the First international conference on Runtime verification
Programming the memory hierarchy revisited: supporting irregular parallelism in Sequoia
Proceedings of the 16th ACM symposium on Principles and practice of parallel programming
A domain-specific approach to heterogeneous parallelism
Proceedings of the 16th ACM symposium on Principles and practice of parallel programming
Synthesizing concurrent schedulers for irregular algorithms
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
Frameworks for multi-core architectures: a comprehensive evaluation using 2D/3D image registration
ARCS'11 Proceedings of the 24th international conference on Architecture of computing systems
Implicitly threaded parallelism in Manticore
Journal of Functional Programming
A multithreaded multicore system for embedded media processing
Transactions on high-performance embedded architectures and compilers III
Scheduling task parallelism on multi-socket multicore systems
Proceedings of the 1st International Workshop on Runtime and Operating Systems for Supercomputers
A programming model for deterministic task parallelism
Proceedings of the 2011 ACM SIGPLAN Workshop on Memory Systems Performance and Correctness
Proceedings of the twenty-third annual ACM symposium on Parallelism in algorithms and architectures
Garbage collection auto-tuning for Java mapreduce on multi-cores
Proceedings of the international symposium on Memory management
The tao of parallelism in algorithms
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
Parallelism orchestration using DoPE: the degree of parallelism executive
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
Parallel and distributed programming extensions for mainstream languages based on pi-calculus
Proceedings of the 30th annual ACM SIGACT-SIGOPS symposium on Principles of distributed computing
Balance principles for algorithm-architecture co-design
HotPar'11 Proceedings of the 3rd USENIX conference on Hot topics in parallelism
Parallelizing the merge sorting network algorithm on a multi-core computer using Go and Cilk
Proceedings of the 49th Annual Southeast Regional Conference
Multicore programming in ParaSail: parallel specification and implementation language
Ada-Europe'11 Proceedings of the 16th Ada-Europe international conference on Reliable software technologies
Resource-agnostic programming for many-core microgrids
Euro-Par 2010 Proceedings of the 2010 conference on Parallel processing
Productive cluster programming with OmpSs
Euro-Par'11 Proceedings of the 17th international conference on Parallel processing - Volume Part I
A monad for deterministic parallelism
Proceedings of the 4th ACM symposium on Haskell
Fragmentation of numerical algorithms for parallel subroutines library
The Journal of Supercomputing
LuNA fragmented programming system, main functions and peculiarities of run-time subsystem
PaCT'11 Proceedings of the 11th international conference on Parallel computing technologies
Oracle scheduling: controlling granularity in implicitly parallel languages
Proceedings of the 2011 ACM international conference on Object oriented programming systems languages and applications
On the simulation of large-scale architectures using multiple application abstraction levels
ACM Transactions on Architecture and Code Optimization (TACO) - HIPEAC Papers
Toward high-throughput algorithms on many-core architectures
ACM Transactions on Architecture and Code Optimization (TACO) - HIPEAC Papers
Generating synchronization statements in divide-and-conquer programs
Parallel Computing
Proceedings of the compilation of the co-located workshops on DSM'11, TMC'11, AGERE!'11, AOOPES'11, NEAT'11, & VMIL'11
Intermediate language extensions for parallelism
Proceedings of the compilation of the co-located workshops on DSM'11, TMC'11, AGERE!'11, AOOPES'11, NEAT'11, & VMIL'11
Scheduling efficiently for irregular load distributions in a large-scale cluster
ISPA'05 Proceedings of the Third international conference on Parallel and Distributed Processing and Applications
An efficient dynamic load-balancing algorithm in a large-scale cluster
ICA3PP'05 Proceedings of the 6th international conference on Algorithms and Architectures for Parallel Processing
SPECTRE: speculation to hide communication latency
Proceedings of the Second Asia-Pacific Workshop on Systems
Factory: an object-oriented parallel programming substrate for deep multiprocessors
HPCC'05 Proceedings of the First international conference on High Performance Computing and Communications
Function flow: making synchronization easier in task parallelism
Proceedings of the 2012 International Workshop on Programming Models and Applications for Multicores and Manycores
A-FAST: autonomous flow approach to scheduling tasks
HiPC'04 Proceedings of the 11th international conference on High Performance Computing
Group-Based scheduling scheme for result checking in global computing systems
ICCSA'05 Proceedings of the 2005 international conference on Computational Science and Its Applications - Volume Part III
Introducing parallelism and concurrency in the data structures course
Proceedings of the 43rd ACM technical symposium on Computer Science Education
Proceedings of the 2012 workshop on Modularity in Systems Software
BWS: balanced work stealing for time-sharing multicores
Proceedings of the 7th ACM european conference on Computer Systems
Multicore scheduling for lightweight communicating processes
Science of Computer Programming
HiPEAC'10 Proceedings of the 5th international conference on High Performance Embedded Architectures and Compilers
Massively parallel constraint programming for supercomputers: challenges and initial results
CPAIOR'10 Proceedings of the 7th international conference on Integration of AI and OR Techniques in Constraint Programming for Combinatorial Optimization Problems
Balancing Programmability and Silicon Efficiency of Heterogeneous Multicore Architectures
ACM Transactions on Embedded Computing Systems (TECS)
Empirical Software Engineering and Verification
Towards a codelet-based runtime for exascale computing: position paper
Proceedings of the 2nd International Workshop on Adaptive Self-Tuning Computing Systems for the Exaflop Era
Work stealing strategies for parallel stream processing in soft real-time systems
ARCS'12 Proceedings of the 25th international conference on Architecture of Computing Systems
Improving performance of adaptive component-based dataflow middleware
Parallel Computing
OpenMP task scheduling strategies for multicore NUMA systems
International Journal of High Performance Computing Applications
Revisiting the cache miss analysis of multithreaded algorithms
LATIN'12 Proceedings of the 10th Latin American international conference on Theoretical Informatics
Mapping a data-flow programming model onto heterogeneous platforms
Proceedings of the 13th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, Tools and Theory for Embedded Systems
Scalable and precise dynamic datarace detection for structured parallelism
Proceedings of the 33rd ACM SIGPLAN conference on Programming Language Design and Implementation
The myrmics memory allocator: hierarchical, message-passing allocation for global address spaces
Proceedings of the 2012 international symposium on Memory Management
Automatic inference of memory fences
ACM SIGACT News
Yada: Straightforward parallel programming
Parallel Computing
Work stealing and persistence-based load balancers for iterative overdecomposed applications
Proceedings of the 21st international symposium on High-Performance Parallel and Distributed Computing
Apricot: an optimizing compiler and productivity tool for x86-compatible many-core coprocessors
Proceedings of the 26th ACM international conference on Supercomputing
Concurrency and Computation: Practice & Experience
Integrating data-intensive cloud computing with multicores and clusters in an HPC course
Proceedings of the 17th ACM annual conference on Innovation and technology in computer science education
Task-level data model for hardware synthesis based on concurrent collections
Journal of Electrical and Computer Engineering - Special issue on ESL Design Methodology
The Journal of Supercomputing
Scalable parallel interval propagation for sparse constraint satisfaction problems
PSI'11 Proceedings of the 8th international conference on Perspectives of System Informatics
Operating systems should manage accelerators
HotPar'12 Proceedings of the 4th USENIX conference on Hot Topics in Parallelism
Disciplined concurrent programming using tasks with effects
HotPar'12 Proceedings of the 4th USENIX conference on Hot Topics in Parallelism
Performance analysis of SCOOP programs
Journal of Systems and Software
Avalanche: a fine-grained flow graph model for irregular applications on distributed-memory systems
Proceedings of the 1st ACM SIGPLAN workshop on Functional high-performance computing
A meta-scheduler for the par-monad: composable scheduling for the heterogeneous cloud
Proceedings of the 17th ACM SIGPLAN international conference on Functional programming
Performance study of matrix computations using multi-core programming tools
Proceedings of the Fifth Balkan Conference in Informatics
How to achieve scalable fork/join on many-core architectures?
Proceedings of the 3rd annual conference on Systems, programming, and applications: software for humanity
High throughput software for direct numerical simulations of compressible two-phase flows
SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Legion: expressing locality and independence with logical regions
SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Designing a unified programming model for heterogeneous machines
SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
NUMA-aware graph mining techniques for performance and energy efficiency
SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Euro-Par'12 Proceedings of the 18th international conference on Parallel Processing
Efficient data race detection for async-finish parallelism
Formal Methods in System Design
Parallel schedule synthesis for attribute grammars
Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming
A Transformation Framework for Optimizing Task-Parallel Programs
ACM Transactions on Programming Languages and Systems (TOPLAS)
DeNovoND: efficient hardware support for disciplined non-determinism
Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
Computational sprinting on a hardware/software testbed
Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
MP-Tomasulo: A Dependency-Aware Automatic Parallel Execution Engine for Sequential Programs
ACM Transactions on Architecture and Code Optimization (TACO)
Steal Tree: low-overhead tracing of work stealing schedulers
Proceedings of the 34th ACM SIGPLAN conference on Programming language design and implementation
Exploiting domain knowledge to optimize parallel computational mechanics codes
Proceedings of the 27th international ACM conference on International conference on supercomputing
High quality real-time image-to-mesh conversion for finite element simulations
Proceedings of the 27th international ACM conference on International conference on supercomputing
Expressing graph algorithms using generalized active messages
Proceedings of the 27th international ACM conference on International conference on supercomputing
Prefetching and cache management using task lifetimes
Proceedings of the 27th international ACM conference on International conference on supercomputing
Proceedings of the third ACM SIGPLAN X10 Workshop
Transparently consistent asynchronous shared memory
Proceedings of the 3rd International Workshop on Runtime and Operating Systems for Supercomputers
Design and implementation of a customizable work stealing scheduler
Proceedings of the 3rd International Workshop on Runtime and Operating Systems for Supercomputers
Sambamba: runtime adaptive parallel execution
Proceedings of the 3rd International Workshop on Adaptive Self-Tuning Computing Systems
A distributed dynamic load balancer for iterative applications
SC '13 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Semi-automatic restructuring of offloadable tasks for many-core accelerators
SC '13 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
EventWave: programming model and runtime support for tightly-coupled elastic cloud applications
Proceedings of the 4th annual Symposium on Cloud Computing
Trellis: Portability across architectures with a high-level framework
Journal of Parallel and Distributed Computing
A catalog of stream processing optimizations
ACM Computing Surveys (CSUR)
Adaptive granularity control in task parallel programs using multiversioning
Euro-Par'13 Proceedings of the 19th international conference on Parallel Processing
Examining the expert gap in parallel programming
Euro-Par'13 Proceedings of the 19th international conference on Parallel Processing
An implementation of the codelet model
Euro-Par'13 Proceedings of the 19th international conference on Parallel Processing
Deterministic galois: on-demand, portable and parameterless
Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
Æminium: A Permission-Based Concurrent-by-Default Programming Language Approach
ACM Transactions on Programming Languages and Systems (TOPLAS)
Well-structured futures and cache locality
Proceedings of the 19th ACM SIGPLAN symposium on Principles and practice of parallel programming
Apple-CORE: Harnessing general-purpose many-cores with hardware concurrency management
Microprocessors & Microsystems
High quality real-time Image-to-Mesh conversion for finite element simulations
Journal of Parallel and Distributed Computing
Towards software performance engineering for multicore and manycore systems
ACM SIGMETRICS Performance Evaluation Review
Proceedings of the 5th ACM/SPEC international conference on Performance engineering
Colored Petri Net model with automatic parallelization on real-time multicore architectures
Journal of Systems Architecture: the EUROMICRO Journal
Cilk (pronounced “silk”) is a C-based runtime system for multi-threaded parallel programming. In this paper, we document the efficiency of the Cilk work-stealing scheduler, both empirically and analytically. We show that on real and synthetic applications, the “work” and “critical path” of a Cilk computation can be used to accurately model performance. Consequently, a Cilk programmer can focus on reducing the work and critical path of his computation, insulated from load balancing and other runtime scheduling issues. We also prove that for the class of “fully strict” (well-structured) programs, the Cilk scheduler achieves space, time and communication bounds all within a constant factor of optimal.
The Cilk runtime system currently runs on the Connection Machine CM5 MPP, the Intel Paragon MPP, the Silicon Graphics Power Challenge SMP, and the MIT Phish network of workstations. Applications written in Cilk include protein folding, graphic rendering, backtrack search, and the *Socrates chess program, which won third prize in the 1994 ACM International Computer Chess Championship.
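The statement that work and critical path can model performance may be easier to follow with a small numeric sketch. The Python below assumes the familiar estimate for greedy and work-stealing schedulers, in which running time on P processors is roughly work/P plus a constant times the critical path. The function names, the constant `c_inf`, and the sample numbers are hypothetical illustrations, not values taken from the abstract.

```python
def predicted_time(work, critical_path, processors, c_inf=1.0):
    """Estimate running time on `processors` processors.

    Assumed model (not stated in the abstract): work / P + c_inf * critical_path,
    where `work` is the total execution time on one processor and
    `critical_path` is the time on infinitely many processors.
    """
    if processors < 1:
        raise ValueError("processors must be at least 1")
    return work / processors + c_inf * critical_path


def average_parallelism(work, critical_path):
    """Ratio of work to critical path; speedup flattens once P approaches it."""
    return work / critical_path


if __name__ == "__main__":
    # Hypothetical computation: 1000 time units of work, critical path of 10.
    work, span = 1000.0, 10.0
    print("average parallelism:", average_parallelism(work, span))
    for p in (1, 4, 16, 64, 256):
        t = predicted_time(work, span, p)
        print(f"P={p:4d}  predicted time={t:8.2f}  speedup={work / t:6.2f}")
```

Under this assumed model, reducing either the work or the critical path lowers the estimate directly, which is why a programmer can reason about those two quantities without thinking about load balancing.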