Cilk: an efficient multithreaded runtime system

Authors:
Robert D. Blumofe; Christopher F. Joerg; Bradley C. Kuszmaul; Charles E. Leiserson; Keith H. Randall; Yuli Zhou
Venue:
Journal of Parallel and Distributed Computing - Special issue on multithreading for multiprocessors
Year:
1996

Citing 0
Cited 90

Dag consistent parallel simulation: a predictable and robust conservative algorithm

Proceedings of the eleventh workshop on Parallel and distributed simulation
Thread scheduling for multiprogrammed multiprocessors

Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures
Detecting data races in Cilk programs that use locks

Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures
The implementation of the Cilk-5 multithreaded language

PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
Nops: a conservative parallel simulation engine for TeD

PADS '98 Proceedings of the twelfth workshop on Parallel and distributed simulation
Scheduling threads for low space requirement and good locality

Proceedings of the eleventh annual ACM symposium on Parallel algorithms and architectures
Scheduling multithreaded computations by work stealing

Journal of the ACM (JACM)
The data locality of work stealing

Proceedings of the twelfth annual ACM symposium on Parallel algorithms and architectures
Symbolic bounds analysis of pointers, array indices, and accessed memory regions

PLDI '00 Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation
Pthreads for dynamic and irregular parallelism

SC '98 Proceedings of the 1998 ACM/IEEE conference on Supercomputing
SDAARC: An Extended Cache-Only Memory Architecture

IEEE Micro
Suboptimal Minimum Cluster Volume Cover-Based Method for Measuring Fractal Dimension

IEEE Transactions on Pattern Analysis and Machine Intelligence
Predicting Scalability of Parallel Garbage Collectors on Shared Memory Multiprocessors

IPDPS '01 Proceedings of the 15th International Parallel & Distributed Processing Symposium
Using Cohort-Scheduling to Enhance Server Performance

ATEC '02 Proceedings of the General Track of the annual conference on USENIX Annual Technical Conference
Building a Conservative Parallel Simulation with Existing Component Libraries

LCR '98 Selected Papers from the 4th International Workshop on Languages, Compilers, and Run-Time Systems for Scalable Computers
Instrumentation of Synchronous Reactive Systems for Performance Analysis: A Case Study

TOOLS '98 Proceedings of the 10th International Conference on Computer Performance Evaluation: Modelling Techniques and Tools
High-performance thread migration on clusters of SMPs

Cluster computing
Efficient Fine-Grain Thread Migration with Active Threads

IPPS '98 Proceedings of the 12th International Parallel Processing Symposium
Capriccio: scalable threads for internet services

SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
Fault-Tolerance, Malleability and Migration for Divide-and-Conquer Applications on the Grid

IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Papers - Volume 01
Symbolic bounds analysis of pointers, array indices, and accessed memory regions

ACM Transactions on Programming Languages and Systems (TOPLAS)
Jalapeno: decentralized grid computing using peer-to-peer technology

Proceedings of the 2nd conference on Computing frontiers
Is MPI suitable for a generative design-pattern system?

Parallel Computing - Algorithmic skeletons
A history of Haskell: being lazy with class

Proceedings of the third ACM SIGPLAN conference on History of programming languages
KAAPI: A thread scheduling runtime system for data flow computations on cluster of multi-processors

Proceedings of the 2007 international workshop on Parallel symbolic computation
The co-replication methodology and its application to structured parallel programs

Proceedings of the 2007 symposium on Component and framework technology in high-performance and scientific computing
Quasi-static scheduling for safe futures

Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming
Nested parallelism in transactional memory

Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming
Larrabee: a many-core x86 architecture for visual computing

ACM SIGGRAPH 2008 papers
Adaptive work-stealing with parallelism feedback

ACM Transactions on Computer Systems (TOCS)
Fine Grain Distributed Implementation of a Dataflow Language with Provable Performances

ICCS '07 Proceedings of the 7th international conference on Computational Science, Part II
Pillar: A Parallel Implementation Language

Languages and Compilers for Parallel Computing
Speculative N-Way barriers

Proceedings of the 4th workshop on Declarative aspects of multicore programming
Scala Actors: Unifying thread-based and event-based programming

Theoretical Computer Science
Parallel and distributed local search in COMET

Computers and Operations Research
Steal-on-Abort: Improving Transactional Memory Performance through Dynamic Transaction Reordering

HiPEAC '09 Proceedings of the 4th International Conference on High Performance Embedded Architectures and Compilers
Thread criticality predictors for dynamic performance, power, and resource management in chip multiprocessors

Proceedings of the 36th annual international symposium on Computer architecture
Developing parallel programs: A design-oriented perspective

IWMSE '09 Proceedings of the 2009 ICSE Workshop on Multicore Software Engineering
Searching for Concurrent Design Patterns in Video Games

Euro-Par '09 Proceedings of the 15th International Euro-Par Conference on Parallel Processing
Grace: safe multithreaded programming for C/C++

Proceedings of the 24th ACM SIGPLAN conference on Object oriented programming systems languages and applications
Exploiting fine-grain thread parallelism on multicore architectures

Scientific Programming - Software Development for Multi-core Computing Systems
An open framework for rapid prototyping of signal processing applications

EURASIP Journal on Embedded Systems - Special issue on design and architectures for signal and image processing
An approach for non-intrusively adding malleable fork/join parallelism into ordinary JavaBean compliant applications

Computer Languages, Systems and Structures
Anahy: a programming environment for cluster computing

VECPAR'06 Proceedings of the 7th international conference on High performance computing for computational science
Scalable multithreading in a low latency Myrinet cluster

VECPAR'02 Proceedings of the 5th international conference on High performance computing for computational science
Brief announcement: serial-parallel reciprocity in dynamic multithreaded languages

Proceedings of the twenty-second annual ACM symposium on Parallelism in algorithms and architectures
Low depth cache-oblivious algorithms

Proceedings of the twenty-second annual ACM symposium on Parallelism in algorithms and architectures
A work-efficient parallel breadth-first search algorithm (or how to cope with the nondeterminism of reducers)

Proceedings of the twenty-second annual ACM symposium on Parallelism in algorithms and architectures
Generic design of Chinese remaindering schemes

Proceedings of the 4th International Workshop on Parallel and Symbolic Computation
Optimizing a parallel runtime system for multicore clusters: a case study

Proceedings of the 2010 TeraGrid Conference
Using memory mapping to support cactus stacks in work-stealing runtime systems

Proceedings of the 19th international conference on Parallel architectures and compilation techniques
A common substrate for cluster computing

HotCloud'09 Proceedings of the 2009 conference on Hot topics in cloud computing
Scripting the cloud with skywriting

HotCloud'10 Proceedings of the 2nd USENIX conference on Hot topics in cloud computing
Recursion-driven parallel code generation for multi-core platforms

Proceedings of the Conference on Design, Automation and Test in Europe
Multicore parallelization of min-cost flow for CAD applications

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems - Special section on the ACM IEEE international conference on formal methods and models for codesign (MEMOCODE) 2009
Fast PGAS Implementation of Distributed Graph Algorithms

Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
Estimating and exploiting potential parallelism by source-level dependence profiling

EuroPar'10 Proceedings of the 16th international Euro-Par conference on Parallel processing: Part I
Porting decision tree algorithms to multicore using FastFlow

ECML PKDD'10 Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part I
Load balancing for regular meshes on SMPs with MPI

EuroMPI'10 Proceedings of the 17th European MPI users' group meeting conference on Recent advances in the message passing interface
Challenges and issues of supporting task parallelism in MPI

EuroMPI'10 Proceedings of the 17th European MPI users' group meeting conference on Recent advances in the message passing interface
Stable deterministic multithreading through schedule memoization

OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
GLOpenCL: OpenCL support on hardware- and software-managed cache multicores

Proceedings of the 6th International Conference on High Performance and Embedded Architectures and Compilers
Parallelization libraries: Characterizing and reducing overheads

ACM Transactions on Architecture and Code Optimization (TACO)
Robust adaptation to available parallelism in transactional memory applications

Transactions on high-performance embedded architectures and compilers III
Work-stealing for mixed-mode parallelism by deterministic team-building

Proceedings of the twenty-third annual ACM symposium on Parallelism in algorithms and architectures
Quantifying the potential task-based dataflow parallelism in MPI applications

Euro-Par'11 Proceedings of the 17th international conference on Parallel processing - Volume Part I
HOMPI: a hybrid programming framework for expressing and deploying task-based parallelism

Euro-Par'11 Proceedings of the 17th international conference on Parallel processing - Volume Part II
Globally parallel, locally sequential: a preliminary proposal for Acumen objects

Proceedings of the 9th Workshop on Parallel/High-Performance Object-Oriented Scientific Computing
Design of a Multicore Sparse Cholesky Factorization Using DAGs

SIAM Journal on Scientific Computing
Developing Java grid applications with Ibis

Euro-Par'05 Proceedings of the 11th international Euro-Par conference on Parallel Processing
Internally deterministic parallel algorithms can be fast

Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of Parallel Programming
Transaction reordering to reduce aborts in software transactional memory

Transactions on High-Performance Embedded Architectures and Compilers IV
Impact of over-decomposition on coordinated checkpoint/rollback protocol

Euro-Par'11 Proceedings of the 2011 international conference on Parallel Processing - Volume 2
CATS: cache aware task-stealing based on online profiling in multi-socket multi-core architectures

Proceedings of the 26th ACM international conference on Supercomputing
LIBKOMP, an efficient OpenMP runtime system for both fork-join and data flow paradigms

IWOMP'12 Proceedings of the 8th international conference on OpenMP in a Heterogeneous World
Task-Based execution of nested OpenMP loops

IWOMP'12 Proceedings of the 8th international conference on OpenMP in a Heterogeneous World
Survey of scheduling techniques for addressing shared resources in multicore processors

ACM Computing Surveys (CSUR)
Persistent fault-tolerance for divide-and-conquer applications on the grid

Euro-Par'07 Proceedings of the 13th international Euro-Par conference on Parallel Processing
Speeding up OpenMP tasking

Euro-Par'12 Proceedings of the 18th international conference on Parallel Processing
Scheduling parallel programs by work stealing with private deques

Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming
The tasks with effects model for safe concurrency

Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming
Work-stealing with configurable scheduling strategies

Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming
An efficient programming model for memory-intensive recursive algorithms using parallel disks

Proceedings of the 37th International Symposium on Symbolic and Algebraic Computation
Hardware support for fine-grained event-driven computation in Anton 2

Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
A divide and conquer approach and a work-optimal parallel algorithm for the LIS problem

Information Processing Letters
Enabling fine-grained OpenMP tasking on tightly-coupled shared memory clusters

Proceedings of the Conference on Design, Automation and Test in Europe
Session summary: multiprocessor issues, part 1

ACM SIGAda Ada Letters
Unifying refinement and Hoare-style reasoning in a logic for higher-order concurrency

Proceedings of the 18th ACM SIGPLAN international conference on Functional programming
DWS: Demand-aware Work-Stealing in Multi-programmed Multi-core Architectures

Proceedings of Programming Models and Applications on Multicores and Manycores
Adaptive workload-aware task scheduling for single-ISA asymmetric multicore architectures

ACM Transactions on Architecture and Code Optimization (TACO)
