MULTILISP: a language for concurrent symbolic computation
ACM Transactions on Programming Languages and Systems (TOPLAS)
Automatic translation of FORTRAN programs to vector form
ACM Transactions on Programming Languages and Systems (TOPLAS)
Fine-grained mobility in the Emerald system
ACM Transactions on Computer Systems (TOCS)
An overview for the PTRAN analysis system for multiprocessing
Journal of Parallel and Distributed Computing - Special Issue on Languages, Compilers and environments for Parallel Programming
Detecting conflicts between structure accesses
PLDI '88 Proceedings of the ACM SIGPLAN 1988 conference on Programming Language design and Implementation
Restructuring Lisp programs for concurrent execution
PPEALS '88 Proceedings of the ACM/SIGPLAN conference on Parallel programming: experience with applications, languages and systems
Adaptive bitonic sorting: an optimal parallel algorithm for shared-memory machines
SIAM Journal on Computing
Process decomposition through locality of reference
PLDI '89 Proceedings of the ACM SIGPLAN 1989 Conference on Programming language design and implementation
Experience with CST: programming and implementation
PLDI '89 Proceedings of the ACM SIGPLAN 1989 Conference on Programming language design and implementation
The Amber system: parallel programming on a network of multiprocessors
SOSP '89 Proceedings of the twelfth ACM symposium on Operating systems principles
Introduction to algorithms
Parallelizing programs with recursive data structures
Parallelizing programs with recursive data structures
Compiling Lisp programs for parallel execution
Lisp and Symbolic Computation
Virtual memory primitives for user programs
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
Compiler optimizations for Fortran D on MIMD distributed-memory machines
Proceedings of the 1991 ACM/IEEE conference on Supercomputing
Compiling programs for nonshared memory machines
Compiling programs for nonshared memory machines
Orca: A Language for Parallel Programming of Distributed Systems
IEEE Transactions on Software Engineering
Active messages: a mechanism for integrated communication and computation
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
PLDI '92 Proceedings of the ACM SIGPLAN 1992 conference on Programming language design and implementation
Global optimizations for parallelism and locality on scalable parallel machines
PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
Communication optimization and code generation for distributed memory machines
PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
Computation migration: enhancing locality for distributed-memory parallel systems
PPOPP '93 Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming
Decentralized optimal power pricing: the development of a parallel program
Proceedings of the 1993 ACM/IEEE conference on Supercomputing
Parallel programming in Split-C
Proceedings of the 1993 ACM/IEEE conference on Supercomputing
Proceedings of the 1993 ACM/IEEE conference on Supercomputing
Fine-grain access control for distributed shared memory
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
A model and stack implementation of multiple environments
Communications of the ACM
Distributed data structures in Linda
POPL '86 Proceedings of the 13th ACM SIGACT-SIGPLAN symposium on Principles of programming languages
Para-functional programming: a paradigm for programming multiprocessor systems
POPL '86 Proceedings of the 13th ACM SIGACT-SIGPLAN symposium on Principles of programming languages
Optimizing Supercompilers for Supercomputers
Optimizing Supercompilers for Supercomputers
A Retargetable C Compiler: Design and Implementation
A Retargetable C Compiler: Design and Implementation
Parallelizing Programs with Recursive Data Structures
IEEE Transactions on Parallel and Distributed Systems
Lazy Task Creation: A Technique for Increasing the Granularity of Parallel Programs
IEEE Transactions on Parallel and Distributed Systems
Supporting SPMD Execution for Dynamic Data Structures
Proceedings of the 5th International Workshop on Languages and Compilers for Parallel Computing
Proceedings of the 6th International Workshop on Languages and Compilers for Parallel Computing
Cid: A Parallel, "Shared-Memory" C for Distributed-Memory Machines
LCPC '94 Proceedings of the 7th International Workshop on Languages and Compilers for Parallel Computing
The concert system--compiler and runtime support for efficient, fine-grained concurrent object-oriented programs
Restructuring symbolic programs for concurrent execution on multiprocessors
Restructuring symbolic programs for concurrent execution on multiprocessors
Software caching and computation migration in Olden
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
A hybrid execution model for fine-grained languages on distributed memory multicomputers
Supercomputing '95 Proceedings of the 1995 ACM/IEEE conference on Supercomputing
Compiler-based prefetching for recursive data structures
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Integrating task and data parallelism using shared objects
ICS '96 Proceedings of the 10th international conference on Supercomputing
Is it a tree, a DAG, or a cyclic graph? A shape analysis for heap-directed pointers in C
POPL '96 Proceedings of the 23rd ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Proceedings of the ACM SIGPLAN 1997 conference on Programming language design and implementation
Space-efficient implementation of nested parallelism
PPOPP '97 Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming
Predicting data cache misses in non-numeric applications through correlation profiling
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Putting pointer analysis to work
POPL '98 Proceedings of the 25th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Communication optimizations for parallel C programs
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
Dependence based prefetching for linked data structures
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
Automatic Compiler-Inserted Prefetching for Pointer-Based Applications
IEEE Transactions on Computers - Special issue on cache memory and related problems
Locality Analysis for Parallel C Programs
IEEE Transactions on Parallel and Distributed Systems
Effective jump-pointer prefetching for linked data structures
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
StackThreads/MP: integrating futures into calling standards
Proceedings of the seventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Cache-conscious structure layout
Proceedings of the ACM SIGPLAN 1999 conference on Programming language design and implementation
Space-efficient scheduling of nested parallelism
ACM Transactions on Programming Languages and Systems (TOPLAS)
Ace: a language for parallel programming with customizable protocols
ACM Transactions on Computer Systems (TOCS)
Type systems for distributed data structures
Proceedings of the 27th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Push vs. pull: data movement for linked data structures
Proceedings of the 14th international conference on Supercomputing
Automatic compiler techniques for thread coarsening for multithreaded architectures
Proceedings of the 14th international conference on Supercomputing
IEEE Transactions on Computers
Proceedings of the 27th annual international symposium on Computer architecture
Automated data-member layout of heap objects to improve memory-hierarchy performance
ACM Transactions on Programming Languages and Systems (TOPLAS)
PipeRench implementation of the instruction path coprocessor
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Dynamic computation migration in DSM systems
Supercomputing '96 Proceedings of the 1996 ACM/IEEE conference on Supercomputing
Simulation of the 3 dimensional cascade flow with numerical wind tunnel (NWT)
Supercomputing '96 Proceedings of the 1996 ACM/IEEE conference on Supercomputing
ICS '01 Proceedings of the 15th international conference on Supercomputing
Dynamically allocating processor resources between nearby and distant ILP
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
Source-level global optimizations for fine-grain distributed shared memory systems
PPoPP '01 Proceedings of the eighth ACM SIGPLAN symposium on Principles and practices of parallel programming
SPMD execution in the presence of dynamic data structures
Compiler optimizations for scalable parallel systems
Supporting dynamic data structures with Olden
Compiler optimizations for scalable parallel systems
Run-time power estimation in high performance microprocessors
ISLPED '01 Proceedings of the 2001 international symposium on Low power electronics and design
Benchmark health considered harmful
ACM SIGARCH Computer Architecture News
OOPSLA '01 Proceedings of the 16th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
A hierarchical load-balancing framework for dynamic multithreaded computations
SC '98 Proceedings of the 1998 ACM/IEEE conference on Supercomputing
Reducing the complexity of the register file in dynamic superscalar processors
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Using the Cowichan Problems to Assess the Usability of Orca
IEEE Parallel & Distributed Technology: Systems & Technology
Data collection and restoration for heterogenenous process migration
Software—Practice & Experience
Optimizing COOP Languages: Study of a Protein Dynamics Program
IPPS '96 Proceedings of the 10th International Parallel Processing Symposium
Maintaining Spatial Data Sets in Distributed-Memory Machines
IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
A Programmable Memory Hierarchy for Prefetching Linked Data Structures
ISHPC '02 Proceedings of the 4th International Symposium on High Performance Computing
Content-Based Prefetching: Initial Results
IMS '00 Revised Papers from the Second International Workshop on Intelligent Memory Systems
Executing multiple pipelined data analysis operations in the grid
Proceedings of the 2002 ACM/IEEE conference on Supercomputing
Continuous program optimization: A case study
ACM Transactions on Programming Languages and Systems (TOPLAS)
Programming the FlexRAM parallel intelligent memory system
Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of parallel programming
A compiler approach for reducing data cache energy
ICS '03 Proceedings of the 17th annual international conference on Supercomputing
On the Interaction of Mobile Processes and Objects
HCW '98 Proceedings of the Seventh Heterogeneous Computing Workshop
Memory Space Representation for Heterogeneous Network Process Migration
IPPS '98 Proceedings of the 12th. International Parallel Processing Symposium on International Parallel Processing Symposium
Mark-copy: fast copying GC with less space overhead
OOPSLA '03 Proceedings of the 18th annual ACM SIGPLAN conference on Object-oriented programing, systems, languages, and applications
A Framework to Capture Dynamic Data Structures in Pointer-Based Codes
IEEE Transactions on Parallel and Distributed Systems
Parallel functional programming on recursively defined data via data-parallel recursion
Journal of Functional Programming
Algorithm + strategy = parallelism
Journal of Functional Programming
ACM Transactions on Computer Systems (TOCS)
Proceedings of the 31st annual international symposium on Computer architecture
A study of source-level compiler algorithms for automatic construction of pre-execution code
ACM Transactions on Computer Systems (TOCS)
Tolerating memory latency through push prefetching for pointer-intensive applications
ACM Transactions on Architecture and Code Optimization (TACO)
Automatic pool allocation: improving performance by controlling data structure layout in the heap
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
PARE: a power-aware hardware data prefetching engine
ISLPED '05 Proceedings of the 2005 international symposium on Low power electronics and design
Reducing data cache leakage energy using a compiler-based approach
ACM Transactions on Embedded Computing Systems (TECS)
Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
MEDEA '04 Proceedings of the 2004 workshop on MEmory performance: DEaling with Applications , systems and architecture
Transparent pointer compression for linked data structures
Proceedings of the 2005 workshop on Memory system performance
On the parallelization of irregular and dynamic programs
Parallel Computing
On the Prediction of Java Object Lifetimes
IEEE Transactions on Computers
Data prefetching in a cache hierarchy with high bandwidth and capacity
MEDEA '06 Proceedings of the 2006 workshop on MEmory performance: DEaling with Applications, systems and architectures
The DaCapo benchmarks: java benchmarking development and analysis
Proceedings of the 21st annual ACM SIGPLAN conference on Object-oriented programming systems, languages, and applications
ACM Transactions on Programming Languages and Systems (TOPLAS)
IEEE Transactions on Computers
Interprocedural definition-use chains of dynamic pointer-linked data structures
Scientific Programming
Manticore: a heterogeneous parallel language
Proceedings of the 2007 workshop on Declarative aspects of multicore programming
Data prefetching in a cache hierarchy with high bandwidth and capacity
ACM SIGARCH Computer Architecture News
Hardbound: architectural support for spatial safety of the C programming language
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
Samurai: protecting critical data in unsafe languages
Proceedings of the 3rd ACM SIGOPS/EuroSys European Conference on Computer Systems 2008
Memory performance attacks: denial of memory service in multi-core systems
SS'07 Proceedings of 16th USENIX Security Symposium on USENIX Security Symposium
Journal of Embedded Computing - Embeded Processors and Systems: Architectural Issues and Solutions for Emerging Applications
MPADS: memory-pooling-assisted data splitting
Proceedings of the 7th international symposium on Memory management
A scheduling framework for general-purpose parallel languages
Proceedings of the 13th ACM SIGPLAN international conference on Functional programming
Global trees: a framework for linked data structures on distributed memory parallel systems
Proceedings of the 2008 ACM/IEEE conference on Supercomputing
Abstracting access patterns of dynamic memory using regular expressions
ACM Transactions on Architecture and Code Optimization (TACO)
SoftBound: highly compatible and complete spatial memory safety for c
Proceedings of the 2009 ACM SIGPLAN conference on Programming language design and implementation
ECMon: exposing cache events for monitoring
Proceedings of the 36th annual international symposium on Computer architecture
Memory management thread for heap allocation intensive sequential applications
Proceedings of the 10th workshop on MEmory performance: DEaling with Applications, systems and architecture
Layout transformations for heap objects using static access patterns
CC'07 Proceedings of the 16th international conference on Compiler construction
An utilization driven framework for energy efficient caches
HiPC'08 Proceedings of the 15th international conference on High performance computing
TPCC-UVa: an open-source TPC-C implementation for parallel and distributed systems
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Energy-efficient hardware data prefetching
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Multicore performance optimization using partner cores
HotPar'11 Proceedings of the 3rd USENIX conference on Hot topic in parallelism
Automatically generating symbolic prefetches for distributed transactional memories
Proceedings of the ACM/IFIP/USENIX 11th International Conference on Middleware
Proceedings of the International Conference on Computer-Aided Design
Runtime biased pointer reuse analysis and its application to energy efficiency
PACS'03 Proceedings of the Third international conference on Power - Aware Computer Systems
Energy-aware data prefetching for general-purpose programs
PACS'04 Proceedings of the 4th international conference on Power-Aware Computer Systems
MemSafe: ensuring the spatial and temporal memory safety of C at runtime
Software—Practice & Experience
Data Linkage Algebra, Data Linkage Dynamics, and Priority Rewriting
Fundamenta Informaticae
Hi-index | 0.01 |
Compiling for distributed-memory machines has been a very active research area in recent years. Much of this work has concentrated on programs that use arrays as their primary data structures. To date, little work has been done to address the problem of supporting programs that use pointer-based dynamic data structures. The techniques developed for supporting SPMD execution of array-based programs rely on the fact that arrays are statically defined and directly addressable. Recursive data structures do not have these properties, so new techniques must be developed. In this article, we describe an execution model for supporting programs that use pointer-based dynamic data structures. This model uses a simple mechanism for migrating a thread of control based on the layout of heap-allocated data and introduces parallelism using a technique based on futures and lazy task creation. We intend to exploit this execution model using compiler analyses and automatic parallelization techniques. We have implemented a prototype system, which we call Olden, that runs on the Intel iPSC/860 and the Thinking Machines CM-5. We discuss our implementation and report on experiments with five benchmarks.