A safe approximate algorithm for interprocedural aliasing
PLDI '92 Proceedings of the ACM SIGPLAN 1992 conference on Programming language design and implementation
Efficient flow-sensitive interprocedural computation of pointer-induced aliases and side effects
POPL '93 Proceedings of the 20th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
TAM—a compiler controlled threaded abstract machine
Journal of Parallel and Distributed Computing - Special issue on dataflow and multithreaded architectures
Context-sensitive interprocedural points-to analysis in the presence of function pointers
PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation
Supporting dynamic data structures on distributed-memory machines
ACM Transactions on Programming Languages and Systems (TOPLAS)
Efficient context-sensitive pointer analysis for C programs
PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation
Cilk: an efficient multithreaded runtime system
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
A design study of the EARTH multiprocessor
PACT '95 Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques
Is it a tree, a DAG, or a cyclic graph? A shape analysis for heap-directed pointers in C
POPL '96 Proceedings of the 23rd ACM SIGPLAN-SIGACT symposium on Principles of programming languages
ICS '90 Proceedings of the 4th international conference on Supercomputing
PPOPP '97 Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming
Tuning compiler optimizations for simultaneous multithreading
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Compiling C for the EARTH multithreaded architecture
International Journal of Parallel Programming - Special issue: selected papers from PACT'96, fourth international conference on parallel architectures and compilation techniques—part 1
Communication optimizations for parallel C programs
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
Pointer analysis for multithreaded programs
Proceedings of the ACM SIGPLAN 1999 conference on Programming language design and implementation
Pointer analysis for programs with structures and casting
Proceedings of the ACM SIGPLAN 1999 conference on Programming language design and implementation
APRIL: a processor architecture for multiprocessing
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
Automatically partitioning threads for multithreaded architectures
Journal of Parallel and Distributed Computing - Special issue on compilation and architectural support for parallel applications
Latency Hiding in Message-Passing Architectures
Proceedings of the 8th International Symposium on Parallel Processing
An Evaluation of Optimized Threaded Code Generation
PACT '94 Proceedings of the IFIP WG10.3 Working Conference on Parallel Architectures and Compilation Techniques
EM-C: Programming with Explicit Parallelism and Locality for EM-4 Multiprocessor
PACT '94 Proceedings of the IFIP WG10.3 Working Conference on Parallel Architectures and Compilation Techniques
Detecting Parallelism in C Programs with Recursive Darta Structures
CC '98 Proceedings of the 7th International Conference on Compiler Construction
Performance Study of a Concurrent Multithreaded Processor
HPCA '98 Proceedings of the 4th International Symposium on High-Performance Computer Architecture
Locality Analysis For Parallel C Programs
PACT '97 Proceedings of the 1997 International Conference on Parallel Architectures and Compilation Techniques
Heap Analysis And Optimizations For Threaded Programs
PACT '97 Proceedings of the 1997 International Conference on Parallel Architectures and Compilation Techniques
Compiling C for the EARTH Multithreaded Architecture
PACT '96 Proceedings of the 1996 Conference on Parallel Architectures and Compilation Techniques
Compiling for multithreaded architectures
Compiling for multithreaded architectures
Compiler and Runtime Support for Irregular Reductions on a Multithreaded Architecture
IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
Supporting microthread scheduling and synchronisation in CMPs
International Journal of Parallel Programming
Hi-index | 0.00 |
Multithreaded architectures are emerging as an important class of parallel machines. By allowing fast context switching between threads on the same processor, these systems hide communication and synchronization latencies and allow scalable parallelism for dynamic and irregular applications. Thread partitioning is the most important task in compiling high-level languages for multithreaded architectures. Non-preemptive multithreaded architectures, which can be built from off-the-shelf components, require that if a thread issues a potentially remote memory request, then any statement that is dependent upon this request must be in a separate thread.When performing thread partitioning on codes that use pointer-based recursive data structures, it is often difficult to extract accurate dependence information. As a result, threads of unnecessarily small granularity get generated, which, because of thread switching costs, leads to increased execution time. In this paper, we present three techniques that lead to improved extraction and representation of dependence information in the presence of structured control flow, references through fields of structures, and pointer-based data structures. The benefit of these techniques is the generation of coarser-grained threads and, therefore, decreased execution time. Our experiments were performed using the EARTH-C compiler and the EARTH multithreaded architecture model emulated on both a cluster of Pentium PCs and a distributed memory multiprocessor. On our set of 6 pointer-based programs, these techniques reduced the static number of threads by 38%. Reductions in execution times ranged from 16% to 45% on the four programs we measured runtime performance.