Hard enumeration problems in geometry and combinatorics
SIAM Journal on Algebraic and Discrete Methods
The linearity of first-fit coloring of interval graphs
SIAM Journal on Discrete Mathematics
A polynomial time approximation algorithm for Dynamic Storage Allocation
Discrete Mathematics
Improvements to graph coloring register allocation
ACM Transactions on Programming Languages and Systems (TOPLAS)
ACM Transactions on Programming Languages and Systems (TOPLAS)
Generating the acyclic orientations of a graph
Journal of Algorithms
Automatic storage management for parallel programs
Parallel Computing - Special issues on languages and compilers for parallel computers
A bandwidth-efficient architecture for media processing
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Algorithms for compile-time memory optimization
Proceedings of the tenth annual ACM-SIAM symposium on Discrete algorithms
Generating all the acyclic orientations of an undirected graph
Information Processing Letters
Topological sorting of large networks
Communications of the ACM
Computers and Intractability: A Guide to the Theory of NP-Completeness
Computers and Intractability: A Guide to the Theory of NP-Completeness
Approximation Algorithms for Dynamic Storage Allocations
ESA '96 Proceedings of the Fourth Annual European Symposium on Algorithms
Scratchpad memory: design alternative for cache on-chip memory in embedded systems
Proceedings of the tenth international symposium on Hardware/software codesign
OPT versus LOAD in dynamic storage allocation
Proceedings of the thirty-fifth annual ACM symposium on Theory of computing
Automatic storage optimization
SIGPLAN '79 Proceedings of the 1979 SIGPLAN symposium on Compiler construction
Register allocation & spilling via graph coloring
SIGPLAN '82 Proceedings of the 1982 SIGPLAN symposium on Compiler construction
Media Processing Applications on the Imagine Stream Processor
ICCD '02 Proceedings of the 2002 IEEE International Conference on Computer Design: VLSI in Computers and Processors (ICCD'02)
Algorithmic Graph Theory and Perfect Graphs (Annals of Discrete Mathematics, Vol 57)
Algorithmic Graph Theory and Perfect Graphs (Annals of Discrete Mathematics, Vol 57)
ACM Transactions on Design Automation of Electronic Systems (TODAES)
A generalized algorithm for graph-coloring register allocation
Proceedings of the ACM SIGPLAN 2004 conference on Programming language design and implementation
Brook for GPUs: stream computing on graphics hardware
ACM SIGGRAPH 2004 Papers
Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques
Merrimac: Supercomputing with Streams
Proceedings of the 2003 ACM/IEEE conference on Supercomputing
Memory Coloring: A Compiler Approach for Scratchpad Memory Management
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques
Stream Programming on General-Purpose Processors
Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
The potential of the cell processor for scientific computing
Proceedings of the 3rd conference on Computing frontiers
Compiling for stream processing
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
A 64-bit stream processor architecture for scientific applications
Proceedings of the 34th annual international symposium on Computer architecture
Comparing memory systems for chip multiprocessors
Proceedings of the 34th annual international symposium on Computer architecture
Scratchpad allocation for data aggregates in superperfect graphs
Proceedings of the 2007 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
Streamware: programming general-purpose multicore processors using streams
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
Proceedings of the 2007 ACM/IEEE conference on Supercomputing
Orchestrating the execution of stream programs on multicore platforms
Proceedings of the 2008 ACM SIGPLAN conference on Programming language design and implementation
Optimizing scientific application loops on stream processors
Proceedings of the 2008 ACM SIGPLAN-SIGBED conference on Languages, compilers, and tools for embedded systems
Exploiting loop-dependent stream reuse for stream processors
Proceedings of the 17th international conference on Parallel architectures and compilation techniques
Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming
Compiler-directed scratchpad memory management via graph coloring
ACM Transactions on Architecture and Code Optimization (TACO)
Register allocation on stream processor with local register file
ACSAC'06 Proceedings of the 11th Asia-Pacific conference on Advances in Computer Systems Architecture
Hi-index | 0.00 |
The stream processors represent a promising alternative to traditional cache-based general-purpose processors in achieving high performance in stream applications (media and some scientific applications). In a stream programming model for stream processors, an application is decomposed into a sequence of kernels operating on streams of data. During the execution of a kernel on a stream processor, all streams accessed must be communicated through a nonbypassing software-managed on-chip memory, the SRF (Stream Register File). Optimizing utilization of the scarce on-chip memory is crucial for good performance. The key insight is that the interference graphs (IGs) formed by the streams in stream applications tend to be comparability graphs or decomposable into a set of comparability graphs. We present a compiler algorithm for finding optimal or near-optimal colorings, that is, SRF allocations in stream IGs, by computing a maximum spanning forest of the sub-IG formed by long live ranges, if necessary. Our experimental results validate the optimality and near-optimality of our algorithm by comparing it with an ILP solver, and show that our algorithm yields improved SRF utilization over the First-Fit bin-packing algorithm, the best in the literature.