A Mapping Strategy for Parallel Processing
IEEE Transactions on Computers
A VLIW architecture for a trace Scheduling Compiler
IEEE Transactions on Computers - Special issue on architectural support for programming languages and operating systems
Efficient algorithm for graph-partitioning problem using a problem transformation method
Computer-Aided Design
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
A linear-time heuristic for improving network partitions
DAC '82 Proceedings of the 19th Design Automation Conference
Bulldog: a compiler for vliw architectures (parallel computing, reduced-instruction-set, trace scheduling, scientific)
Register file port requirements of transport triggered architectures
MICRO 27 Proceedings of the 27th annual international symposium on Microarchitecture
The performance impact of incomplete bypassing in processor pipelines
Proceedings of the 28th annual international symposium on Microarchitecture
Exploiting short-lived variables in superscalar processors
Proceedings of the 28th annual international symposium on Microarchitecture
Partitioned register file for TTAs
Proceedings of the 28th annual international symposium on Microarchitecture
Hypernode reduction modulo scheduling
Proceedings of the 28th annual international symposium on Microarchitecture
Custom-fit processors: letting applications define architectures
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Exploiting idle floating-point resources for integer execution
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
Quantitative Evaluation of Register Pressure on Software Pipelined Loops
International Journal of Parallel Programming
Effective cluster assignment for modulo scheduling
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Widening resources: a cost-effective technique for aggressive ILP architectures
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Space-time scheduling of instruction-level parallelism on a raw machine
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
Software-Directed Register Deallocation for Simultaneous Multithreaded Processors
IEEE Transactions on Parallel and Distributed Systems
ACM SIGPLAN Notices
Modulo scheduling for a fully-distributed clustered VLIW architecture
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Two-level hierarchical register file organization for VLIW processors
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
High-quality operation binding for clustered VLIW datapaths
Proceedings of the 38th annual Design Automation Conference
Loop Transformations for Architectures with Partitioned Register Banks
OM '01 Proceedings of the 2001 ACM SIGPLAN workshop on Optimization of middleware and distributed systems
Instruction scheduling for clustered VLIW architectures
ISSS '00 Proceedings of the 13th international symposium on System synthesis
CASES '01 Proceedings of the 2001 international conference on Compilers, architecture, and synthesis for embedded systems
IEEE Transactions on Computers
Affinity-based cluster assignment for unrolled loops
ICS '02 Proceedings of the 16th international conference on Supercomputing
Graph-partitioning based instruction scheduling for clustered processors
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Modulo scheduling with integrated register spilling for clustered VLIW architectures
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Reducing the complexity of the register file in dynamic superscalar processors
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Cluster assignment for high-performance embedded VLIW processors
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Optimizing Loop Performance for Clustered VLIW Architectures
Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques
Exploiting Pseudo-Schedules to Guide Data Dependence Graph Partitioning
Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques
A Register File Architecture and Compilation Scheme for Clustered ILP Processors
Euro-Par '02 Proceedings of the 8th International Euro-Par Conference on Parallel Processing
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Reducing register ports for higher speed and lower energy
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Region-based hierarchical operation partitioning for multicluster processors
PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
Non-Consistent Dual Register Files to Reduce Register Pressure
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
The Effectiveness of Loop Unrolling for Modulo Scheduling in Clustered VLIW Architectures
ICPP '00 Proceedings of the Proceedings of the 2000 International Conference on Parallel Processing
Dynamically managing the communication-parallelism trade-off in future clustered processors
Proceedings of the 30th annual international symposium on Computer architecture
Partitioned Schedules for Clustered VLIW Architectures
IPPS '98 Proceedings of the 12th. International Parallel Processing Symposium on International Parallel Processing Symposium
A scalable wide-issue clustered VLIW with a reconfigurable interconnect
Proceedings of the 2003 international conference on Compilers, architecture and synthesis for embedded systems
Checkpoint Processing and Recovery: Towards Scalable Large Instruction Window Processors
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Reducing register pressure through LAER algorithm
ACSC '04 Proceedings of the 27th Australasian conference on Computer science - Volume 26
Instruction buffering exploration for low energy VLIWs with instruction clusters
Proceedings of the 2004 Asia and South Pacific Design Automation Conference
An analysis of a resource efficient checkpoint architecture
ACM Transactions on Architecture and Code Optimization (TACO)
Software and hardware techniques to optimize register file utilization in VLIW architectures
International Journal of Parallel Programming
Compiler-directed Data Partitioning for Multicluster Processors
Proceedings of the International Symposium on Code Generation and Optimization
Virtual Cluster Scheduling Through the Scheduling Graph
Proceedings of the International Symposium on Code Generation and Optimization
Heterogeneous Clustered VLIW Microarchitectures
Proceedings of the International Symposium on Code Generation and Optimization
EURASIP Journal on Applied Signal Processing
Building a large instruction window through ROB compression
MEDEA '07 Proceedings of the 2007 workshop on MEmory performance: DEaling with Applications, systems and architecture
Communication optimizations for global multi-threaded instruction scheduling
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
Journal of Signal Processing Systems
Asymmetrically banked value-aware register files for low-energy and high-performance
Microprocessors & Microsystems
Reducing complexity of multiobjective design space exploration in VLIW-based embedded systems
ACM Transactions on Architecture and Code Optimization (TACO)
A Multi-Shared Register File Structure for VLIW Processors
Journal of Signal Processing Systems
Data pipeline optimization for shared memory multiple-SIMD architecture
LCPC'06 Proceedings of the 19th international conference on Languages and compilers for parallel computing
Mt-ADRES: multithreading on coarse-grained reconfigurable architecture
ARC'07 Proceedings of the 3rd international conference on Reconfigurable computing: architectures, tools and applications
Exploiting narrow-width values for thermal-aware register file designs
Proceedings of the Conference on Design, Automation and Test in Europe
CROB: implementing a large instruction window through compression
Transactions on high-performance embedded architectures and compilers III
HiPC'05 Proceedings of the 12th international conference on High Performance Computing
SIMD defragmenter: efficient ILP realization on data-parallel architectures
ASPLOS XVII Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems
Compiler-assisted energy optimization for clustered VLIW processors
Journal of Parallel and Distributed Computing
Compiler supports for VLIW DSP processors with SIMD intrinsics
Concurrency and Computation: Practice & Experience
LUCAS: latency-adaptive unified cluster assignment and instruction scheduling
Proceedings of the 14th ACM SIGPLAN/SIGBED conference on Languages, compilers and tools for embedded systems
Shared-port register file architecture for low-energy VLIW processors
ACM Transactions on Architecture and Code Optimization (TACO)
CAeSaR: unified cluster-assignment scheduling and communication reuse for clustered VLIW processors
Proceedings of the 2013 International Conference on Compilers, Architectures and Synthesis for Embedded Systems
Hi-index | 0.00 |