Two-level hierarchical register file organization for VLIW processors

Authors:
Javier Zalamea;Josep Llosa;Eduard Ayguadé;Mateo Valero
Affiliations:
Departament d'Arquitectura de Computadors (UPC), Universitat Politécnica de Catalunya;Departament d'Arquitectura de Computadors (UPC), Universitat Politécnica de Catalunya;Departament d'Arquitectura de Computadors (UPC), Universitat Politécnica de Catalunya;Departament d'Arquitectura de Computadors (UPC), Universitat Politécnica de Catalunya
Venue:
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Year:
2000

Citing 24
Cited 32

Software pipelining: an effective scheduling technique for VLIW machines

PLDI '88 Proceedings of the ACM SIGPLAN 1988 conference on Programming Language design and Implementation
Hierarchical registers for scientific computers

ICS '88 Proceedings of the 2nd international conference on Supercomputing
Software prefetching

ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
IMPACT: an architectural framework for multiple-instruction-issue processors

ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
Register allocation for software pipelined loops

PLDI '92 Proceedings of the ACM SIGPLAN 1992 conference on Programming language design and implementation
Partitioned register files for VLIWs: a preliminary analysis of tradeoffs

MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
Lifetime-sensitive modulo scheduling

PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
Instruction-level parallel processing: history, overview, and perspective

The Journal of Supercomputing - Special issue on instruction-level parallelism
Compiling for the Cydra 5

The Journal of Supercomputing - Special issue on instruction-level parallelism
The superblock: an effective technique for VLIW and superscalar compilation

The Journal of Supercomputing - Special issue on instruction-level parallelism
Iterative modulo scheduling: an algorithm for software pipelining loops

MICRO 27 Proceedings of the 27th annual international symposium on Microarchitecture
Stage scheduling: a technique to reduce the register requirements of a modulo schedule

Proceedings of the 28th annual international symposium on Microarchitecture
Hypernode reduction modulo scheduling

Proceedings of the 28th annual international symposium on Microarchitecture
Heuristics for register-constrained software pipelining

Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Cache sensitive modulo scheduling

MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Quantitative Evaluation of Register Pressure on Software Pipelined Loops

International Journal of Parallel Programming
Compiler-controlled memory

Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
Multiple-banked register file architectures

Proceedings of the 27th annual international symposium on Computer architecture
Improved spill code generation for software pipelined loops

PLDI '00 Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation
Computer Structures: Principles and Examples

Computer Structures: Principles and Examples
The Alpha 21264 Microprocessor

IEEE Micro
Using Sacks to Organize Registers in VLIW Machines

CONPAR 94 - VAPP VI Proceedings of the Third Joint International Conference on Vector and Parallel Processing: Parallel Processing
Non-Consistent Dual Register Files to Reduce Register Pressure

HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Partitioned Schedules for Clustered VLIW Architectures

IPPS '98 Proceedings of the 12th. International Parallel Processing Symposium on International Parallel Processing Symposium

A large, fast instruction window for tolerating cache misses

ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
Reducing the complexity of the register file in dynamic superscalar processors

Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
A Register File Architecture and Compilation Scheme for Clustered ILP Processors

Euro-Par '02 Proceedings of the 8th International Euro-Par Conference on Parallel Processing
Cherry: checkpointed early resource recycling in out-of-order microprocessors

Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Dynamic addressing memory arrays with physical locality

Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Reducing register ports for higher speed and lower energy

Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Reducing register ports using delayed write-back queues and operand pre-fetch

ICS '03 Proceedings of the 17th annual international conference on Supercomputing
The microarchitecture of a low power register file

Proceedings of the 2003 international symposium on Low power electronics and design
Checkpoint Processing and Recovery: Towards Scalable Large Instruction Window Processors

Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Reducing register pressure through LAER algorithm

ACSC '04 Proceedings of the 27th Australasian conference on Computer science - Volume 26
Hardware-managed register allocation for embedded processors

Proceedings of the 2004 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
Use-Based Register Caching with Decoupled Indexing

Proceedings of the 31st annual international symposium on Computer architecture
Physical Register Inlining

Proceedings of the 31st annual international symposium on Computer architecture
Late Allocation and Early Release of Physical Registers

IEEE Transactions on Computers
Differential register allocation

Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
Dual-Core Execution: Building a Highly Scalable Single-Thread Instruction Window

Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques
Efficient Use of Invisible Registers in Thumb Code

Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
Software and hardware techniques to optimize register file utilization in VLIW architectures

International Journal of Parallel Programming
Allocating architected registers through differential encoding

ACM Transactions on Programming Languages and Systems (TOPLAS)
Compacting register file via 2-level renaming and bit-partitioning

Microprocessors & Microsystems
Unified microprocessor core storage

Proceedings of the 4th international conference on Computing frontiers
Asymmetrically banked value-aware register files for low-energy and high-performance

Microprocessors & Microsystems
Achieving Out-of-Order Performance with Almost In-Order Complexity

ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Power-efficient clustering via incomplete bypassing

Proceedings of the 13th international symposium on Low power electronics and design
A configurable multi-ported register file architecture for soft processor cores

ARC'07 Proceedings of the 3rd international conference on Reconfigurable computing: architectures, tools and applications
Decoupled state-execute architecture

ISHPC'05/ALPS'06 Proceedings of the 6th international symposium on high-performance computing and 1st international conference on Advanced low power systems
Register pressure in software-pipelined loop nests: fast computation and impact on architecture design

LCPC'05 Proceedings of the 18th international conference on Languages and Compilers for Parallel Computing
A compile-time managed multi-level register file hierarchy

Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
An optimized front-end physical register file with banking and writeback filtering

PACS'04 Proceedings of the 4th international conference on Power-Aware Computer Systems
A Hierarchical Thread Scheduler and Register File for Energy-Efficient Throughput Processors

ACM Transactions on Computer Systems (TOCS)
Code generation for an application-specific VLIW processor with clustered, addressable register files

Proceedings of the 10th Workshop on Optimizations for DSP and Embedded Systems
Shared-port register file architecture for low-energy VLIW processors

ACM Transactions on Architecture and Code Optimization (TACO)

Quantified Score

Hi-index	0.00

Two-level hierarchical register file organization for VLIW processors

Quantified Score

Visualization

Abstract