In this work, we propose QSP32, a novel embedded 32-bit dual-execution-mode processor architecture that supports both queue and stack programming models. The QSP32 core is based on a high-performance produced order parallel queue architecture and targets applications with tight area, memory, and power constraints. The design focuses on executing queue programs while also supporting stack programs without a considerable increase in hardware over the base queue architecture. A prototype of the processor is produced by synthesizing the high-level model for a target FPGA device. We present the architecture description and design results in detail. The design and evaluation results show that the QSP32 core efficiently executes both queue- and stack-based programs and achieves a speed of about 65 MHz on average. In addition, compared to the base single-mode architecture (PQP), the QSP32 core requires only about 2.41% additional hardware. Moreover, the prototype fits on a single FPGA device, eliminating the need for multi-chip partitioning, which results in a loss of resource efficiency.
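To make the two programming models concrete, the following is a minimal sketch of how the same expression, (a + b) * (c - d), might be evaluated by a generic stack machine and a generic queue machine. It is not the QSP32 instruction set or the produced order model itself: the function names `run_stack` and `run_queue`, the token-list program format, and the operand values are all assumptions made for illustration. The stack program is the expression in postfix order, while the queue program lists the operands and operators in level order, deepest level first.

```python
from collections import deque
import operator

# Hypothetical three-operator instruction set, for illustration only.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def run_stack(program):
    """Evaluate a postfix program: operands are pushed, operators pop two values."""
    stack = []
    for tok in program:
        if tok in OPS:
            right = stack.pop()   # the most recently pushed value is the right operand
            left = stack.pop()
            stack.append(OPS[tok](left, right))
        else:
            stack.append(tok)
    return stack.pop()

def run_queue(program):
    """Evaluate a level-order program: operators read from the head, write to the tail."""
    queue = deque()
    for tok in program:
        if tok in OPS:
            left = queue.popleft()   # the oldest value is the left operand
            right = queue.popleft()
            queue.append(OPS[tok](left, right))
        else:
            queue.append(tok)
    return queue.popleft()

a, b, c, d = 7, 3, 10, 4
stack_program = [a, b, "+", c, d, "-", "*"]   # postfix order
queue_program = [a, b, c, d, "+", "-", "*"]   # level order, deepest level first

print(run_stack(stack_program))
print(run_queue(queue_program))
```

In the queue program the two independent operations, `+` and `-`, sit next to each other and neither consumes the other's result, which gives an intuition for why queue models are considered well suited to parallel execution. In the stack program every operator depends on the current top of the stack.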