A predecoding technique for ILP exploitation in Java processors
Journal of Systems Architecture: the EUROMICRO Journal
Java processors have been introduced to offer hardware acceleration for Java applications by executing Java bytecodes directly in hardware. However, the stack-based nature of the Java Virtual Machine instruction set limits the achievable execution performance. To exploit instruction-level parallelism, the stack must be removed completely. This can be achieved by recursive stack folding algorithms, such as OPEX, which dynamically transform groups of Java bytecodes into RISC-like instructions. However, the decoding throughput obtained in this way is limited. In this paper we propose a novel stack folding technique that uses a predecoded cache to store folded bytecodes, thus enabling their reuse. The decoding throughput reaches 4 RISC instructions per cycle. With a superscalar backend core, the obtained IPC is approximately 2.08 (or 3.02 Java bytecodes per cycle).
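
To make the idea of stack folding concrete, the following is a minimal software sketch of how a group of stack bytecodes can collapse into one three-address, RISC-like instruction. It is not OPEX and not the predecoded-cache technique proposed in the paper. The class name `FoldingSketch`, the register names (`r1`, `t0`) and the output mnemonics are hypothetical, and only a handful of integer bytecodes are handled.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Illustrative only: folds straight-line integer bytecodes into
// three-address instructions. Names and mnemonics are assumptions.
class FoldingSketch {

    static List<String> fold(List<String> bytecodes) {
        Deque<String> stack = new ArrayDeque<>(); // symbolic operand stack
        List<String> out = new ArrayList<>();     // emitted RISC-like instructions
        int nextTemp = 0;                         // counter for temporaries t0, t1, ...

        for (int i = 0; i < bytecodes.size(); i++) {
            String bc = bytecodes.get(i);
            if (bc.startsWith("iload_")) {
                // Producer: local variable N becomes register rN, nothing is emitted.
                stack.push("r" + bc.substring(6));
            } else if (bc.startsWith("iconst_")) {
                // Producer: small constant becomes an immediate operand.
                stack.push("#" + bc.substring(7));
            } else if (bc.startsWith("istore_")) {
                // Store that was not folded into an operator: plain move.
                out.add("mov r" + bc.substring(7) + ", " + stack.pop());
            } else if (bc.equals("iadd") || bc.equals("isub") || bc.equals("imul")) {
                String right = stack.pop();
                String left = stack.pop();
                String next = i + 1 < bytecodes.size() ? bytecodes.get(i + 1) : "";
                String dest;
                if (next.startsWith("istore_")) {
                    // Consumer follows directly: fold it in as the destination.
                    dest = "r" + next.substring(7);
                    i++;
                } else {
                    // Result is used later: keep it in a temporary register.
                    dest = "t" + nextTemp++;
                    stack.push(dest);
                }
                out.add(bc.substring(1) + " " + dest + ", " + left + ", " + right);
            } else {
                throw new IllegalArgumentException("unsupported bytecode: " + bc);
            }
        }
        // Simplification: no check for a local being overwritten while its
        // register name is still pending on the symbolic stack.
        return out;
    }

    public static void main(String[] args) {
        // Four bytecodes fold into one instruction.
        System.out.println(fold(List.of("iload_1", "iload_2", "iadd", "istore_3")));
        // Six bytecodes fold into two instructions.
        System.out.println(fold(List.of(
                "iload_1", "iload_2", "iadd", "iload_3", "imul", "istore_4")));
    }
}
```

In this sketch the loads and stores emit no instructions of their own, which is why a folded stream contains fewer instructions than the bytecode stream it came from. The figures quoted above are consistent with this effect: 3.02 bytecodes per cycle against an IPC of 2.08 corresponds to roughly 1.45 bytecodes per RISC instruction.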