Java processors have been introduced to offer hardware acceleration for Java applications: they execute Java bytecodes directly in hardware. However, the stack-based nature of the Java virtual machine instruction set limits the achievable execution performance. To exploit instruction-level parallelism and allow out-of-order execution, the stack must be removed completely. This can be achieved by recursive stack-folding algorithms, such as OPEX, which dynamically transform groups of Java bytecodes into RISC-like instructions; however, the decoding throughput they obtain is limited. In this paper, we explore microarchitectural techniques to improve the decoding throughput of Java processors. Our techniques are based on a predecoded cache that stores folding results so they can be reused. The ultimate goal is to exploit all available instruction-level parallelism in Java programs by feeding a superscalar out-of-order backend core at a sustainable rate. With a predecoded cache of 2x2048 entries and a 4-way superscalar core, we obtain 4.8 to 18.3 times better performance than an architecture employing pattern-based folding.
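To make the folding idea concrete, the following is a minimal illustrative sketch (a hypothetical helper, not the OPEX algorithm or the predecoded-cache hardware described in the abstract). A JVM sequence such as `iload_1; iload_2; iadd; istore_3` pushes two locals, adds them on the operand stack, and pops the result back to a local; folding collapses it into a single RISC-like three-operand instruction, `add r3, r1, r2`, removing the stack dependency that serializes execution.

```java
import java.util.List;

// Illustrative pattern-based stack folder for a load-load-op-store window.
// All names here (StackFolder, FoldedOp, fold) are invented for this sketch.
public class StackFolder {

    // A folded RISC-like three-operand instruction: dest = src1 (op) src2.
    public static final class FoldedOp {
        public final String op;
        public final int dest, src1, src2;
        FoldedOp(String op, int dest, int src1, int src2) {
            this.op = op; this.dest = dest; this.src1 = src1; this.src2 = src2;
        }
        @Override public String toString() {
            return op + " r" + dest + ", r" + src1 + ", r" + src2;
        }
    }

    // Fold a 4-bytecode load-load-op-store window; returns null if the
    // window does not match the pattern.
    public static FoldedOp fold(List<String> bc) {
        if (bc.size() != 4
                || !bc.get(0).startsWith("iload_")
                || !bc.get(1).startsWith("iload_")
                || !bc.get(3).startsWith("istore_")) return null;
        String op;
        switch (bc.get(2)) {
            case "iadd": op = "add"; break;
            case "isub": op = "sub"; break;
            case "imul": op = "mul"; break;
            default: return null;   // pattern does not fold
        }
        int src1 = bc.get(0).charAt(6) - '0';   // digit after "iload_"
        int src2 = bc.get(1).charAt(6) - '0';
        int dest = bc.get(3).charAt(7) - '0';   // digit after "istore_"
        return new FoldedOp(op, dest, src1, src2);
    }

    public static void main(String[] args) {
        // iload_1; iload_2; iadd; istore_3  ->  add r3, r1, r2
        System.out.println(fold(List.of("iload_1", "iload_2", "iadd", "istore_3")));
    }
}
```

In a processor, such folded instructions, rather than raw bytecodes, are what a predecoded cache would hold, so repeated executions of the same bytecode region skip the folding step entirely.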