This paper introduces a new Java bytecode multi-core system-on-a-chip architecture that scales well in both chip area and performance. In particular, the area efficiency is greater than 1 (about 120%), which shows that the speed-up gained exceeds the additional hardware cost. The cores are connected to the shared heap by a full-duplex bus with pipelined transactions, a choice based on the evaluation of four different applications. Each multi-threaded, real-time-capable core is equipped with local on-chip memory for the Java operand stack and a method cache to further reduce the memory bandwidth requirements. In contrast to related projects, synchronization is supported on a per-object basis (independent locks) instead of through a single global lock. Application threads are distributed automatically using a round-robin scheme. The multi-port memory manager includes an exact and fully concurrent garbage collector for automatic memory management. The design can be synthesized for a variable number of parallel cores and shows a linear increase in chip area. Speed-up and area efficiency are measured for the same four applications and compared with related projects.
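As a rough illustration of two ideas mentioned in the abstract, the sketch below shows a round-robin placement of threads onto cores and one plausible way to compute area efficiency. Both are assumptions made for illustration: the abstract does not define area efficiency, so the code takes it to be speed-up divided by relative chip area (both measured against a single-core baseline), and the class name, method names, and input values are hypothetical rather than taken from the paper.

```java
// Illustrative sketch only: names, formula, and numbers are assumptions,
// not the paper's implementation or measurements.
final class MultiCoreSketch {
    private MultiCoreSketch() {}

    /** Round-robin placement: the k-th created thread is assigned to core k mod coreCount. */
    static int assignCore(int threadIndex, int coreCount) {
        if (threadIndex < 0 || coreCount <= 0) {
            throw new IllegalArgumentException("threadIndex must be >= 0 and coreCount > 0");
        }
        return threadIndex % coreCount;
    }

    /**
     * Assumed definition: speed-up divided by relative chip area,
     * both relative to a single-core baseline. A value above 1 means
     * the speed-up grew faster than the hardware cost.
     */
    static double areaEfficiency(double speedUp, double relativeArea) {
        if (relativeArea <= 0.0) {
            throw new IllegalArgumentException("relativeArea must be positive");
        }
        return speedUp / relativeArea;
    }

    public static void main(String[] args) {
        int cores = 4; // hypothetical core count
        for (int t = 0; t < 6; t++) {
            System.out.println("thread " + t + " -> core " + assignCore(t, cores));
        }
        // Hypothetical inputs: a 6.0x speed-up at 5.0x the single-core area.
        System.out.println("area efficiency = " + areaEfficiency(6.0, 5.0));
    }
}
```

Under this assumed definition, a value of about 1.2 corresponds to the "about 120%" figure quoted above; the actual speed-up and area numbers behind that figure are not given in the abstract.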