Heterogeneous multi-core processors, such as the IBM Cell processor, can deliver high performance. However, these processors are notoriously difficult to program: different cores support different instruction set architectures, and the processor as a whole does not provide coherence between the different cores' local memories. We present Hera-JVM, an implementation of the Java Virtual Machine that runs on the Cell processor, thereby making this platform more readily accessible to mainstream developers. Hera-JVM supports the full Java language; threads from an unmodified Java application can be executed simultaneously on both the main PowerPC-based core and the additional SPE accelerator cores. Migration of threads between these cores is transparent to the application and requires no modification to Java source code or bytecode. Hera-JVM supports the existing Java Memory Model, even though the underlying hardware does not provide cache coherence between the different core types. We examine Hera-JVM's performance on a series of real-world Java benchmarks from the SPECjvm, Java Grande and DaCapo benchmark suites. These benchmarks show a wide variation in relative performance on the different core types of the Cell processor, depending on the nature of their workload. Running a benchmark on one of the Cell processor's SPE accelerator cores can achieve a speedup of up to 2.25x over running it on the main PowerPC-based core. When all six SPE cores are exploited, parallel workloads can achieve speedups of up to 13x over execution on the single PowerPC core.