FPGA prototyping of a RISC processor core for embedded applications
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Mambo: a full system simulator for the PowerPC architecture
ACM SIGMETRICS Performance Evaluation Review - Special issue on tools for computer architecture research
Intel nehalem processor core made FPGA synthesizable
Proceedings of the 18th annual ACM/SIGDA international symposium on Field programmable gate arrays
PARA'06 Proceedings of the 8th international conference on Applied parallel computing: state of the art in scientific computing
The IBM Blue Gene/Q interconnection network and message unit
Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis
Proceedings of the ACM/SIGDA international symposium on Field Programmable Gate Arrays
The IBM Blue Gene/Q Interconnection Fabric
IEEE Micro
Design of the IBM Blue Gene/Q compute chip
IBM Journal of Research and Development
IBM Blue Gene/Q memory subsystem with speculative execution and transactional memory
IBM Journal of Research and Development
Design of the IBM Blue Gene/Q compute chip
IBM Journal of Research and Development
Hi-index | 0.00 |
Major architectural innovations in the compute node have been introduced in the IBM Blue Gene®/Q, including programmable Level 1 (L1) cache data prefetching units to hide memory access latency, hardware support for transactional memory (TM) and speculative execution (SE), an enhanced five-dimensional integrated torus network, and a high-performance quad floating-point SIMD (single-instruction, multiple-data) unit. In this paper, we present the tools and methodology that we used to model, co-design, and validate these new features from early concept phase through design implementation. Early in the design cycle, we made extensive use of an architectural simulator, BGQSim, capable of executing unmodified binary Blue Gene/Q code for single as well as multiple nodes. As the hardware description language for the chip implementation became available, we complemented BGQSim with a cycle-accurate and cycle-reproducible, large-scale field-programmable gate array-based platform, Twinstar, to validate the implementation against performance targets and functional specifications. Through specific examples, we show the effectiveness of these tools in co-developing the hardware and software of Blue Gene/Q, allowing us to meet the design targets at an aggressive project schedule.