Intel's 80-core Terascale Processor was the first generally programmable microprocessor to break the teraflops barrier. The chip was built primarily to study power management and on-die communication technologies. When announced in 2007, it drew considerable attention for running a stencil kernel at 1.0 single-precision TFLOPS while drawing only 97 watts. The literature on the chip, however, focused on the hardware and said little about the software environment or the kernels used to evaluate it.