A fully-asynchronous low-power framework for GALS NoC integration
Proceedings of the Conference on Design, Automation and Test in Europe
Proceedings of the Conference on Design, Automation and Test in Europe
Variation-tolerant OpenMP tasking on tightly-coupled processor clusters
Proceedings of the Conference on Design, Automation and Test in Europe
Improving simulation speed and accuracy for many-core embedded platforms with ensemble models
Proceedings of the Conference on Design, Automation and Test in Europe
Designing tightly-coupled extension units for the STxP70 processor
Proceedings of the Conference on Design, Automation and Test in Europe
Fast and accurate TLM simulations using temporal decoupling for FIFO-based communications
Proceedings of the Conference on Design, Automation and Test in Europe
3D-MMC: a modular 3D multi-core architecture with efficient resource pooling
Proceedings of the Conference on Design, Automation and Test in Europe
ARTM: a lightweight fork-join framework for many-core embedded systems
Proceedings of the Conference on Design, Automation and Test in Europe
Exploring parallelization for medium access schemes on many-core software defined radio architecture
Proceedings of the second workshop on Software radio implementation forum
Energy and performance exploration of accelerator coherency port using Xilinx ZYNQ
Proceedings of the 10th FPGAworld Conference
HARS: A hardware-assisted runtime software for embedded many-core architectures
ACM Transactions on Embedded Computing Systems (TECS) - Special Issue on Design Challenges for Many-Core Processors, Special Section on ESTIMedia'13 and Regular Papers
Maximum-throughput mapping of SDFGs on multi-core SoC platforms
Journal of Parallel and Distributed Computing
ACM SIGAda Ada Letters
Proceedings of the Ninth IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis
Proceedings of the Ninth IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis
BPDF: a statically analyzable DataFlow model with integer and Boolean parameters
Proceedings of the Eleventh ACM International Conference on Embedded Software
A compiler infrastructure for embedded heterogeneous MPSoCs
Parallel Computing
Numerical Aspects of MIMO OFDM PHY Layer Applications on SDR Platforms
Journal of Signal Processing Systems
Hi-index | 0.00 |
P2012 is an area- and power-efficient many-core computing fabric based on multiple globally asynchronous, locally synchronous (GALS) clusters supporting aggressive fine-grained power, reliability and variability management. Clusters feature up to 16 processors and one control processor with independent instruction streams sharing a multi-banked L1 data memory, a multi-channel DMA engine, and specialized hardware for synchronization and scheduling. P2012 achieves extreme area and energy efficiency by supporting domain-specific acceleration at the processor and cluster level through the addition of dedicated HW IPs. P2012 can run standard OpenCL and OpenMP parallel codes well as proprietary Native Programming Model (NPM) SW components that provide the highest level of control on application-to-resource mapping. In Q3 2011 the P2012 SW Development Kit (SDK) has been made available to a community of R&D users; it includes full OpenCL and NPM development environments. The first P2012 SoC prototype in 28nm CMOS will sample in Q4 2012, featuring four clusters and delivering 80GOPS (with single precision floating point support) in 15.2mm2 with 2W power consumption.