The network architecture of the Connection Machine CM-5 (extended abstract)
SPAA '92 Proceedings of the fourth annual ACM symposium on Parallel algorithms and architectures
Communications of the ACM
Smart Memories: a modular reconfigurable architecture
Proceedings of the 27th annual international symposium on Computer architecture
Directed diffusion: a scalable and robust communication paradigm for sensor networks
MobiCom '00 Proceedings of the 6th annual international conference on Mobile computing and networking
Reverse path forwarding of broadcast packets
Communications of the ACM
NanoFabrics: spatial computing using molecular electronics
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
Tarantula: a vector extension to the alpha architecture
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
The Art of Computer Programming, 2nd Ed. (Addison-Wesley Series in Computer Science and Information
The Art of Computer Programming, 2nd Ed. (Addison-Wesley Series in Computer Science and Information
The Teramac Custom Computer: Extending the Limits with Defect Tolerance
DFT '96 Proceedings of the 1996 Workshop on Defect and Fault-Tolerance in VLSI Systems
ICCD '02 Proceedings of the 2002 IEEE International Conference on Computer Design: VLSI in Computers and Processors (ICCD'02)
The Reconfigurable Streaming Vector Processor (RSVPTM)
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Self-assembled computer architecture: design and fabrication theory
Self-assembled computer architecture: design and fabrication theory
Power Efficient Processor Architecture and The Cell Processor
HPCA '05 Proceedings of the 11th International Symposium on High-Performance Computer Architecture
NANA: A nano-scale active network architecture
ACM Journal on Emerging Technologies in Computing Systems (JETC)
A self-organizing defect tolerant SIMD architecture
ACM Journal on Emerging Technologies in Computing Systems (JETC)
Emerging Models of Computation: Directions in Molecular Computing
Software-Intensive Systems and New Computing Paradigms
Architectural implications of nanoscale integrated sensing and computing
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
Reconfigurable double gate carbon nanotube field effect transistor based nanoelectronic architecture
Proceedings of the 2009 Asia and South Pacific Design Automation Conference
Routing in self-organizing nano-scale irregular networks
ACM Journal on Emerging Technologies in Computing Systems (JETC)
Proceedings of the Conference on Design, Automation and Test in Europe
Impact of nanomanufacturing flow on systematic yield losses in nanoscale fabrics
NANOARCH '11 Proceedings of the 2011 IEEE/ACM International Symposium on Nanoscale Architectures
Hi-index | 0.00 |
The continual decrease in transistor size (through either scaled CMOS or emerging nano-technologies) promises to usher in an era of tera to peta-scale integration. However, this decrease in size is also likely to increase defect densities, contributing to the exponentially increasing cost of top-down lithography. Bottom-up manufacturing techniques, like self assembly, may provide a viable lower-cost alternative to top-down lithography, but may also be prone to higher defects. Therefore, regardless of fabrication methodology, defect tolerant architectures are necessary to exploit the full potential of future increased device densities.This paper explores a defect tolerant SIMD architecture. A key feature of our design is the ability of a large number of limited capability nodes with high defect rates (up to 30%) to self-organize into a set of SIMD processing elements. Despite node simplicity and high defect rates, we show that by supporting the familiar data parallel programming model the architecture can execute a variety of programs. The architecture efficiently exploits a large number of nodes and higher device densities to keep device switching speeds and power density low. On a medium sized system (~1cm2 area), the performance of the proposed architecture on our data parallel programs matches or exceeds the performance of an aggressively scaled out-of-order processor (128-wide, 8k reorder buffer, perfect memory system). For larger systems (1cm2), the proposed architecture can match the performance of a chip multiprocessor with 16 aggressively scaled out-of-order cores.