Journal of the ACM (JACM)
An Adaptation of the Fast Fourier Transform for Parallel Processing
Journal of the ACM (JACM)
A model for masking rotational latency by dynamic disk allocation
Communications of the ACM
Control structures in Illiac IV Fortran
Communications of the ACM
Parallelism exposure and exploitation in programs
Parallelism exposure and exploitation in programs
Parallel Processing with the Perfect Shuffle
IEEE Transactions on Computers
Cellular Interconnection Arrays
IEEE Transactions on Computers
Time and Parallel Processor Bounds for Linear Recurrence Systems
IEEE Transactions on Computers
A Fast Computer Method for Matrix Transposing
IEEE Transactions on Computers
ILLIAC IV Software and Application Programming
IEEE Transactions on Computers
The Organization and Use of Parallel Memories
IEEE Transactions on Computers
Sorting networks and their applications
AFIPS '68 (Spring) Proceedings of the April 30--May 2, 1968, spring joint computer conference
Programmable indexing networks
AFIPS '70 (Spring) Proceedings of the May 5-7, 1970, spring joint computer conference
STARAN parallel processor system hardware
AFIPS '74 Proceedings of the May 6-10, 1974, national computer conference and exposition
Determining an Optimal Secondary Storage Service Rate for the PASM Control System
IEEE Transactions on Computers
Fault-Tolerant Multiprocessors with Redundant-Path Interconnection Networks
IEEE Transactions on Computers - The MIT Press scientific computation series
Performance of unbuffered shuffle-exchange networks
IEEE Transactions on Computers - The MIT Press scientific computation series
A connecting network with fault tolerance capabilities
IEEE Transactions on Computers - The MIT Press scientific computation series
Finite State Model and Compatibility Theory: New Analysis Tools for Permutation Networks
IEEE Transactions on Computers
Permutations on Illiac IV-Type Networks
IEEE Transactions on Computers
An Efficient Memory System for Image Processing
IEEE Transactions on Computers
The Load-Sharing Banyan Network
IEEE Transactions on Computers
Vector Computer Memory Bank Contention
IEEE Transactions on Computers
A new interconnection network for SIMD computers: the sigma networks
IEEE Transactions on Computers
On the permutation capability of multistage interconnection networks
IEEE Transactions on Computers
On Linear Skewing Schemes and d-Ordered Vectors
IEEE Transactions on Computers
Distributing Hot-Spot Addressing in Large-Scale Multiprocessors
IEEE Transactions on Computers
IEEE Transactions on Computers
IEEE Transactions on Computers
Performance analysis of the FFT algorithm on a shared-memory parallel architecture
IBM Journal of Research and Development
Best worst mappings for the omega network
IBM Journal of Research and Development
Applications considerations in the system design of highly concurrent multiprocessors
IEEE Transactions on Computers
An analytical model for a class of processor-memory interconnection networks
IEEE Transactions on Computers
Vector access performance in parallel memories using skewed storage scheme
IEEE Transactions on Computers
A Characterization and Analysis of Parallel Processor Interconnection Networks
IEEE Transactions on Computers
Discrete Optimization Problem in Local Networks and Data Alignment
IEEE Transactions on Computers
A New Benes Network Control Algorithm
IEEE Transactions on Computers
The periodic balanced sorting network
Journal of the ACM (JACM)
The SP2 high-performance switch
IBM Systems Journal
The shuffle/exchange-plus networks
ACM-SE 20 Proceedings of the 20th annual Southeast regional conference
On the Rearrangeability of Multistage Networks Employing Uniform Connection Patterns
ADVIS '00 Proceedings of the First International Conference on Advances in Information Systems
Job Scheduling for the BlueGene/L System
JSSPP '02 Revised Papers from the 8th International Workshop on Job Scheduling Strategies for Parallel Processing
A 2D Addressing Mode for Multimedia Applications
Embedded Processor Design Challenges: Systems, Architectures, Modeling, and Simulation - SAMOS
Reducing Cache Conflicts by a Parametrized Memory Mapping
ParNum '99 Proceedings of the 4th International ACPC Conference Including Special Tracks on Parallel Numerics and Parallel Computing in Image Processing, Video Processing, and Multimedia: Parallel Computation
A Reconfigurable Stochastic Model Simulator for Analysis of Parallel Systems
FPL '00 Proceedings of the The Roadmap to Reconfigurable Computing, 10th International Workshop on Field-Programmable Logic and Applications
A Reconfigurable Stochastic Model Simulator for Analysis of Parallel Systems
FCCM '00 Proceedings of the 2000 IEEE Symposium on Field-Programmable Custom Computing Machines
Performing BMMC Permutations in Two Passes through the Expanded Delta Network and MasPar MP-2
FRONTIERS '96 Proceedings of the 6th Symposium on the Frontiers of Massively Parallel Computation
A journey into multicomputer routing algorithms
PAS '95 Proceedings of the First Aizu International Symposium on Parallel Algorithms/Architecture Synthesis
The Stereo Correspondence Problem on a Ring-based Network
PAS '97 Proceedings of the 2nd AIZU International Symposium on Parallel Algorithms / Architecture Synthesis
Performance of Multicasting Closed Interconnection Networks
INFOCOM '97 Proceedings of the INFOCOM '97. Sixteenth Annual Joint Conference of the IEEE Computer and Communications Societies. Driving the Information Revolution
Siamese-Twin: A Dynamically Fault-Tolerant Fat-Tree
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Papers - Volume 01
Multistage Interconnection Networks with Multiple Outlets
ICPP '94 Proceedings of the 1994 International Conference on Parallel Processing - Volume 01
ICPP '94 Proceedings of the 1994 International Conference on Parallel Processing - Volume 01
Performance and Reliability of the Multistage Bus Network
ICPP '94 Proceedings of the 1994 International Conference on Parallel Processing - Volume 01
A New Tag Scheme and Its Tree Representation for a Shuffle-Exchange Network
ICPP '94 Proceedings of the 1994 International Conference on Parallel Processing - Volume 01
On the Rearrangeability of Reverse Shuffle/Exchange Networks
ICPP '94 Proceedings of the 1994 International Conference on Parallel Processing - Volume 01
ICPP '94 Proceedings of the 1994 International Conference on Parallel Processing - Volume 01
Bounded-Collision Memory-Mapping Schemes for Data Structures with Applications to Parallel Memories
IEEE Transactions on Parallel and Distributed Systems
Memory Systems for Image Processing
IEEE Transactions on Computers
The NYU Ultracomputer: Designing an MIMD Shared Memory Parallel Computer
IEEE Transactions on Computers
Data Exchange Optimization in Reconfigurable
IEEE Transactions on Computers
Parallel Processing Approaches to Image Correlation
IEEE Transactions on Computers
Asynchronous and Clocked Control Structures for VLSI Based Interconnection Networks
IEEE Transactions on Computers
Notes on Shuffle/Exchange-Type Switching Networks
IEEE Transactions on Computers
Communication Structures for Large Networks of Microcomputers
IEEE Transactions on Computers
Analysis and Simulation of Buffered Delta Networks
IEEE Transactions on Computers
VLSI Performance Comparison of Banyan and Crossbar Communications Networks
IEEE Transactions on Computers
An Easily Controlled Network for Frequently Used Permutations
IEEE Transactions on Computers
On the Number of Permutations Performable by the Augmented Data Manipulator Network
IEEE Transactions on Computers
Optimal BPC Permutations on a Cube Connected SIMD Computer
IEEE Transactions on Computers
IEEE Transactions on Computers
IEEE Transactions on Computers
Supersystems: Technology and Architecture
IEEE Transactions on Computers
The Prime Memory System for Array Access
IEEE Transactions on Computers
The Extra Stage Cube: A Fault-Tolerant Interconnection Network for Supersystems
IEEE Transactions on Computers
The Universality of the Shuffle-Exchange Network
IEEE Transactions on Computers
A Self-Routing Benes Network and Parallel Permutation Algorithms
IEEE Transactions on Computers
Theoretical Limitations on the Efficient Use of Parallel Memories
IEEE Transactions on Computers
The Indirect Binary n-Cube Microprocessor Array
IEEE Transactions on Computers
On the Effective Bandwidth of Parallel Memories
IEEE Transactions on Computers
Parallel Permutations of Data: A Benes Network Control Algorithm for Frequently Used Permutations
IEEE Transactions on Computers
Graph Theoretical Analysis and Design of Multistage Interconnection Networks
IEEE Transactions on Computers
High-Speed Multiprocessors and Compilation Techniques
IEEE Transactions on Computers
A Uniform Representation of Single- and Multistage Interconnection Networks Used in SIMD Machines
IEEE Transactions on Computers
The Theory Underlying the Partitioning of Permutation Networks
IEEE Transactions on Computers
The Reverse-Exchange Interconnection Network
IEEE Transactions on Computers
Fast Random and Sequential Access to Dynamic Memories of Any Size
IEEE Transactions on Computers
On a Class of Multistage Interconnection Networks
IEEE Transactions on Computers
Fault-Diagnosis for a Class of Multistage Interconnection Networks
IEEE Transactions on Computers
Performance of Processor-Memory Interconnections for Multiprocessors
IEEE Transactions on Computers
Task Preloading Schemes for Reconfigurable Parallel Processing Systems
IEEE Transactions on Computers
Routing Algorithms for Cellular Interconnection Arrays
IEEE Transactions on Computers
PUMPS Architecture for Pattern Analysis and Image Database Management
IEEE Transactions on Computers
Minimization of Interprocessor Communication for Parallel Computation
IEEE Transactions on Computers
A Practical Algorithm for the Solution of Triangular Systems on a Parallel Processing System
IEEE Transactions on Computers
Pin Limitations and Partitioning of VLSI Interconnection Networks
IEEE Transactions on Computers
Efficient Internode Communications in Reconfigurable Binary Trees
IEEE Transactions on Computers
A Two-Level Microprogrammed Multiprocessor Computer with Nonnumeric Functions
IEEE Transactions on Computers
Routing Schemes for the Augmented Data Manipulator Network in an MIMD System
IEEE Transactions on Computers
Hypertree: A Multiprocessor Interconnection Topology
IEEE Transactions on Computers
PASM: A Partitionable SIMD/MIMD System for Image Processing and Pattern Recognition
IEEE Transactions on Computers
The Lens Interconnection Strategy
IEEE Transactions on Computers
A Layout for the Shuffle-Exchange Network with O(N^2/log^{3/2} N) Area
IEEE Transactions on Computers
Design and Performance of Generalized Interconnection Networks
IEEE Transactions on Computers
The Performance of Multistage Interconnection Networks for Multiprocessors
IEEE Transactions on Computers
Fault Tolerant Interleaved Switching Fabrics For Scalable High-Performance Routers
IEEE Transactions on Parallel and Distributed Systems
Journal of Discrete Algorithms
Memory organization with multi-pattern parallel accesses
Proceedings of the conference on Design, automation and test in Europe
Performance and reliability analysis of new fault-tolerant advance omega network
WSEAS Transactions on Computers
EURASIP Journal on Wireless Communications and Networking - Advances in Error Control Coding Techniques
The architecture of MANIP: a parallel computer system for solving NP-complete problems
AFIPS '81 Proceedings of the May 4-7, 1981, national computer conference
Design and implementation of the banyan interconnection network in TRAC
AFIPS '80 Proceedings of the May 19-22, 1980, national computer conference
Conflict-free memory allocation for associative data files
AFIPS '83 Proceedings of the May 16-19, 1983, national computer conference
Distributed scheduling of resources on interconnection networks
AFIPS '82 Proceedings of the June 7-10, 1982, national computer conference
Permuting streaming data using RAMs
Journal of the ACM (JACM)
Reliability and performance analysis of new fault tolerant irregular network
WSEAS Transactions on Computer Research
A new min: fault-tolerant advance omega network
ICCOMP'08 Proceedings of the 12th WSEAS international conference on Computers
ARC '09 Proceedings of the 5th International Workshop on Reconfigurable Computing: Architectures, Tools and Applications
Evolutionary optimization of multistage interconnection networks performance
Proceedings of the 11th Annual conference on Genetic and evolutionary computation
High-throughput Block Turbo Decoding: From Full-parallel Architecture to FPGA Prototyping
Journal of Signal Processing Systems
On rearrangeability of tandem connection of banyan-type networks
IEEE Transactions on Communications
Dynamic memories with faster random and sequential access
IBM Journal of Research and Development
IBM Journal of Research and Development
All-to-all personalized exchange in generalized shuffle-exchange networks
Theoretical Computer Science
Scalable mpNoC for massively parallel systems - Design and implementation on FPGA
Journal of Systems Architecture: the EUROMICRO Journal
Unified algebraic theory of sorting, routing, multicasting, and concentration networks
IEEE Transactions on Communications
Parallelism and Array Processing
IEEE Transactions on Computers
Memory Allocations for Multiprocessor Systems That Incorporate Content-Addressable Memories
IEEE Transactions on Computers
Shuffling with the Illiac and PM2I SIMD Networks
IEEE Transactions on Computers
A Comparative Study of Distributed Resource Sharing on Multiprocessors
IEEE Transactions on Computers
A Classification of Cube-Connected Networks with a Simple Control Scheme
IEEE Transactions on Computers
A theory of decomposition into prime factors of layered interconnection networks
Discrete Applied Mathematics
IEEE Transactions on Computers - Special issue on parallel processors and processing
Journal of Computer Systems, Networks, and Communications
Reliability analysis of multi-path multi-stage interconnection network
ICCOMP'06 Proceedings of the 10th WSEAS international conference on Computers
Design and implementation of Multistage Interconnection Networks using Quantum-dot Cellular Automata
Microelectronics Journal
ISPA'05 Proceedings of the Third international conference on Parallel and Distributed Processing and Applications
ISPA'04 Proceedings of the Second international conference on Parallel and Distributed Processing and Applications
Towards UML 2 extensions for compact modeling of regular complex topologies
MoDELS'05 Proceedings of the 8th international conference on Model Driven Engineering Languages and Systems
Analyzing permutation capability of multistage interconnection networks with colored Petri nets
Information Sciences: an International Journal
Computer generation of efficient software viterbi decoders
HiPEAC'10 Proceedings of the 5th international conference on High Performance Embedded Architectures and Compilers
Dynamic routing of data stream tuples among parallel query plan running on multi-core processors
Distributed and Parallel Databases
Research: MS4 - a high performance output buffering ATM switch
Computer Communications
A fault-tolerant 2*2 switching element for switching networks
Computer Communications
A high-performance ATM switch based on modified shuffle-exchange network
Computer Communications
A mathematical abstraction of the rearrangeability conjecture for shuffle-exchange networks
Operations Research Letters
An optimal parallel prefix-sums algorithm on the memory machine models for GPUs
ICA3PP'12 Proceedings of the 12th international conference on Algorithms and Architectures for Parallel Processing - Volume Part I
A method of batching conflict routings in shuffle-exchange networks
Theoretical Computer Science
This paper discusses the design of a primary memory system for an array processor that allows parallel, conflict-free access to various slices of data (e.g., rows, columns, and diagonals) and subsequent alignment of these data for processing. Memory access requirements for an array processor are discussed in general terms, and a set of common requirements is defined. The ability to meet these requirements is shown to depend on the number of independent memory units and on the mapping of the data onto these memories. Next, the need to align these data for processing is demonstrated, and various alignment requirements are defined. Hardware that can perform this alignment function is discussed (e.g., permutation, indexing, switching, or sorting networks), and a network (the omega network) based on Stone's shuffle-exchange operation [1] is presented. The construction of this network is described, and many of its useful properties are proven. Finally, as an example of these ideas, an array processor is shown that allows conflict-free access and alignment of rows, columns, diagonals, backward diagonals, and square blocks in row- or column-major order, as well as certain other special operations.
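The two ideas in the abstract, mapping data onto memory units and aligning it through a shuffle-exchange network, can be illustrated with a minimal Python sketch. Nothing below is taken from the paper itself: the skewing rule `(delta * i + j) % m`, the choice of `m = 5` memories for a 4 x 4 array, destination-tag routing, and all function names are assumptions made for illustration.

```python
def module(i, j, m=5, delta=2):
    """Memory unit holding element (i, j) under an assumed linear skewing rule."""
    return (delta * i + j) % m


def conflict_free_slices(n=4, m=5, delta=2):
    """Check that a row, a column, and both diagonals each touch n distinct units."""
    slices = {
        "row": [module(0, j, m, delta) for j in range(n)],
        "column": [module(i, 0, m, delta) for i in range(n)],
        "diagonal": [module(i, i, m, delta) for i in range(n)],
        "backward diagonal": [module(i, n - 1 - i, m, delta) for i in range(n)],
    }
    return {name: len(set(units)) == n for name, units in slices.items()}


def shuffle(i, n_bits):
    """Perfect shuffle of an n_bits-wide line index: rotate its bits left by one."""
    mask = (1 << n_bits) - 1
    return ((i << 1) | (i >> (n_bits - 1))) & mask


def omega_route(src, dst, n_bits):
    """Line occupied after each of the n_bits stages when routing src -> dst.

    Each stage is a shuffle followed by an exchange switch; the switch output
    is selected by one bit of the destination, most significant bit first.
    """
    pos, path = src, []
    for stage in range(n_bits):
        pos = shuffle(pos, n_bits)
        bit = (dst >> (n_bits - 1 - stage)) & 1
        pos = (pos & ~1) | bit  # take the upper (0) or lower (1) switch output
        path.append(pos)
    return path


def passes(perm, n_bits):
    """True if perm (perm[src] = dst) crosses the network without two paths
    needing the same line after any stage."""
    paths = [omega_route(s, d, n_bits) for s, d in enumerate(perm)]
    return all(
        len({p[stage] for p in paths}) == len(paths) for stage in range(n_bits)
    )


if __name__ == "__main__":
    print(conflict_free_slices())            # 5 memories, skew 2
    print(conflict_free_slices(m=4))         # 4 memories: columns collide
    shift = [(i + 1) % 8 for i in range(8)]  # uniform shift by one
    bit_reversal = [0, 4, 2, 6, 1, 5, 3, 7]
    print(passes(shift, 3), passes(bit_reversal, 3))
```

With four memories and the same skew, the elements of a column land in only two units, which is one way to see why the number of independent memories matters. In the routing half, the uniform shift passes in a single trip through the network, while bit reversal on eight lines does not, because sources 0 and 4 both need line 0 after the first stage.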