Optimum Broadcasting and Personalized Communication in Hypercubes

Authors:
S. Lennart Johnsson; Ching-Tien Ho
Affiliations:
Yale Univ., New Haven, CT; Yale Univ., New Haven, CT
Venue:
IEEE Transactions on Computers
Year:
1989

Abstract

Four different communication problems are addressed in Boolean n-cube configured multiprocessors: (1) one-to-all broadcasting: distribution of common data from a single source to all other nodes; (2) one-to-all personalized communication: a single node sending unique data to each of the other nodes; (3) all-to-all broadcasting: distribution of common data from each node to all other nodes; and (4) all-to-all personalized communication: each node sending a unique piece of information to every other node. Three communication graphs (spanning trees) for the Boolean n-cube are proposed for the routing, together with scheduling disciplines that are provably optimum within a small constant factor. With appropriate scheduling and concurrent communication on all ports of every processor, routings based on these two communication graphs offer a speedup of up to n/2 and O(√n) over the routings based on the spanning binomial tree for cases (2)-(4), respectively. All three spanning trees offer optimal communication times for cases (2)-(4) and concurrent communication on all ports of every processor. Timing models and complexity analysis are verified by experiments on a Boolean-cube-configured multiprocessor.
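
For orientation, the spanning binomial tree that the abstract uses as its baseline can be sketched in a few lines of Python. This is only an illustration of the textbook one-port, dimension-by-dimension broadcast on an n-cube, not the routings or scheduling disciplines proposed in the paper; the function name `sbt_broadcast_schedule` and the round-based model are assumptions made for the example.

```python
def sbt_broadcast_schedule(n, source=0):
    """One-to-all broadcast on a Boolean n-cube along a spanning binomial tree.

    Assumed model (not taken from the paper): each node sends on one port
    per round. In round d every node that already holds the data forwards
    it to its neighbor across dimension d, i.e. the node whose address
    differs in bit d. Returns a list of n rounds of (sender, receiver) pairs.
    """
    informed = {source}
    rounds = []
    for dim in range(n):
        bit = 1 << dim
        # Every informed node sends across the current dimension.
        sends = [(node, node ^ bit) for node in sorted(informed)]
        informed.update(receiver for _, receiver in sends)
        rounds.append(sends)
    return rounds


if __name__ == "__main__":
    for step, sends in enumerate(sbt_broadcast_schedule(3)):
        print(f"round {step}: {sends}")
    # round 0: [(0, 1)]
    # round 1: [(0, 2), (1, 3)]
    # round 2: [(0, 4), (1, 5), (2, 6), (3, 7)]
```

In this sketch the number of informed nodes doubles every round, so all 2^n nodes hold the data after n rounds and each node receives it exactly once. Only one port per node is active in any round, which is the slack that concurrent communication on all ports, as described in the abstract, is meant to exploit.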